Intel Corporation is a leading supplier of microcomputer components,
modules and systems. When Intel invented the microprocessor in 1971, it
created the era of the microcomputer. Today, Intel architectures are considered
world standards. Whether used in embedded applications such as automobiles,
printers and microwave ovens, or as the CPU in personal computers, client
servers or supercomputers, Intel delivers leading-edge technology.
About Our Cover:

Thinkers, inventors, and artists throughout history have breathed
life into their ideas by converting them into rough working sketches, models,
and products. This series of covers shows a few of these creations, along
CUSTOMER SUPPORT


INTEL’S COMPLETE SUPPORT SOLUTION WORLDWIDE 


Customer Support is Intel’s complete support service that provides Intel customers with hardware support, 
software support, customer training, consulting services and network management services. For detailed infor- 
mation contact your local sales offices. 


After a customer purchases any system hardware or software product, service and support become major 
factors in determining whether that product will continue to meet a customer’s expectations. Such support 
requires an international support organization and a breadth of programs to meet a variety of customer needs. 
As you might expect, Intel’s customer support is extensive. It ranges from assistance during your development
effort to network management. Intel has 100 sales and service offices worldwide, in the U.S., Canada,
Europe and the Far East. So wherever you’re using Intel technology, our professional staff is within close
reach.


HARDWARE SUPPORT SERVICES 


Intel’s hardware maintenance service, starting with complete on-site installation, will boost your productivity
from the start and keep you running at maximum efficiency. Support for system or board level products can be
tailored to match your needs, from complete on-site repair and maintenance support to economical carry-in or
mail-in factory service.


Intel can provide support service not only for Intel systems and emulators, but also for equipment in
your development lab, and can service your product for your end-user/customer.


SOFTWARE SUPPORT SERVICES 


Software products are supported by our Technical Information Phone Service (TIPS), which has a special toll-free
number to provide you with direct, ready information on known, documented problems and deficiencies, as
well as work-arounds, patches and other solutions.


Intel’s software support consists of two levels of contracts. Standard support includes TIPS (Technical
Information Phone Service), updates and subscription service (product-specific troubleshooting guides and
COMMENTS Magazine). Basic support consists of updates and the subscription service. Contracts are sold in
environments which represent product groupings (e.g., iRMX® environment).


NETWORK SERVICE AND SUPPORT 


Today’s broad spectrum of powerful networking capabilities is only as good as the customer support provided
by the vendor. Intel offers network services and support structured to meet a wide variety of end-user
computing needs, from a ground-up design of your network’s physical and logical layout through implementation,
installation and network-wide maintenance. From software products to turn-key system solutions, Intel offers the
customer a complete networked solution. With over 10 years of network experience in both the commercial
and Government arenas, Intel provides network products, services and support that give you the most optimized
network offering in the industry.


CONSULTING SERVICES 


Intel provides field system engineering consulting services for any phase of your development or application 
effort. You can use our system engineers in a variety of ways ranging from assistance in using a new product, 
developing an application, personalizing training and customizing an Intel product to providing technical and 
management consulting. Systems Engineers are well versed in technical areas such as microcommunications, 
real-time applications, embedded microcontrollers, and network services. You know your application needs; 
we know our products. Working together we can help you get a successful product to market in the least 
possible time. 


CUSTOMER TRAINING 


Intel offers a wide range of instructional programs covering various aspects of system design and
implementation. In just three to ten days, a limited number of individuals learn more in a single workshop than in weeks of
self-study. For optimum convenience, workshops are scheduled regularly at Training Centers worldwide, or we
can take our workshops to you for on-site instruction. Covering a wide variety of topics, Intel’s major course
categories include: architecture and assembly language, programming and operating systems, BITBUS™ and
LAN applications.
DATA SHEET DESIGNATIONS


Intel uses various data sheet markings to designate each phase of the document as it 
relates to the product. The marking appears in the upper, right-hand corner of the data 
sheet. The following is the definition of these markings: 


Data Sheet Marking      Description

Product Preview         Contains information on products in the design phase of
                        development. Do not finalize a design with this
                        information. Revised information will be published when
                        the product becomes available.

Advanced Information    Contains information on products being sampled or in
                        the initial production phase of development.*

Preliminary             Contains preliminary information on new products in
                        production.*

No Marking              Contains information on products in full production.


*Specifications within these data sheets are subject to change without notice. Verify with your local Intel sales 
office that you have the latest data sheet before finalizing a design. 


Overview 


8086 Microprocessor Family 


80286 Microprocessor Family 


Development Tools for the 
8086, 80186, 80188, and 
80286 


Intel386™ Family


i860™ Microprocessor Family 


i750™ Video Processor Family


Development Tools for the 
80386 and 80486 


Table of Contents 


Alphanumeric Index


CHAPTER 1 


Overview 
Introduction


CHAPTER 2 


8086 Microprocessor Family 
DATA SHEETS 
8086 16-Bit HMOS Microprocessor 8086/8086-2/8086-1*
80C86A 16-Bit CHMOS Microprocessor
8088 8-Bit HMOS Microprocessor 8088/8088-2
80C88A 8-Bit CHMOS Microprocessor
8087 Math Coprocessor


CHAPTER 3 


80286 Microprocessor Family 
DATA SHEETS 

80C286 High Performance Microprocessor with Memory Management and Protection
80286 High Performance Microprocessor with Memory Management and Protection
80287XL/XLT CHMOS III Math CoProcessor
82C288 Bus Controller for 80286 Processors (82C288-12, 82C288-10, 82C288-8)
82C284 Clock Generator and Ready Interface for 80286 Processors (82C284-12, 82C284-10, 82C284-8)


CHAPTER 4 


Development Tools for the 8086, 80186, 80188, and 80286 

LANGUAGES AND SOFTWARE DEVELOPMENT TOOLS 
8086/80186 Software Development Packages
iC-86/286 C Compiler
AEDIT Source Code and Text Editor
iPAT Performance Analysis Tool

IN-CIRCUIT EMULATORS
I2ICE In-Circuit Emulation System
ICE-186 and ICE-188 In-Circuit Emulators
ICE-186EB and ICE-188EB In-Circuit Emulators
ICE-286 In-Circuit Emulator


CHAPTER 5 


INTEL386™ Family 
DATA SHEETS 
i486 Microprocessor
485Turbocache Module i486 Microprocessor Cache Upgrade
82485 Second Level Cache Controller for the i486 Microprocessor
AP-447 A Memory Subsystem for the i486 CPU Including Second Level Cache
386 DX Microprocessor High Performance 32-Bit CHMOS Microprocessor with Integrated Memory Management
387 DX Math Coprocessor
82395DX High Performance 386 Smart Cache
82385 High Performance 32-Bit Cache Controller
AP-442 33 MHz 386 System Design Considerations
386 SL Microprocessor SuperSet
386 SX Microprocessor
387 SX Math Coprocessor



82395SX 386 SX Smart Cache
82385SX High Performance Cache Controller
82380 High Performance 32-Bit DMA Controller with Integrated System Support Peripherals
376 High Performance 32-Bit Embedded Processor
82370 Integrated System Peripheral

CHAPTER 6


i860™ Microprocessor Family
i860 64-Bit Microprocessor
AP-434 Using i860 Microprocessor Graphics Instructions for 3-D Rendering
AP-435 Fast Fourier Transforms on the i860 Microprocessor


CHAPTER 7


i750™ Video Processor Family 
82750PB Pixel Processor


CHAPTER 8 


Development Tools for the 80386 and 80486 
LANGUAGES AND SOFTWARE DEVELOPMENT TOOLS 


Intel386/i486 Family Development Support
Intel 376 Family Development Support
ICD-486 In-Circuit Debugger

IN-CIRCUIT EMULATORS
Intel386 Family of In-Circuit Emulators
Intel i486 In-Circuit Emulator


Alphanumeric Index 


376 High Performance 32-Bit Embedded Processor ..... 5-1217
386 DX Microprocessor High Performance 32-Bit CHMOS Microprocessor with Integrated Memory Management ..... 5-287
386 SL Microprocessor SuperSet ..... 5-731
386 SX Microprocessor ..... 5-864
387 DX Math Coprocessor ..... 5-425
387 SX Math Coprocessor ..... 5-962
485Turbocache Module i486 Microprocessor Cache Upgrade ..... 5-177
80286 High Performance Microprocessor with Memory Management and Protection ..... 3-60
80287XL/XLT CHMOS III Math CoProcessor ..... 3-116
8086 16-Bit HMOS Microprocessor 8086/8086-2/8086-1* ..... 2-1
8086/80186 Software Development Packages ..... 4-1
8087 Math Coprocessor ..... 2-122
8088 8-Bit HMOS Microprocessor 8088/8088-2 ..... 2-60
80C286 High Performance Microprocessor with Memory Management and Protection ..... 3-1
80C86A 16-Bit CHMOS Microprocessor ..... 2-31
80C88A 8-Bit CHMOS Microprocessor ..... 2-90
82370 Integrated System Peripheral ..... 5-1312
82380 High Performance 32-Bit DMA Controller with Integrated System Support Peripherals ..... 5-1080
82385 High Performance 32-Bit Cache Controller ..... 5-547
82385SX High Performance Cache Controller ..... 5-1003
82395DX High Performance 386 Smart Cache ..... 5-466
82395SX 386 SX Smart Cache ..... 5-1002
82485 Second Level Cache Controller for the i486 Microprocessor ..... 5-206
82750DB Display Processor ..... 7-3
82750PB Pixel Processor ..... 7-1
82C284 Clock Generator and Ready Interface for 80286 Processors (82C284-12, 82C284-10, 82C284-8) ..... 3-169
82C288 Bus Controller for 80286 Processors (82C288-12, 82C288-10, 82C288-8) ..... 3-148
AEDIT Source Code and Text Editor ..... 4-12
AP-434 Using i860 Microprocessor Graphics Instructions for 3-D Rendering ..... 6-81
AP-435 Fast Fourier Transforms on the i860 Microprocessor ..... 6-96
AP-442 33 MHz 386 System Design Considerations ..... 5-620
AP-447 A Memory Subsystem for the i486 CPU Including Second Level Cache ..... 5-207
I2ICE In-Circuit Emulation System ..... 4-18
i486 Microprocessor ..... 5-1
i860 64-Bit Microprocessor ..... 6-1
iC-86/286 C Compiler ..... 4-8
ICD-486 In-Circuit Debugger ..... 8-23
ICE-186 and ICE-188 In-Circuit Emulators ..... 4-22
ICE-186EB and ICE-188EB In-Circuit Emulators ..... 4-25
ICE-286 In-Circuit Emulator ..... 4-32
Intel 376 Family Development Support ..... 8-11
Intel386 Family of In-Circuit Emulators ..... 8-29
Intel386/i486 Family Development Support ..... 8-1
Intel i486 In-Circuit Emulator ..... 8-55
iPAT Performance Analysis Tool ..... 4-14


Intel386™ Family


i486™ MICROPROCESSOR


■ Binary Compatible with Large Software Base
  - MS-DOS*, OS/2**, Windows
  - UNIX*** System V/386
  - iRMX®, iRMK™ Kernels
■ High Integration Enables On-Chip
  - 8 Kbyte Code and Data Cache
  - Floating Point Unit
  - Paged, Virtual Memory Management
■ Easy To Use
  - Built-In Self Test
  - Hardware Debugging Support
  - Intel Software Support
  - Extensive Third Party Software Support
■ 168-Pin Grid Array Package
■ High Performance Design
  - Frequent Instructions Execute in One Clock
  - 25 MHz and 33 MHz Clock Frequencies
  - 80 and 106 Mbyte/Sec Burst Bus
  - CHMOS IV Process Technology
  - Dynamic Bus Sizing for 8-, 16- and 32-Bit Busses
■ Complete 32-Bit Architecture
  - Address and Data Busses
  - Registers
  - 8-, 16- and 32-Bit Data Types
■ Multiprocessor Support
  - Multiprocessor Instructions
  - Cache Consistency Protocols
  - Support for Second Level Cache


The i486™ CPU offers the highest performance for DOS, OS/2, Windows and UNIX System V/386
applications. It is 100% binary compatible with the 386™ CPU. Over one million transistors integrate cache memory,
floating point hardware and memory management on-chip while retaining binary compatibility with previous
members of the X86 architectural family. Frequently used instructions execute in one cycle, resulting in RISC
performance levels. An 8 Kbyte unified code and data cache combined with a 106 Mbyte/Sec burst bus at
33.3 MHz ensure high system throughput even with inexpensive DRAMs.
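
The 80 and 106 Mbyte/Sec burst figures quoted above can be reproduced with simple arithmetic. The sketch below assumes that a 16-byte (128-bit) cache line is transferred in five clocks, two for the first doubleword and one for each of the remaining three; that 2-1-1-1 timing is an assumption of this example and is not stated in this section. The function name is hypothetical.

```python
def burst_bandwidth_mbytes_per_sec(clock_mhz, line_bytes=16, clocks_per_line=5):
    """Peak burst transfer rate in Mbyte/sec for one cache line fill.

    Assumes (not stated in the text) that a 16-byte line takes 5 clocks:
    2 clocks for the first 32-bit transfer and 1 clock for each of the next 3.
    """
    return line_bytes * clock_mhz / clocks_per_line


for mhz in (25.0, 33.3):
    rate = burst_bandwidth_mbytes_per_sec(mhz)
    print(f"{mhz} MHz -> {rate:.1f} Mbyte/sec")
```

Under these assumptions the 25 MHz part works out to 80 Mbyte/Sec and the 33.3 MHz part to roughly 106 Mbyte/Sec, matching the figures in the feature list.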


New features enhance multiprocessing systems. New instructions speed manipulation of memory based
semaphores. On-chip hardware ensures cache consistency and provides hooks for multilevel caches.


The built in self test extensively tests on-chip logic, cache memory and the on-chip paging translation cache. 
Debug features include breakpoint traps on code execution and data accesses. 


[Figure: i486™ Microprocessor Pipelined 32-Bit Microarchitecture. Block diagram; surviving labels include the 64-Bit Interunit Transfer Bus, the 32-Bit Data Bus, and the Bus Interface (A2-A31).]
iRMX, iRMK, 386, 387, 486, i486 are trademarks of Intel Corporation.
*MS-DOS® is a registered trademark of Microsoft Corporation. 
**OS/2™ is a trademark of Microsoft Corporation. 
Pin Cross Reference by Pin Name


Intel i486™ MICROPROCESSOR 


QUICK PIN REFERENCE 


What follows is a brief pin description. For detailed signal descriptions refer to Section 6. 


| Symbol | Type | Name and Function


CLK | Clock provides the fundamental timing and the internal operating frequency for the 486
microprocessor. All external timing parameters are specified with respect to the rising
edge of CLK.


A31-A2 are the address lines of the microprocessor. A31-A2, together with the byte
enables BE0#-BE3#, define the physical area of memory or input/output space
accessed. Address lines A31-A4 are used to drive addresses into the microprocessor to
perform cache line invalidations. Input signals must meet setup and hold times t22 and
t23. A31-A2 are not driven during bus or address hold.


BE0-3# | The byte enable signals indicate active bytes during read and write cycles. During the
first cycle of a cache fill, the external system should assume that all byte enables are
active. BE3# applies to D24-D31, BE2# applies to D16-D23, BE1# applies to D8-D15
and BE0# applies to D0-D7. BE0#-BE3# are active LOW and are not driven
during bus hold.
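
As a minimal sketch of the byte enable mapping described above, the following decodes the four active-LOW byte enable pins into the data lines they select. The function name and the packing of the pin levels into a 4-bit integer are assumptions made for illustration only.

```python
# Data lines covered by each byte enable, as listed in the pin description.
BYTE_LANES = {0: "D0-D7", 1: "D8-D15", 2: "D16-D23", 3: "D24-D31"}


def active_byte_lanes(be_pins):
    """Return the data lanes selected by the byte enables.

    be_pins packs the pin levels of BE0#..BE3# into 4 bits (bit n = BEn#).
    The pins are active LOW, so a 0 bit means that byte is enabled.
    """
    return [BYTE_LANES[n] for n in range(4) if not (be_pins >> n) & 1]


# BE0# and BE1# driven LOW, BE2# and BE3# HIGH: the low word is accessed.
print(active_byte_lanes(0b1100))
```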


DATA BUS

D31-D0 | I/O | These are the data lines for the 486 microprocessor. Lines D0-D7 define the least
significant byte of the data bus while lines D24-D31 define the most significant byte of
the data bus. These signals must meet setup and hold times t22 and t23 for proper
operation on reads. These pins are driven during the second and subsequent clocks of
write cycles.

DATA PARITY

DP0-DP3 | I/O | There is one data parity pin for each byte of the data bus. Data parity is generated on all
write data cycles with the same timing as the data driven by the 486 microprocessor.
Even parity information must be driven back into the microprocessor on the data parity
pins with the same timing as read information to ensure that the correct parity check
status is indicated by the 486 microprocessor. The signals read on these pins do not
affect program execution.

Input signals must meet setup and hold times t22 and t23. DP0-DP3 should be
connected to Vcc through a pullup resistor in systems which do not use parity. DP0-DP3
are active HIGH and are driven during the second and subsequent clocks of write cycles.


PCHK# | Parity status is driven on the PCHK# pin the clock after ready for read operations. The
parity status is for data sampled at the end of the previous clock. A parity error is
indicated by PCHK# being LOW. Parity status is only checked for enabled bytes as
indicated by the byte enable and bus size signals. PCHK# is valid only in the clock
immediately after read data is returned to the microprocessor. At all other times PCHK#
is inactive (HIGH). PCHK# is never floated.
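
The even-parity scheme described for the data parity pins and the parity status output can be sketched as follows. This is an illustrative model only: the function names are hypothetical, and it assumes that "even parity" means each data byte together with its parity bit contains an even number of 1s.

```python
def even_parity_bit(byte):
    """Parity bit that makes the total count of 1s (8 data bits + parity) even."""
    return bin(byte & 0xFF).count("1") & 1


def data_parity(word):
    """Return [DP0, DP1, DP2, DP3] for a 32-bit word; DPn covers data byte n."""
    return [even_parity_bit(word >> (8 * n)) for n in range(4)]


def pchk_level(word, dp, enabled_bytes):
    """Model the parity status output: 0 (LOW) on a parity error, else 1 (HIGH).

    Only the byte numbers listed in enabled_bytes are checked, mirroring the
    statement that parity is checked only for enabled bytes.
    """
    expected = data_parity(word)
    return 0 if any(dp[n] != expected[n] for n in enabled_bytes) else 1


word = 0x01000003
print(data_parity(word))
print(pchk_level(word, [1, 0, 0, 1], enabled_bytes=[0]))
```

In the second call the parity bit for byte 0 is deliberately wrong, so the modelled status goes LOW; the same bad bit is ignored if byte 0 is not among the enabled bytes.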


BUS CYCLE DEFINITION 


The memory/input-output, data/control and write/read lines are the primary bus
definition signals. These signals are driven valid as the ADS# signal is asserted.

M/IO#   D/C#   W/R#   Bus Cycle Initiated
  0      0      0     Interrupt Acknowledge
  0      0      1     Halt/Special Cycle
  0      1      0     I/O Read
  0      1      1     I/O Write
  1      0      0     Code Read
  1      0      1     Reserved
  1      1      0     Memory Read
  1      1      1     Memory Write


The bus definition signals are not driven during bus hold and follow the timing of the 
address bus. Refer to Section 7.2.11 for a description of the special bus cycles. 
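
A decoder for the bus cycle definition signals can be sketched as a lookup table. The encoding below assumes that the eight bus cycles are listed in ascending binary order of (M/IO#, D/C#, W/R#), with 1 meaning the pin is HIGH; the names BUS_CYCLES and bus_cycle are hypothetical.

```python
# Keys are (M/IO#, D/C#, W/R#) pin levels, 1 = HIGH, 0 = LOW.
# Assumption: the cycles are listed in ascending binary order of the three pins.
BUS_CYCLES = {
    (0, 0, 0): "Interrupt Acknowledge",
    (0, 0, 1): "Halt/Special Cycle",
    (0, 1, 0): "I/O Read",
    (0, 1, 1): "I/O Write",
    (1, 0, 0): "Code Read",
    (1, 0, 1): "Reserved",
    (1, 1, 0): "Memory Read",
    (1, 1, 1): "Memory Write",
}


def bus_cycle(m_io, d_c, w_r):
    """Name the bus cycle selected by the three definition pins."""
    return BUS_CYCLES[(m_io, d_c, w_r)]


print(bus_cycle(1, 1, 0))
print(bus_cycle(0, 1, 1))
```

External decode logic would sample these pins when ADS# is asserted, since that is when the text says they are driven valid.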


LOCK# | The bus lock pin indicates that the current bus cycle is locked. The 486 microprocessor
will not allow a bus hold when LOCK# is asserted (but address holds are allowed).
LOCK# goes active in the first clock of the first locked bus cycle and goes inactive after
the last clock of the last locked bus cycle. The last locked cycle ends when ready is
returned. LOCK# is active LOW and is not driven during bus hold. Locked read cycles
will not be transformed into cache fill cycles if KEN# is returned active.


PLOCK# | The pseudo-lock pin indicates that the current bus transaction requires more than one
bus cycle to complete. Examples of such operations are floating point long reads and
writes (64 bits), segment table descriptor reads (64 bits), in addition to cache line fills
(128 bits). The 486 microprocessor will drive PLOCK# active until the addresses for the
last bus cycle of the transaction have been driven regardless of whether RDY# or
BRDY# have been returned.

Normally PLOCK# and BLAST# are inverse of each other. However, during the first bus
cycle of a 64-bit floating point write, both PLOCK# and BLAST# will be asserted.
PLOCK# is a function of the BS8#, BS16# and KEN# inputs. PLOCK# should be
sampled only in the clock ready is returned. PLOCK# is active LOW and is not driven
during bus hold.


BUS CONTROL

ADS# | The address status output indicates that a valid bus cycle definition and address are
available on the cycle definition lines and address bus. ADS# is driven active in the same
clock as the addresses are driven. ADS# is active LOW and is not driven during bus hold.

RDY# | The non-burst ready input indicates that the current bus cycle is complete. RDY#
indicates that the external system has presented valid data on the data pins in response
to a read or that the external system has accepted data from the 486 microprocessor in
response to a write. RDY# is ignored when the bus is idle and at the end of the first clock
of the bus cycle.

RDY# is active during address hold. Data can be returned to the processor while AHOLD
is active.

RDY# is active LOW, and is not provided with an internal pullup resistor. RDY# must
satisfy setup and hold times t16 and t17 for proper chip operation.


BURST CONTROL 


BRDY# | The burst ready input performs the same function during a burst cycle that RDY#
performs during a non-burst cycle. BRDY# indicates that the external system has
presented valid data in response to a read or that the external system has accepted data
in response to a write. BRDY# is ignored when the bus is idle and at the end of the first
clock in a bus cycle.

BRDY# is sampled in the second and subsequent clocks of a burst cycle. The data
presented on the data bus will be strobed into the microprocessor when BRDY# is
sampled active. If RDY# is returned simultaneously with BRDY#, BRDY# is ignored and
the burst cycle is prematurely aborted.

BRDY# is active LOW and is provided with a small pullup resistor. BRDY# must satisfy
the setup and hold times t16 and t17.

BLAST# | The burst last signal indicates that the next time BRDY# is returned the burst bus cycle is
complete. BLAST# is active for both burst and non-burst bus cycles. BLAST# is active
LOW and is not driven during bus hold.




INTERRUPTS 


RESET | The reset input forces the 486 microprocessor to begin execution at a known state. The
microprocessor cannot begin execution of instructions until at least 1 ms after Vcc and
CLK have reached their proper DC and AC specifications. The RESET pin should remain
active during this time to ensure proper microprocessor operation. RESET is active HIGH.
RESET is asynchronous but must meet setup and hold times t20 and t21 for recognition in
any specific clock.


INTR | The maskable interrupt indicates that an external interrupt has been generated. If the
internal interrupt flag is set in EFLAGS, active interrupt processing will be initiated. The
486 microprocessor will generate two locked interrupt acknowledge bus cycles in
response to the INTR pin going active. INTR must remain active until the interrupt
acknowledges have been performed to assure that the interrupt is recognized.

INTR is active HIGH and is not provided with an internal pulldown resistor. INTR is
asynchronous, but must meet setup and hold times t20 and t21 for recognition in any
specific clock.


NMI | The non-maskable interrupt request signal indicates that an external non-maskable
interrupt has been generated. NMI is rising edge sensitive. NMI must be held LOW for at
least four CLK periods before this rising edge. NMI is not provided with an internal
pulldown resistor. NMI is asynchronous, but must meet setup and hold times t20 and t21
for recognition in any specific clock.


BUS ARBITRATION 




BREQ | The internal cycle pending signal indicates that the 486 microprocessor has internally
generated a bus request. BREQ is generated whether or not the 486 microprocessor is
driving the bus. BREQ is active HIGH and is never floated.


HOLD | The bus hold request allows another bus master complete control of the 486
microprocessor bus. In response to HOLD going active the 486 microprocessor will float
most of its output and input/output pins. HLDA will be asserted after completing the
current bus cycle, burst cycle or sequence of locked cycles. The 486 microprocessor will
remain in this state until HOLD is deasserted. HOLD is active HIGH and is not provided with
an internal pulldown resistor. HOLD must satisfy setup and hold times t18 and t19 for
proper operation.


HLDA | Hold acknowledge goes active in response to a hold request presented on the HOLD pin.
HLDA indicates that the 486 microprocessor has given the bus to another local bus
master. HLDA is driven active in the same clock that the 486 microprocessor floats its
bus. HLDA is driven inactive when leaving bus hold. HLDA is active HIGH and remains
driven during bus hold.


BOFF# | The backoff input forces the 486 microprocessor to float its bus in the next clock. The
microprocessor will float all pins normally floated during bus hold but HLDA will not be
asserted in response to BOFF#. BOFF# has higher priority than RDY# or BRDY#; if
both are returned in the same clock, BOFF# takes effect. The microprocessor remains in
bus hold until BOFF# is negated. If a bus cycle was in progress when BOFF# was
asserted the cycle will be restarted. BOFF# is active LOW and must meet setup and hold
times t18 and t19 for proper operation.


CACHE INVALIDATION 


AHOLD | The address hold request allows another bus master access to the 486 microprocessor’s
address bus for a cache invalidation cycle. The 486 microprocessor will stop driving its
address bus in the clock following AHOLD going active. Only the address bus will be
floated during address hold; the remainder of the bus will remain active. AHOLD is active
HIGH and is provided with a small internal pulldown resistor. For proper operation AHOLD
must meet setup and hold times t18 and t19.


EADS# | This signal indicates that a valid external address has been driven onto the 486
microprocessor address pins. This address will be used to perform an internal cache
invalidation cycle. EADS# is active LOW and is provided with an internal pullup resistor.
EADS# must satisfy setup and hold times t12 and t13 for proper operation.


CACHE CONTROL

KEN# | The cache enable pin is used to determine whether the current cycle is cacheable. When
the 486 microprocessor generates a cycle that can be cached and KEN# is active, the
cycle will become a cache line fill cycle. Returning KEN# active one clock before ready
during the last read in the cache line fill will cause the line to be placed in the on-chip
cache. KEN# is active LOW and is provided with a small internal pullup resistor. KEN#
must satisfy setup and hold times t14 and t15 for proper operation.

FLUSH# | The cache flush input forces the 486 microprocessor to flush its entire internal cache.
FLUSH# is active LOW and need only be asserted for one clock. FLUSH# is
asynchronous but setup and hold times t20 and t21 must be met for recognition in any
specific clock. FLUSH# being sampled LOW in the clock before the falling edge of RESET
causes the 486 microprocessor to enter the tri-state test mode.

PAGE CACHEABILITY

PWT, PCD | The page write-through and page cache disable pins reflect the state of the page
attribute bits, PWT and PCD, in the page table entry or page directory entry. If paging is
disabled or for cycles that are not paged, PWT and PCD reflect the state of the PWT and
PCD bits in control register 3. PWT and PCD have the same timing as the cycle definition
pins (M/IO#, D/C# and W/R#). PWT and PCD are active HIGH and are not driven
during bus hold. PCD is masked by the cache disable bit (CD) in Control Register 0.


NUMERIC ERROR REPORTING 


FERR# | The floating point error pin is driven active when a floating point error occurs. FERR# is
similar to the ERROR# pin on the 387™ math coprocessor. FERR# is included for
compatibility with systems using DOS type floating point error reporting. FERR# will not
go active if FP errors are masked in FPU register. FERR# is active LOW, and is not
floated during bus hold.

IGNNE# | When the ignore numeric error pin is asserted the 486 microprocessor will ignore a
numeric error and continue executing non-control floating point instructions, but FERR#
will still be activated by the i486. When IGNNE# is deasserted the 486 microprocessor
will freeze on a non-control floating point instruction, if a previous floating point instruction
caused an error. IGNNE# has no effect when the NE bit in control register 0 is set.
IGNNE# is active LOW and is provided with a small internal pullup resistor. IGNNE# is
asynchronous but setup and hold times t20 and t21 must be met to ensure recognition on
any specific clock.


BUS SIZE CONTROL

BS16#, BS8# | The bus size 16 and bus size 8 pins (bus sizing pins) cause the 486 microprocessor to run
multiple bus cycles to complete a request from devices that cannot provide or accept 32
bits of data in a single cycle. The bus sizing pins are sampled every clock. The state of
these pins in the clock before ready is used by the 486 microprocessor to determine the
bus size. These signals are active LOW and are provided with internal pullup resistors.
These inputs must satisfy setup and hold times t14 and t15 for proper operation.
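
To illustrate why a narrower device forces multiple bus cycles, the sketch below counts how many bus-width units a request touches within one 32-bit doubleword. It is a simplified model for illustration, not a description of the processor's actual cycle sequencing; the function name and parameters are hypothetical.

```python
def bus_cycles_needed(offset, size, bus_bytes):
    """Count the bus cycles a request needs on a device of the given width.

    offset:    byte offset of the request within its doubleword (0..3)
    size:      number of bytes requested (the request must fit in the doubleword)
    bus_bytes: device width in bytes: 1 (8-bit), 2 (16-bit) or 4 (32-bit)

    Simplified model: one cycle per bus-width unit that the request touches.
    """
    if bus_bytes not in (1, 2, 4):
        raise ValueError("bus_bytes must be 1, 2 or 4")
    if offset < 0 or size < 1 or offset + size > 4:
        raise ValueError("request must lie within one doubleword")
    first_unit = offset // bus_bytes
    last_unit = (offset + size - 1) // bus_bytes
    return last_unit - first_unit + 1


for width in (4, 2, 1):
    print(width * 8, "bit device:", bus_cycles_needed(0, 4, width), "cycle(s)")
```

Under this model a full 32-bit request completes in one cycle on a 32-bit device, two on a 16-bit device and four on an 8-bit device.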
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QUICK PIN REFERENCE (Continued) 


| Symbol | Type | Name and Function 


ADDRESS MASK 


When the address bit 20 mask pin is asserted, the 486 microprocessor masks physical 
address bit 20 (A20) before performing a lookup to the internal cache or driving a memory 
cycle on the bus. A2OM# emulates the address wraparound at one Mbyte which occurs 
on the 8086. A20M # is active LOW and should be asserted only when the processor is in 
real mode. This pin is asynchronous but should meet setup and hold times tao and to, for 
recognition in any specific clock. For proper operation, A2ZQOM# should be sampled high at 
the falling edge of RESET. 
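
The effect of A20M# on an address can be sketched as a single bit mask. The following Python model is illustrative only: the function name `apply_a20m` and the Boolean `a20m_asserted` argument are assumptions made for this example and are not part of the device interface.

```python
A20_BIT = 1 << 20  # physical address bit 20


def apply_a20m(physical_address, a20m_asserted):
    """Return the address used for the cache lookup or bus cycle.

    When A20M# is asserted (modelled here as True), bit 20 is forced
    to 0, which reproduces the 8086 wraparound at one Mbyte.
    """
    if a20m_asserted:
        return physical_address & ~A20_BIT
    return physical_address


if __name__ == "__main__":
    # Real Mode address FFFF:FFFF = 0xFFFF0 + 0xFFFF = 0x10FFEF.
    address = (0xFFFF << 4) + 0xFFFF
    print(hex(apply_a20m(address, True)))   # wraps below one Mbyte
    print(hex(apply_a20m(address, False)))  # passes through unchanged
```

With the pin asserted, an access just above one Mbyte lands at the bottom of memory, as it would on an 8086.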


Table 1.1. Output Pins

Name                  Active Level   When Floated
BREQ
HLDA
BE0#-BE3#                            Bus Hold
PWT, PCD                             Bus Hold
W/R#, D/C#, M/IO#                    Bus Hold
LOCK#                                Bus Hold
PLOCK#                               Bus Hold
ADS#                                 Bus Hold
BLAST#                               Bus Hold
PCHK#
FERR#
A2-A3                 HIGH           Bus, Address Hold


Table 1.2. Input Pins

Name                  Active Level   Synchronous/Asynchronous
CLK
RESET                                Asynchronous
HOLD                                 Synchronous
AHOLD                                Synchronous
EADS#                                Synchronous
BOFF#                                Synchronous
FLUSH#                               Asynchronous
A20M#                                Asynchronous
BS16#, BS8#                          Synchronous
KEN#                                 Synchronous
RDY#                                 Synchronous
BRDY#                                Synchronous
INTR                                 Asynchronous
NMI                                  Asynchronous
IGNNE#                               Asynchronous


Table 1.3. Input/Output Pins

Name                  Active Level   When Floated
D0-D31                HIGH           Bus Hold
DP0-DP3               HIGH           Bus Hold
A4-A31                HIGH           Bus, Address Hold


Table 1.4. Component and Revision ID

i486™ CPU Stepping Name   Component ID   Revision ID
B3                        04             01
B4                        04             01
B5                        04             01
B6                        04             01
C0                        04             02


2.0 ARCHITECTURAL OVERVIEW


The 486 microprocessor is a 32-bit architecture with
on-chip memory management, floating point and
cache memory units.


The 486 microprocessor contains all the features of
the 386™ microprocessor with enhancements to
increase performance. The instruction set includes the
complete 386 microprocessor instruction set along
with extensions to serve new applications. The
on-chip memory management unit (MMU) is completely
compatible with the 386 microprocessor MMU. The
486 microprocessor brings the 387™ math
coprocessor on-chip. All software written for the 386
microprocessor, 387 math coprocessor and previous
members of the 86/87 architectural family will run on
the 486 microprocessor without any modifications.


Several enhancements have been added to the 486
microprocessor to increase performance. On-chip
cache memory allows frequently used data and
code to be stored on-chip, reducing accesses to the
external bus. RISC design techniques have been
used to reduce instruction cycle times. A burst bus
feature enables fast cache fills. All of these features,
combined, lead to performance greater than twice
that of a 386 microprocessor.


The memory management unit (MMU) consists of a
segmentation unit and a paging unit. Segmentation
allows management of the logical address space by
providing easy data and code relocatability and
efficient sharing of global resources. The paging
mechanism operates beneath segmentation and is
transparent to the segmentation process. Paging is
optional and can be disabled by system software. Each
segment can be divided into one or more 4 Kbyte
pages. To implement a virtual memory system,
the 486 microprocessor supports full restartability
for all page and segment faults.


Memory is organized into one or more variable
length segments, each up to four gigabytes (2^32
bytes) in size. A segment can have attributes
associated with it which include its location, size, type
(i.e., stack, code or data), and protection characteristics.
Each task on a 486 microprocessor can have a
maximum of 16,381 segments, each up to four gigabytes
in size. Thus each task has a maximum of 64
terabytes (trillion bytes) of virtual memory.
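
The 64 terabyte figure follows directly from the two limits quoted above. The short Python sketch below only reproduces that arithmetic; the constant and function names are assumptions chosen for the example.

```python
MAX_SEGMENTS_PER_TASK = 16_381   # per-task segment count quoted in the text
MAX_SEGMENT_BYTES = 2 ** 32      # four gigabytes per segment


def max_virtual_bytes(segments=MAX_SEGMENTS_PER_TASK,
                      segment_bytes=MAX_SEGMENT_BYTES):
    """Upper bound on per-task virtual memory: segments times segment size."""
    return segments * segment_bytes


if __name__ == "__main__":
    total = max_virtual_bytes()
    # Express the product in units of 2**40 bytes (terabytes).
    print(total, "bytes =", total / 2 ** 40, "terabytes")
```

The product is slightly below 2^46 bytes, which the text rounds to 64 terabytes.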


The segmentation unit provides four levels of
protection for isolating and protecting applications and
the operating system from each other. The hardware
enforced protection allows the design of systems
with a high degree of integrity.


The 486 microprocessor has two modes of operation:
Real Address Mode (Real Mode) and Protected
Virtual Address Mode (Protected Mode). In
Real Mode the 486 microprocessor operates as a
very fast 8086. Real Mode is required primarily to set
up the processor for Protected Mode operation.
Protected Mode provides access to the sophisticated
memory management, paging and privilege
capabilities of the processor.


Within Protected Mode, software can perform a task 
switch to enter into tasks designated as Virtual 8086 
Mode tasks. Each virtual 8086 task behaves with 
8086 semantics, allowing 8086 software (an applica- 
tion program or an entire operating system) to exe- 
cute. 


The on-chip floating point unit operates in parallel
with the arithmetic and logic unit and provides
arithmetic instructions for a variety of numeric data types.
It executes numerous built-in transcendental
functions (e.g., tangent, sine, cosine, and log functions).
The floating point unit fully conforms to the ANSI/
IEEE standard 754-1985 for floating point arithmetic.


The on-chip cache is 8 Kbytes in size. It is 4-way set
associative and follows a write-through policy. The 
on-chip cache includes features to provide flexibility 
in external memory system design. Individual pages 
can be designated as cacheable or non-cacheable 
by software or hardware. The cache can also be en- 
abled and disabled by software or hardware. 


Finally, the 486 microprocessor has features to
facilitate high performance hardware designs. The 1X
clock eases high frequency board level designs. The
burst bus feature enables fast cache fills. These
features are described beginning in Section 6.


2.1 Register Set


The 486 microprocessor register set includes all the
registers contained in the 386 microprocessor and
the 387 math coprocessor. The register set can be
split into the following categories:


Base Architecture Registers
    General Purpose Registers
    Instruction Pointer
    Flags Register
    Segment Registers


Systems Level Registers
    Control Registers
    System Address Registers


Floating Point Registers
    Data Registers
    Tag Word
    Status Word
    Instruction and Data Pointers
    Control Word


Debug and Test Registers


The base architecture and floating point registers
are accessible by the applications program. The
system level registers are only accessible at privilege
level 0 and are used by the systems level program.
The debug and test registers are also only
accessible at privilege level 0.


2.1.1 BASE ARCHITECTURE REGISTERS 


Figure 2.1 shows the 486 microprocessor base ar- 
chitecture registers. The contents of these registers 
are task-specific and are automatically loaded with a 
new context upon a task switch operation. 


[Figure: general purpose registers; segment registers
CS (code segment), SS (stack segment), DS, ES, FS and
GS (data segments), bits 15-0; instruction pointer,
bits 31-0; flags register EFLAGS.]


Figure 2.1. Base Architecture Registers


The base architecture includes six directly accessible
descriptors, each specifying a segment up to 4
Gbytes in size. The descriptors are indicated by the 
selector values placed in the 486 microprocessor 
segment registers. Various selector values can be 
loaded as a program executes. 


The selectors are also task-specific, so the segment 
registers are automatically loaded with new context 
upon a task switch operation. 


2.1.1.1 General Purpose Registers 


The eight 32-bit general purpose registers are 
shown in Figure 2.1. These registers hold data or 
address quantities. The general purpose registers 
can support data operands of 1, 8, 16 and 32 bits, 
and bit fields of 1 to 32 bits. Address operands of 16 
and 32 bits are supported. The 32-bit registers are 
named EAX, EBX, ECX, EDX, ESI, EDI, EBP and 
ESP. 


The least significant 16 bits of the general purpose 
registers can be accessed separately by using the 
16-bit names of the registers AX, BX, CX, DX, SI, DI,
BP and SP. The upper 16 bits of the register are not 
changed when the lower 16 bits are accessed sepa- 
rately. 


Finally 8-bit operations can individually access the 
lowest byte (bits 0-7) and the higher byte (bits 8- 
15) of the general purpose registers AX, BX, CX and 
DX. The lowest bytes are named AL, BL, CL and DL 
respectively. The higher bytes are named AH, BH, 
CH and DH respectively. The individual byte acces- 
sibility offers additional flexibility for data operations 
but is not used for effective address calculation. 
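
The way the 16-bit and 8-bit names overlay the 32-bit registers can be sketched with masks and shifts. This Python model is only an illustration of the bit ranges given above; the helper names are hypothetical and a register is represented as a plain integer.

```python
def read_low16(reg32):
    """AX-style view: bits 0-15 of the 32-bit register."""
    return reg32 & 0xFFFF


def write_low16(reg32, value):
    """Write the 16-bit view; bits 16-31 are left unchanged."""
    return (reg32 & 0xFFFF0000) | (value & 0xFFFF)


def read_low_byte(reg32):
    """AL-style view: bits 0-7."""
    return reg32 & 0xFF


def read_high_byte(reg32):
    """AH-style view: bits 8-15."""
    return (reg32 >> 8) & 0xFF


def write_low_byte(reg32, value):
    """Write bits 0-7 only."""
    return (reg32 & 0xFFFFFF00) | (value & 0xFF)


def write_high_byte(reg32, value):
    """Write bits 8-15 only."""
    return (reg32 & 0xFFFF00FF) | ((value & 0xFF) << 8)


if __name__ == "__main__":
    eax = 0x12345678
    print(hex(read_low16(eax)), hex(read_high_byte(eax)), hex(read_low_byte(eax)))
    print(hex(write_low16(eax, 0xABCD)))  # upper 16 bits survive the write
```

Writing through the 16-bit view leaves bits 16-31 untouched, which matches the behavior described for AX, BX, CX, DX, SI, DI, BP and SP.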


2.1.1.2 Instruction Pointer 


The instruction pointer, shown in Figure 2.1, is a 32- 
bit register named EIP. EIP holds the offset of the 
next instruction to be executed. The offset is always
relative to the base of the code segment (CS). The
lower 16 bits (bits 0-15) of the EIP contain the 16-bit
instruction pointer named IP, which is used for 16-bit 
addressing. 


2.1.1.3 Flags Register 


The flags register is a 32-bit register named
EFLAGS. The defined bits and bit fields within
EFLAGS control certain operations and indicate
status of the 486 microprocessor. The lower 16 bits
(bits 0-15) of EFLAGS contain the 16-bit register
named FLAGS, which is most useful when executing
8086 and 80286 code. EFLAGS is shown in Figure
2.2.


[Figure: EFLAGS bit layout. Upper 16 bits: alignment
check, virtual mode and resume flag. Lower 16 bits
(FLAGS): nested task flag, I/O privilege level,
overflow, direction flag, interrupt enable, trap flag,
sign flag, zero flag, auxiliary carry, parity flag and
carry flag.]


NOTE:
0 indicates Intel Reserved: do not define; see Section 2.1.6.


Figure 2.2. Flags Register


EFLAGS bits 1, 3, 5, 15 and 19-31 are “undefined”.
When these bits are stored during interrupt
processing or with a PUSHF instruction (push flags onto
stack), a one is stored in bit 1 and zeros in bits 3, 5,
15 and 19-31.
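
The treatment of the undefined bits in a stored flags image can be written as two masks. The sketch below models only the reserved bits named in this paragraph; the function and constant names are assumptions, and the handling of defined bits such as VM is deliberately left out.

```python
FORCED_ONE = 1 << 1                                          # bit 1 is stored as 1
FORCED_ZERO = (1 << 3) | (1 << 5) | (1 << 15) | 0xFFF80000   # bits 3, 5, 15, 19-31


def reserved_bits_as_stored(eflags):
    """Apply the stored values of the undefined bits to a 32-bit image."""
    return ((eflags | FORCED_ONE) & ~FORCED_ZERO) & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(reserved_bits_as_stored(0x00000000)))
    print(hex(reserved_bits_as_stored(0xFFFFFFFF)))
```

The mask `0xFFF80000` covers bits 19 through 31, so bit 18 (AC) and everything below it that is defined passes through unchanged.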


The EFLAGS register in the 486 microprocessor
contains a new bit not previously defined. The new
bit, AC, is defined in the upper 16 bits of the register
and it enables faults on accesses to misaligned
data.

AC (Alignment Check, bit 18)


The AC bit enables the generation of faults if a
memory reference is to a misaligned address.
Alignment faults are enabled when AC is set
to 1. A misaligned address is a word access
to an odd address, a dword access to an
address that is not on a dword boundary, or an
8-byte reference to an address that is not on a
64-bit word boundary. See Section 7.1.6 for
more information on operand alignment.


Alignment faults are only generated by
programs running at privilege level 3. The AC bit
setting is ignored at privilege levels 0, 1 and 2.
Note that references to the descriptor tables
(for selector loads), or the task state segment
(TSS), are implicitly level 0 references even if
the instructions causing the references are
executed at level 3. Alignment faults are
reported through interrupt 17, with an error code
of 0. Table 2.1 gives the alignment required
for the 486 microprocessor data types.


Table 2.1. Data Type Alignment Requirements

Data Type                       Alignment (Byte Boundary)
Word
Dword
Single Precision Real
Double Precision Real
Extended Precision Real
Selector
48-Bit Segmented Pointer
32-Bit Flat Pointer
32-Bit Segmented Pointer
48-Bit “Pseudo-Descriptor”
FSTENV/FLDENV Save Area         4/2 (On Operand Size)
FSAVE/FRSTOR Save Area          4/2 (On Operand Size)
Bit String


IMPLEMENTATION NOTE:

Several instructions on the 486 microprocessor 
generate misaligned references, even if their mem- 
ory address is aligned. For example, on the 486 mi- 
croprocessor, the SGDT/SIDT (store global/inter- 
rupt descriptor table) instruction reads/writes two 
bytes, and then reads/writes four bytes from a 
“pseudo-descriptor” at the given address. The 486
microprocessor will generate misaligned references 
unless the address is on a 2 mod 4 boundary. The 
FSAVE and FRSTOR instructions (floating point 
save and restore state) will generate misaligned 
references for one-half of the register save/restore 
cycles. The 486 microprocessor will not cause any 
AC faults if the effective address given in the in- 
struction has the proper alignment. 
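
The alignment rules and the SGDT/SIDT case in the note above can be checked with a small model. This Python sketch is an illustration under stated assumptions: the function names are hypothetical, `am` stands for the CR0 alignment mask bit described in Section 2.1.2.1, and only 1-, 2-, 4- and 8-byte references are modelled.

```python
def is_misaligned(address, size):
    """True if a reference of `size` bytes is not on a `size`-byte boundary."""
    if size not in (1, 2, 4, 8):
        raise ValueError("only 1-, 2-, 4- and 8-byte references are modelled")
    return address % size != 0


def alignment_fault(address, size, ac, am, cpl):
    """Alignment fault condition: AC and AM set, privilege level 3, misaligned."""
    return bool(ac and am and cpl == 3 and is_misaligned(address, size))


def pseudo_descriptor_refs_aligned(address):
    """SGDT/SIDT reference two bytes at `address`, then four bytes after them."""
    return not is_misaligned(address, 2) and not is_misaligned(address + 2, 4)


if __name__ == "__main__":
    for address in range(8):
        print(address, pseudo_descriptor_refs_aligned(address))
```

Both references of the pseudo-descriptor are aligned only when the address is 2 mod 4, which is the boundary the note refers to.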


VM (Virtual 8086 Mode, bit 17)


The VM bit provides Virtual 8086 Mode within
Protected Mode. If set while the 486
microprocessor is in Protected Mode, the 486
microprocessor will switch to Virtual 8086
operation, handling segment loads as the 8086
does, but generating exception 13 faults on
privileged opcodes. The VM bit can be set
only in Protected Mode, by the IRET
instruction (if current privilege level = 0) and by task
switches at any privilege level. The VM bit is
unaffected by POPF. PUSHF always pushes a
0 in this bit, even if executing in Virtual 8086
Mode. The EFLAGS image pushed during
interrupt processing or saved during task
switches will contain a 1 in this bit if the
interrupted code was executing as a Virtual 8086
Task.


RF (Resume Flag, bit 16) 


The RF flag is used in conjunction with the
debug register breakpoints. It is checked at
instruction boundaries before breakpoint
processing. When RF is set, it causes any debug
fault to be ignored on the next instruction. RF
is then automatically reset at the successful
completion of every instruction (no faults are
signalled) except the IRET instruction, the
POPF instruction, and JMP, CALL and INT
instructions causing a task switch. These
instructions set RF to the value specified by the
memory image. For example, at the end of the
breakpoint service routine, the IRET
instruction can pop an EFLAGS image having the RF
bit set and resume the program’s execution at
the breakpoint address without generating
another breakpoint fault on the same location.


NT (Nested Task, bit 14)


This flag applies to Protected Mode. NT is set
to indicate that the execution of this task is
nested within another task. If set, it indicates
that the current nested task’s Task State
Segment (TSS) has a valid back link to the
previous task’s TSS. This bit is set or reset by
control transfers to other tasks. The value of NT
in EFLAGS is tested by the IRET instruction to
determine whether to do an inter-task return
or an intra-task return. A POPF or an IRET
instruction will affect the setting of this bit
according to the image popped, at any privilege
level.


IOPL (Input/Output Privilege Level, bits 12-13)


This two-bit field applies to Protected Mode.
IOPL indicates the numerically maximum CPL
(current privilege level) value permitted to
execute I/O instructions without generating an
exception 13 fault or consulting the I/O
Permission Bitmap. It also indicates the maximum
CPL value allowing alteration of the IF (INTR
Enable Flag) bit when new values are popped
into the EFLAGS register. POPF and IRET
instructions can alter the IOPL field when
executed at CPL = 0. Task switches can always
alter the IOPL field, when the new flag image is
loaded from the incoming task’s TSS.


OF (Overflow Flag, bit 11)


OF is set if the operation resulted in a signed 
overflow. Signed overflow occurs when the 
operation resulted in carry/borrow into the 
sign bit (high-order bit) of the result but did not 
result in a carry/borrow out of the high-order 
bit, or vice-versa. For 8-, 16-, 32-bit opera- 
tions, OF is set according to overflow at bit 7, 
15, 31, respectively. 


DF (Direction Flag, bit 10)


DF defines whether ESI and/or EDI registers 
postdecrement or postincrement during the 
string instructions. Postincrement occurs if DF 
is reset. Postdecrement occurs if DF is set. 


IF (INTR Enable Flag, bit 9)


The IF flag, when set, allows recognition of
external interrupts signalled on the INTR pin.
When IF is reset, external interrupts signalled
on the INTR pin are not recognized. IOPL
indicates the maximum CPL value allowing
alteration of the IF bit when new values are popped
into EFLAGS or FLAGS.


TF (Trap Enable Flag, bit 8)


TF controls the generation of exception 1 trap
when single-stepping through code. When TF
is set, the 486 microprocessor generates an
exception 1 trap after the next instruction is
executed. When TF is reset, exception 1 traps
occur only as a function of the breakpoint
addresses loaded into debug registers DR0-DR3.


SF (Sign Flag, bit 7)


SF is set if the high-order bit of the result is
set; it is reset otherwise. For 8-, 16-, 32-bit
operations, SF reflects the state of bit 7, 15,
31 respectively.


ZF (Zero Flag, bit 6)


ZF is set if all bits of the result are 0.
Otherwise it is reset.


AF (Auxiliary Carry Flag, bit 4)


The Auxiliary Flag is used to simplify the
addition and subtraction of packed BCD
quantities. AF is set if the operation resulted in a
carry out of bit 3 (addition) or a borrow into bit
3 (subtraction). Otherwise AF is reset. AF is
affected by carry out of, or borrow into bit 3
only, regardless of overall operand length: 8,
16 or 32 bits.


PF (Parity Flag, bit 2)


PF is set if the low-order eight bits of the
operation contain an even number of “1’s” (even
parity). PF is reset if the low-order eight bits
have odd parity. PF is a function of only the
low-order eight bits, regardless of operand
size.


CF (Carry Flag, bit 0)


CF is set if the operation resulted in a carry
out of (addition), or a borrow into (subtraction)
the high-order bit. Otherwise CF is reset. For
8-, 16- or 32-bit operations, CF is set
according to carry/borrow at bit 7, 15 or 31,
respectively.


NOTE:
In these descriptions, “set” means “set to 1,” and
“reset” means “reset to 0.”


2.1.1.4 Segment Registers


Six 16-bit segment registers hold segment selector
values identifying the currently addressable memory
segments. In protected mode, each segment may
range in size from one byte up to the entire linear
and physical address space of the machine, 4
Gbytes (2^32 bytes). In real address mode, the
maximum segment size is fixed at 64 Kbytes (2^16 bytes).


The six addressable segments are defined by the
segment registers CS, SS, DS, ES, FS and GS. The
selector in CS indicates the current code segment;
the selector in SS indicates the current stack
segment; the selectors in DS, ES, FS and GS indicate
the current data segments.


2.1.1.5 Segment Descriptor Cache Registers


The segment descriptor cache registers are not
programmer visible, yet it is very useful to understand
their content. A programmer invisible descriptor
cache register is associated with each
programmer-visible segment register, as shown by Figure 2.3.
Each descriptor cache register holds a 32-bit base
address, a 32-bit segment limit, and the other
necessary segment attributes.


[Figure: each segment register (selector, bits 15-0) is
paired with a descriptor register, loaded automatically,
that holds the physical base address, the segment limit
and the segment attributes from the descriptor.]


Figure 2.3. i486™ Microprocessor Segment Registers and Associated Descriptor Cache Registers


When a selector value is loaded into a segment
register, the associated descriptor cache register is
automatically updated with the correct information. In
Real Address Mode, only the base address is updat- 
ed directly (by shifting the selector value four bits to 
the left), since the segment maximum limit and attri- 
butes are fixed in Real Mode. In Protected Mode, 
the base address, the limit, and the attributes are all 
updated per the contents of the segment descriptor 
indexed by the selector. 


Whenever a memory reference occurs, the segment 
descriptor cache register associated with the seg- 
ment being used is automatically involved with the 
memory reference. The 32-bit segment base ad- 
dress becomes a component of the linear address 
calculation, the 32-bit limit is used for the limit-check 
operation, and the attributes are checked against 
the type of memory reference requested. 
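
The Real Mode base update and the use of the cached base and limit in a memory reference can be sketched as follows. This Python model is a simplification under stated assumptions: the function names are hypothetical, the attribute check is omitted, and the Real Mode limit is taken as 0xFFFF to match the fixed 64 Kbyte segment size.

```python
REAL_MODE_LIMIT = 0xFFFF  # Real Mode segments are fixed at 64 Kbytes


def real_mode_base(selector):
    """Base address cached in Real Mode: the selector shifted left four bits."""
    return (selector & 0xFFFF) << 4


def linear_address(base, limit, offset):
    """Apply the limit check, then add the cached base to the offset."""
    if offset > limit:
        raise ValueError("offset is beyond the segment limit")
    return (base + offset) & 0xFFFFFFFF


if __name__ == "__main__":
    base = real_mode_base(0xF000)
    print(hex(linear_address(base, REAL_MODE_LIMIT, 0xFFF0)))
```

In Protected Mode the base and limit would instead come from the segment descriptor indexed by the selector; that lookup is outside the scope of this sketch.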


2.1.2 SYSTEM LEVEL REGISTERS 


The system level registers, Figure 2.4, control
operation of the on-chip cache, the on-chip floating point
unit (FPU) and the segmentation and paging
mechanisms. These registers are only accessible to
programs running at privilege level 0, the highest
privilege level.


The system level registers include three control
registers and four segmentation base registers. The
three control registers are CR0, CR2 and CR3. CR1
is reserved for future Intel processors. The four
segmentation base registers are the Global Descriptor
Table Register (GDTR), the Interrupt Descriptor
Table Register (IDTR), the Local Descriptor Table
Register (LDTR) and the Task State Segment Register
(TR).


2.1.2.1 Control Registers


Control Register 0 (CR0)


CR0, shown in Figure 2.5, contains 10 bits for
control and status purposes. Five of the bits defined in
the 486 microprocessor’s CR0 are newly defined.
The new bits are CD, NW, AM, WP and NE. The
function of the bits in CR0 can be categorized as
follows:


486 Microprocessor Operating Modes: PG, PE (Table 2.2)
On-Chip Cache Control Modes: CD, NW (Table 2.3)
On-Chip Floating Point Unit Control: TS, EM, MP, NE (Table 2.4)
Alignment Check Control: AM
Supervisor Write Protect: WP


[Figure: system level registers. Control registers,
including the page fault linear address register and
the page directory base register. System address
registers GDTR and IDTR: 32-bit linear base address
(bits 47-16) and limit (bits 15-0). System segment
registers TR and LDTR: selector (bits 15-0), each with
an automatically loaded descriptor register holding a
32-bit linear base address, a 20-bit segment limit and
attributes.]


Figure 2.4. System Level Registers


NOTE:
0 indicates Intel reserved: Do not define; see Section 2.1.6.


Figure 2.5. Control Register 0

Table 2.2. Processor Operating Modes

PG   PE   Mode
0    0    REAL Mode. Exact 8086 semantics,
          with 32-bit extensions available with
          prefixes.
0    1    Protected Mode. Exact 80286
          semantics, plus 32-bit extensions
          through both prefixes and “default”
          prefix setting associated with code
          segment descriptors. Also, a
          sub-mode is defined to support a virtual
          8086 within the context of the
          extended 80286 protection model.
1    0    UNDEFINED. Loading CR0 with this
          combination of PG and PE bits will
          raise a GP fault with error code 0.
1    1    Paged Protected Mode. All the
          facilities of Protected mode, with
          paging enabled underneath
          segmentation.


Table 2.3. On-Chip Cache Control Modes

CD   NW   Operating Mode
1    1    Cache fills disabled, write-through and
          invalidates disabled.
1    0    Cache fills disabled, write-through and
          invalidates enabled.
0    1    INVALID. If CR0 is loaded with this
          configuration of bits, a GP fault with
          error code 0 is raised.
0    0    Cache fills enabled, write-through and
          invalidates enabled.
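
The two mode tables can be expressed as small lookup functions. The sketch below is an illustration only: the function names and returned strings are assumptions, and the bit-to-row assignments follow the PG, PE, CD and NW bit descriptions given later in this section.

```python
def operating_mode(pg, pe):
    """Classify a PG/PE combination as in Table 2.2."""
    if pg and not pe:
        return "undefined (GP fault)"
    if pg:
        return "paged protected mode"
    return "protected mode" if pe else "real mode"


def cache_mode(cd, nw):
    """Classify a CD/NW combination as in Table 2.3."""
    if not cd and nw:
        return "invalid (GP fault)"
    fills = "disabled" if cd else "enabled"
    write_through = "disabled" if nw else "enabled"
    return f"fills {fills}, write-through and invalidates {write_through}"


if __name__ == "__main__":
    for pg in (0, 1):
        for pe in (0, 1):
            print("PG", pg, "PE", pe, "->", operating_mode(pg, pe))
    for cd in (1, 0):
        for nw in (1, 0):
            print("CD", cd, "NW", nw, "->", cache_mode(cd, nw))
```

Each table has exactly one combination that raises a GP fault: paging without protection, and cache fills enabled with write-through disabled.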


Table 2.4. On-Chip Floating Point Unit Control

     CR0 BIT          Instruction Type
EM   TS   MP     Floating Point   Wait
0    0    0      Execute          Execute
0    0    1      Execute          Execute
0    1    0      Trap 7           Execute
0    1    1      Trap 7           Trap 7
1    0    0      Trap 7           Execute
1    0    1      Trap 7           Execute
1    1    0      Trap 7           Execute
1    1    1      Trap 7           Trap 7
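
The eight rows of Table 2.4 reduce to two conditions: a floating point instruction traps when EM or TS is set, and a WAIT instruction traps when TS and MP are both set. The Python sketch below regenerates the table from those conditions; the function names are assumptions made for the example.

```python
def floating_point_traps(em, ts):
    """A floating point instruction causes trap 7 if EM=1 or TS=1."""
    return bool(em or ts)


def wait_traps(ts, mp):
    """A WAIT instruction causes trap 7 only if TS=1 and MP=1."""
    return bool(ts and mp)


def table_2_4():
    """Return rows of (EM, TS, MP, floating point action, WAIT action)."""
    rows = []
    for em in (0, 1):
        for ts in (0, 1):
            for mp in (0, 1):
                fp = "Trap 7" if floating_point_traps(em, ts) else "Execute"
                wait = "Trap 7" if wait_traps(ts, mp) else "Execute"
                rows.append((em, ts, mp, fp, wait))
    return rows


if __name__ == "__main__":
    for row in table_2_4():
        print(*row)
```

Note that EM has no effect on WAIT, and MP has no effect on floating point instructions.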




The low-order 16 bits of CR0 are also known as the
Machine Status Word (MSW), for compatibility with
the 80286 protected mode. LMSW and SMSW (load
and store MSW) instructions are taken as special
aliases of the load and store CR0 operations, where
only the low-order 16 bits of CR0 are involved. The
LMSW and SMSW instructions in the 486
microprocessor work in an identical fashion to the LMSW and
SMSW instructions in the 80286 (i.e., they only
operate on the low-order 16 bits of CR0 and ignore the
new bits). New 486 microprocessor operating
systems should use the MOV CR0, Reg instruction.


The defined CR0 bits are described below.


PG (Paging Enable, bit 31)


The PG bit is used to indicate whether paging is
enabled (PG=1) or disabled (PG=0). See
Table 2.2.


CD (Cache Disable, bit 30)


The CD bit is used to enable the on-chip cache.
When CD=1, the cache will not be filled on
cache misses. When CD=0, cache fills may be
performed on misses. See Table 2.3.


The state of the CD bit, the cache enable input 
pin (KEN#), and the relevant page cache dis- 
able (PCD) bit determine if a line read in re- 
sponse to a cache miss will be installed in the 
cache. A line is installed in the cache only if 
CD=0 and KEN# and PCD are both zero. The 
relevant PCD bit comes from either the page 
table entry, page directory entry or control reg- 
ister 3. Refer to Section 5.6 for more details on 
page cacheability. 

CD is set to one after RESET. 

NW (Not Write-Through, bit 29)


The NW bit enables on-chip cache
write-throughs and write-invalidate cycles (NW=0).
When NW=0, all writes, including cache hits,
are sent out to the pins. Invalidate cycles are
enabled when NW=0. During an invalidate
cycle a line will be removed from the cache if the
invalidate address hits in the cache. See Table
2.3.


When NW=1, write-throughs and
write-invalidate cycles are disabled. A write will not be sent
to the pins if the write hits in the cache. With
NW=1 the only write cycles that reach the
external bus are cache misses. Write hits with
NW=1 will never update main memory.
Invalidate cycles are ignored when NW=1.


AM (Alignment Mask, bit 18)


The AM bit controls whether the alignment
check (AC) bit in the flag register (EFLAGS) can
allow an alignment fault. AM=0 disables the
AC bit. AM=1 enables the AC bit. AM=0 is the
386 microprocessor compatible mode.


386 microprocessor software may load
incorrect data into the AC bit in the EFLAGS register.
Setting AM=0 will prevent AC faults from
occurring before the 486 microprocessor has
created the AC interrupt service routine.


WP (Write Protect, bit 16)


WP protects read-only pages from supervisor
write access. The 386 microprocessor allows a
read-only page to be written from privilege
levels 0-2. The 486 microprocessor is compatible
with the 386 microprocessor when WP=0.
WP=1 forces a fault on a write to a read-only
page from any privilege level. Operating
systems with Copy-on-Write features can be
supported with the WP bit. Refer to Section 4.5.3
for further details on use of the WP bit.


NE (Numerics Exception, bit 5)


The NE bit controls whether unmasked floating
point exceptions (UFPE) are handled through
interrupt vector 16 (NE=1) or through an
external interrupt (NE=0). NE=0 (default at reset)
supports the DOS operating system error
reporting scheme from the 8087, 80287 and 387
math coprocessor. In DOS systems, math
coprocessor errors are reported via external
interrupt vector 13. DOS uses interrupt vector 16 for
an operating system call. Refer to Sections
6.2.13 and 7.2.14 for more information on
floating point error reporting.


For any UFPE the floating point error output pin 
(FERR #) will be driven active. 


For NE=0, the 486 microprocessor works in 
conjunction with the ignore numeric error input 
(IGNNE #) and the FERR # output pins. When a 
UFPE occurs and the IGNNE # input is inactive, 
the 486 microprocessor freezes immediately 
before executing the next floating point instruc- 
tion. An external interrupt controller will supply 
an interrupt vector when FERR # is driven ac- 
tive. The UFPE is ignored if IGNNE# is active 
and floating point execution continues. 


NOTE: 


The freeze does not take place if the next in- 
struction is one of the control instructions 
FNCLEX, FNINIT, FNSAVE, FNSTENV, 
FNSTCW, FNSTSW, FNSTSW AX, FNENI, 
FNDISI and FNSETPM. The freeze does occur 
if the next instruction is WAIT. 


For NE=1, any UFPE will result in a software
interrupt 16, immediately before executing the
next non-control floating point or WAIT
instruction. The ignore numeric error input (IGNNE#)
signal will be ignored.


TS (Task Switched, bit 3)


The TS bit is set whenever a task switch
operation is performed. Execution of a floating point
instruction with TS=1 will cause a device not
available (DNA) fault (trap vector 7). If TS=1
and MP=1 (monitor coprocessor in CR0) a
WAIT instruction will cause a DNA fault. See
Table 2.4.


EM (Emulate Coprocessor, bit 2)


The EM bit determines whether floating point
instructions are trapped (EM=1) or executed. If
EM=1, all floating point instructions will cause
fault 7.


NOTE:
WAIT instructions are not affected by the state
of EM. See Table 2.4.


MP (Monitor Coprocessor, bit 1)


The MP bit is used in conjunction with the TS bit
to determine if WAIT instructions should trap. If
MP=1 and TS=1, WAIT instructions cause
fault 7. Refer to Table 2.4. The TS bit is set to 1
on task switches by the 486 microprocessor.
Floating point instructions are not affected by
the state of the MP bit. It is recommended that
the MP bit be set to one for the normal
operation of the 486 microprocessor.


PE (Protection Enable, bit 0)


The PE bit enables the segment based
protection mechanism. If PE=1 protection is enabled.
When PE=0 the 486 microprocessor operates
in REAL mode, with segment based protection
disabled, and addresses formed as in an 8086.
Refer to Table 2.2.


All new CR0 bits added to the 386 and 486
microprocessors, except for ET and NE, are upward
compatible with the 80286 because they are in register
bits not defined in the 80286. For strict compatibility
with the 80286, the load machine status word
(LMSW) instruction is defined to not change the ET
or NE bits.
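
The CR0 bit positions described above can be collected into a small decoder. The following Python sketch is a model for illustration, not a description of any software interface: the names `CR0_BITS`, `decode_cr0`, `machine_status_word` and `line_is_installed` are assumptions, and `ken_pin` represents the level on the active-LOW KEN# pin (0 means asserted).

```python
# Bit positions taken from the CR0 bit descriptions in this section.
CR0_BITS = {
    "PG": 31, "CD": 30, "NW": 29, "AM": 18, "WP": 16,
    "NE": 5, "TS": 3, "EM": 2, "MP": 1, "PE": 0,
}


def decode_cr0(cr0):
    """Return a dictionary mapping each defined CR0 bit name to 0 or 1."""
    return {name: (cr0 >> bit) & 1 for name, bit in CR0_BITS.items()}


def machine_status_word(cr0):
    """The low-order 16 bits of CR0, also known as the MSW."""
    return cr0 & 0xFFFF


def line_is_installed(cr0, ken_pin, pcd):
    """A line read on a miss is cached only if CD=0, KEN#=0 and PCD=0."""
    return decode_cr0(cr0)["CD"] == 0 and ken_pin == 0 and pcd == 0


if __name__ == "__main__":
    cr0 = (1 << 31) | (1 << 5) | (1 << 0)   # PG, NE and PE set
    print(decode_cr0(cr0))
    print(hex(machine_status_word(cr0)))
    print(line_is_installed(cr0, ken_pin=0, pcd=0))
```

Keeping the bit positions in one table makes it straightforward to check a CR0 image against Tables 2.2 through 2.4.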


Control Register 1 (CR1) 


CR1 is reserved for use in future Intel microproces- 
sors. 


Control Register 2 (CR2) 


CR2, shown in Figure 2.6, holds the 32-bit linear
address that caused the last page fault detected. The
error code pushed onto the page fault handler’s
stack when it is invoked provides additional status
information on this page fault.


[Figure: CR2, the page fault linear address register,
and CR3, the page directory base register.]


NOTE:
0 indicates Intel reserved: Do not define; see Section 2.1.6.


Figure 2.6. Control Registers 2 and 3


Control Register 3 (CR3) 


CR3, shown in Figure 2.6, contains the physical
base address of the page directory table. The 486
microprocessor page directory is always page
aligned (4 Kbyte-aligned). This alignment is enforced
by only storing bits 12-31 in CR3.


In the 486 microprocessor CR3 contains two new 
bits, page write-through (PWT) (bit 3) and page 
cache disable (PCD) (bit 4). The page table entry 
(PTE) and page directory entry (PDE) also contain 
PWT and PCD bits. PWT and PCD control page 
cacheability. When a page is accessed in external 
memory, the state of PWT and PCD are driven out 
on the PWT and PCD pins. The source of PWT and 
PCD can be CR3, the PTE or the PDE. PWT and 
PCD are sourced from CR3 when the PDE is being 
updated. When paging is disabled (PG = 0 in CR0), 
PCD and PWT are assumed to be 0, regardless of 
their state in CR3. 
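
As a rough illustration of the CR3 layout described above, the following Python sketch splits a 32-bit CR3 image into the page directory base and the PWT and PCD bits. The function name and the sample value are hypothetical. Bit positions 3 and 4 come from the text; the base is assumed to occupy bits 12-31, which is what 4 Kbyte alignment implies.

```python
# Hypothetical helper: decode a 32-bit CR3 image.
PWT_BIT = 1 << 3          # page write-through (bit 3)
PCD_BIT = 1 << 4          # page cache disable (bit 4)
PDBR_MASK = 0xFFFFF000    # assumption: base held in bits 12-31 (4 Kbyte aligned)


def decode_cr3(cr3, paging_enabled=True):
    """Return (page_directory_base, pwt, pcd) for a CR3 value."""
    base = cr3 & PDBR_MASK
    if not paging_enabled:
        # With PG = 0 in CR0, PCD and PWT are treated as 0 regardless of CR3.
        return base, 0, 0
    pwt = 1 if cr3 & PWT_BIT else 0
    pcd = 1 if cr3 & PCD_BIT else 0
    return base, pwt, pcd


base, pwt, pcd = decode_cr3(0x00123018)
print(hex(base), pwt, pcd)
```

The low 12 bits never contribute to the base, so a page directory address read this way is always page aligned.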


A task switch through a task state segment (TSS) 
which changes the values in CR3, or an explicit load 
into CR3 with any value, will invalidate all cached 
page table entries in the translation lookaside buffer 
(TLB). 


The page directory base address in CR3 is a physi- 
cal address. The page directory can be paged out 
while its associated task is suspended, but the oper- 
ating system must ensure that the page directory is 
resident in physical memory before the task is dis- 
patched. The entry in the TSS for CR3 has a physi- 
cal address, with no provision for a present bit. This 
means that the page directory for a task must be 
resident in physical memory. The CR3 image in a 
TSS must point to this area before the task can be 
dispatched through its TSS. 


2.1.2.2 System Address Registers 


Four special registers are defined to reference the 
tables or segments supported by the 80286, 386 
and 486 microprocessor protection model. These ta- 
bles or segments are: 


GDT (Global Descriptor Table) 
IDT (Interrupt Descriptor Table) 
LDT (Local Descriptor Table) 
TSS (Task State Segment) 


The addresses of these tables and segments are 
stored in special registers, the System Address and 
System Segment Registers, illustrated in Figure 2.4. 
These registers are named GDTR, IDTR, LDTR and 
TR respectively. Section 4, Protected Mode Archi- 
tecture, describes the use of these registers. 


System Address Registers: GDTR and IDTR 


The GDTR and IDTR hold the 32-bit linear base ad- 
dress and 16-bit limit of the GDT and IDT, respec- 
tively. 


Since the GDT and IDT segments are global to all 
tasks in the system, the GDT and IDT are defined by 
32-bit linear addresses (subject to page translation if 
paging is enabled) and 16-bit limit values. 
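
The base-plus-limit pair can be pictured as a six-byte image. The Python sketch below packs and unpacks such an image; the helper names and sample values are hypothetical, and the limit-first, little-endian ordering is an assumption about how the pair is laid out in memory rather than something stated in this section.

```python
import struct


def pack_table_register(base, limit):
    """Pack a 16-bit limit and a 32-bit linear base (assumed: limit first)."""
    if not 0 <= limit <= 0xFFFF:
        raise ValueError("limit must fit in 16 bits")
    if not 0 <= base <= 0xFFFFFFFF:
        raise ValueError("base must fit in 32 bits")
    # "<" selects little-endian byte order with no padding: 2 + 4 = 6 bytes.
    return struct.pack("<HI", limit, base)


def unpack_table_register(image):
    """Return (base, limit) from a six-byte image."""
    limit, base = struct.unpack("<HI", image)
    return base, limit


image = pack_table_register(base=0x00010000, limit=0x07FF)
print(len(image), image.hex())
```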


System Segment Registers: LDTR and TR 


The LDTR and TR hold the 16-bit selector for the 
LDT descriptor and the TSS descriptor, respectively. 


Since the LDT and TSS segments are task specific 
segments, the LDT and TSS are defined by selector 
values stored in the system segment registers. 


NOTE: 
A programmer-invisible segment descriptor register 
is associated with each system segment register. 


2.1.3 FLOATING POINT REGISTERS 


Figure 2.7 shows the floating point register set. The 
on-chip FPU contains eight data registers, a tag 
word, a control register, a status register, an instruc- 
tion pointer and a data pointer. 


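R0 through R7: eight 80-bit data registers (bit 79: sign; bits 78-64: exponent; bits 63-0: significand) 
Instruction Pointer, Data Pointer: 48 bits each (bits 47-0) 
Control Register, Status Register, Tag Word: 16 bits each (bits 15-0) 


Figure 2.7. Floating Point Registers 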


The operation of the 486 microprocessor’s on-chip 
floating point unit is exactly the same as the 387 
math coprocessor. Software written for the 387 
math coprocessor will run on the on-chip floating 
point unit (FPU) without any modifications. 


2.1.3.1 Data Registers 
Floating point computations use the 486 microproc- 
essor’s FPU data registers. These eight 80-bit regis- 
ters provide the equivalent capacity of twenty 32-bit 
registers. Each of the eight data registers is divided 
into “fields” corresponding to the FPU’s extended- 
precision data type. 


The FPU’s register set can be accessed either as a 
stack, with instructions operating on the top one or 
two stack elements, or as a fixed register set, with 
instructions operating on explicitly designated regis- 
ters. The TOP field in the status word identifies the 
current top-of-stack register. A “push” operation 
decrements TOP by one and loads a value into the 
new top register. A “pop” operation stores the value 
from the current top register and then increments 
TOP by one. Like other 486 microprocessor stacks 
in memory, the FPU register stack grows “down” 
toward lower-addressed registers. 


Instructions may address the data registers either 
implicitly or explicitly. Many instructions operate on 
the register at the TOP of the stack. These instruc- 
tions implicitly address the register at which TOP 
points. Other instructions allow the programmer to 
explicitly specify which register to use. This explicit 
register addressing is also relative to TOP. 
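
The stack behavior described above can be modeled in a few lines. This Python sketch is a toy model only: the class and method names are hypothetical, TOP is assumed to start at 0 and to wrap modulo 8, and stack overflow and underflow detection are left out.

```python
class FpuStack:
    """Toy model of the eight-register FPU stack addressed relative to TOP."""

    def __init__(self):
        self.regs = [None] * 8   # physical registers R0..R7 (None = empty)
        self.top = 0             # assumed initial value of the TOP field

    def push(self, value):
        # A push decrements TOP (wrapping modulo 8) and loads the new top.
        self.top = (self.top - 1) & 7
        self.regs[self.top] = value

    def pop(self):
        # A pop reads the current top register, then increments TOP.
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) & 7
        return value

    def st(self, i):
        # Explicit addressing is relative to TOP: ST(i) is i places from TOP.
        return self.regs[(self.top + i) & 7]


fpu = FpuStack()
fpu.push(1.5)
fpu.push(2.5)
print(fpu.top, fpu.st(0), fpu.st(1))
```

Because TOP moves while the values stay put, ST(1) after the second push names the same physical register that was ST(0) before it.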


2.1.3.2 Tag Word 


The tag word marks the content of each numeric 
data register, as shown in Figure 2.8. Each two-bit 
tag represents one of the eight data registers. The 
principal function of the tag word is to optimize the 
FPU’s performance and stack handling by making it 
possible to distinguish between empty and nonemp- 
ty register locations. It also enables exception han- 
dlers to check the contents of a stack location with- 
out the need to perform complex decoding of the 
actual data. 
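
A minimal Python sketch of reading the tag word follows. It assumes that the tag for physical register i sits in bits 2i+1..2i and that the encodings are 00 = valid, 01 = zero, 10 = special (NaN, infinity, denormal or unsupported format) and 11 = empty; the function names and the label "Special" are hypothetical.

```python
TAG_NAMES = {
    0b00: "Valid",
    0b01: "Zero",
    0b10: "Special",   # NaN, infinity, denormal or unsupported format
    0b11: "Empty",
}


def decode_tag_word(tag_word):
    """Return the tag name for each physical register R0..R7."""
    return [TAG_NAMES[(tag_word >> (2 * i)) & 0b11] for i in range(8)]


def tag_of_st(tag_word, top, i=0):
    """Tag of ST(i): tags are indexed by physical register, not by TOP."""
    return decode_tag_word(tag_word)[(top + i) & 7]


print(decode_tag_word(0xFFFF))
print(tag_of_st(0x3FFF, top=7))
```

An exception handler can use a lookup of this kind to tell an empty stack slot from a populated one without decoding the 80-bit register contents.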


2.1.3.3 Status Word 
The 16-bit status word reflects the overall state of 
the FPU. The status word is shown in Figure 2.9 and 
is located in the status register. 


Bits 15-0 of the tag word hold TAG(7) through TAG(0), two bits per register. 


NOTE: 
The index i of tag(i) is not top-relative. A program typically uses the “top” field of the Status Word to determine which tag(i) field refers to the logical top of stack. 


TAG VALUES: 
00 = Valid 
01 = Zero 
10 = QNaN, SNaN, Infinity, Denormal and Unsupported Formats 
11 = Empty 


Figure 2.8. FPU Tag Word 


Bit 15: B (BUSY) 
Bit 14: C3 (CONDITION CODE) 
Bits 13-11: TOP (TOP OF STACK POINTER) 
Bits 10-8: C2, C1, C0 (CONDITION CODE) 
Bit 7: ES (ERROR SUMMARY STATUS) 
Bit 6: SF (STACK FLAG) 
Bits 5-0: EXCEPTION FLAGS: PRECISION (bit 5), UNDERFLOW (bit 4), OVERFLOW (bit 3), ZERO DIVIDE (bit 2), DENORMALIZED OPERAND (bit 1), INVALID OPERATION (bit 0) 


ES is set if any unmasked exception bit is set; cleared otherwise. 
See Table 2.5 for interpretation of condition code. 
TOP values: 
000 = Register 0 is Top of Stack 
001 = Register 1 is Top of Stack 
... 
111 = Register 7 is Top of Stack 


For definitions of exceptions, refer to the Section entitled 
“Exception Handling”. 


Figure 2.9. FPU Status Word 


The B bit (Busy, bit 15) is included for 8087 compati- 
bility. The B bit reflects the contents of the ES bit (bit 
7 of the status word). 


Bits 13-11 (TOP) point to the FPU register that is 
the current top-of-stack. 


The four numeric condition code bits, C0-C3, are 
similar to the flags in EFLAGS. Instructions that per- 
form arithmetic operations update C0-C3 to reflect 
the outcome. The effects of these instructions on 
the condition codes are summarized in Tables 2.5 
through 2.8. 


Table 2.5. FPU Condition Code Interpretation 


FPREM, FPREM1 (see Table 2.6): 
    Three least significant bits of quotient (Q2, Q0, and Q1 or O/U#); 
    Reduction (0 = complete, 1 = incomplete) 

FCOM, FCOMP, FCOMPP, FTST, FUCOM, FUCOMP, FUCOMPP, FICOM, FICOMP: 
    Result of comparison (see Table 2.7); Zero or O/U#; 
    Operand is not comparable (Table 2.7) 

FXAM: 
    Operand class (see Table 2.8); Sign or O/U#; Operand class (Table 2.8) 

FCHS, FABS, FXCH, FINCTOP, FDECTOP, Constant loads, FXTRACT, FLD, FILD, FBLD, FSTP (ext real): 
    UNDEFINED; Zero or O/U#; UNDEFINED 

FIST, FBSTP, FRNDINT, FST, FSTP, FADD, FMUL, FDIV, FDIVR, FSUB, FSUBR, FSCALE, FSQRT, FPATAN, F2XM1, FYL2X, FYL2XP1: 
    UNDEFINED; Roundup or O/U#; UNDEFINED 

FPTAN, FSIN, FCOS, FSINCOS: 
    UNDEFINED; Roundup or O/U#, undefined if C2 = 1; 
    Reduction (0 = complete, 1 = incomplete) 

FLDENV, FRSTOR: Each bit loaded from memory 

FINIT: Clears these bits 

FLDCW, FSTENV, FSTCW, FSTSW, FCLEX, FSAVE: UNDEFINED 


O/U#       When both IE and SF bits of status word are set, indicating a stack exception, this bit 
           distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0). 

Reduction  If FPREM or FPREM1 produces a remainder that is less than the modulus, reduction is 
           complete. When reduction is incomplete the value at the top of the stack is a partial 
           remainder, which can be used as input to further reduction. For FPTAN, FSIN, FCOS, and 
           FSINCOS, the reduction bit is set if the operand at the top of the stack is too large. In this 
           case the original operand remains at the top of the stack. 

Roundup    When the PE bit of the status word is set, this bit indicates whether the last rounding in the 
           instruction was upward. 

UNDEFINED  Do not rely on finding any specific value in these bits. 


Table 2.6. Condition Code Interpretation after FPREM and FPREM1 Instructions 


C2 = 1   Incomplete Reduction: further iteration required for complete reduction 
C2 = 0   Complete Reduction: C0, C3, C1 contain three least significant bits of quotient 


Table 2.7. Condition Code Resulting from Comparison 


Order            C3  C2  C0 
TOP > Operand    0   0   0 
TOP < Operand    0   0   1 
TOP = Operand    1   0   0 
Unordered        1   1   1 


Table 2.8. Condition Code Defining Operand Class 


C3  C2  C1  C0   Value at TOP 
0   0   0   0    + Unsupported 
0   0   0   1    + NaN 
0   0   1   0    - Unsupported 
0   0   1   1    - NaN 
0   1   0   0    + Normal 
0   1   0   1    + Infinity 
0   1   1   0    - Normal 
0   1   1   1    - Infinity 
1   0   0   0    + 0 
1   0   0   1    + Empty 
1   0   1   0    - 0 
1   0   1   1    - Empty 
1   1   0   0    + Denormal 
1   1   1   0    - Denormal 


Bit 7 is the error summary (ES) status bit. The ES bit 
is set if any unmasked exception bit (bits 0-5 in the 
status word) is set; ES is clear otherwise. The 
FERR# (floating point error) signal is asserted when 
ES is set. 


Bit 6 is the stack flag (SF). This bit is used to distin- 
guish invalid operations due to stack overflow or un- 
derflow. When SF is set, bit 9 (C1) distinguishes be- 
tween stack overflow (C1=1) and underflow 
(C1=0). 


Table 2.9 shows the six exception flags in bits 0-5 
of the status word. Bits 0-5 are set to indicate that 
the FPU has detected an exception while executing 
an instruction. 


The six exception flags in the status word can be 
individually masked by mask bits in the FPU control 
word. Table 2.9 lists the exception conditions, and 
their causes in order of precedence. Table 2.9 also 
shows the action taken by the FPU if the corre- 
sponding exception flag is masked. 


An exception that is not masked by the control word 
will cause three things to happen: the corresponding 
exception flag in the status word will be set, the ES 
bit in the status word will be set and the FERR# 
output signal will be asserted. When the 486 micro- 
processor attempts to execute another floating point 
or WAIT instruction, exception 16 occurs if NE=1 in 
control register 0; if NE=0, an external interrupt 
happens instead. The exception condition must be resolved via an 
interrupt service routine. The FPU saves the address 
of the floating point instruction that caused the ex- 
ception and the address of any memory operand re- 
quired by that instruction in the instruction and data 
pointers (see Section 2.1.3.4). 


Note that when a new value is loaded into the status 
word by the FLDENV (load environment) or 
FRSTOR (restore state) instruction, the value of ES 
(bit 7) and its reflection in the B bit (bit 15) are not 
derived from the values loaded from memory. The 
values of ES and B are dependent upon the values 
of the exception flags in the status word and their 
corresponding masks in the control word. If ES is set 
in such a case, the FERR# output of the 486 micro- 
processor is activated immediately. 
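
The field layout of the status word, and the rule that ES depends on the exception flags and their masks, can be sketched as follows. This is an illustrative Python sketch, not a definitive decoder: the function names are hypothetical, the two-letter flag names follow common x87 usage, and the bit positions for C0, C2 and C3 (bits 8, 10 and 14) are assumptions taken from the usual status word layout.

```python
EXCEPTION_NAMES = ["IE", "DE", "ZE", "OE", "UE", "PE"]  # status word bits 0-5


def decode_status_word(sw):
    """Split a 16-bit FPU status word into its fields."""
    return {
        "flags": [n for i, n in enumerate(EXCEPTION_NAMES) if sw & (1 << i)],
        "SF": (sw >> 6) & 1,
        "ES": (sw >> 7) & 1,
        "C0": (sw >> 8) & 1,
        "C1": (sw >> 9) & 1,
        "C2": (sw >> 10) & 1,
        "TOP": (sw >> 11) & 7,
        "C3": (sw >> 14) & 1,
        "B": (sw >> 15) & 1,
    }


def error_summary(status_word, control_word):
    """ES is 1 when any exception flag is set whose mask bit is clear."""
    unmasked = status_word & ~control_word & 0x3F
    return 1 if unmasked else 0


def stack_fault_kind(sw):
    """Classify a stack fault using IE, SF and C1; None if not a stack fault."""
    if not (sw & 0x0001 and sw & 0x0040):
        return None
    return "overflow" if sw & 0x0200 else "underflow"


print(decode_status_word(0x3800)["TOP"])
print(error_summary(0x0004, 0x037B), stack_fault_kind(0x0241))
```

The `error_summary` helper mirrors the note above: after FLDENV or FRSTOR the ES value follows from the loaded flags and masks, not from the ES bit in the memory image.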


2.1.3.4 Instruction and Data Pointers 


Because the FPU operates in parallel with the ALU 
(in the 486 microprocessor the arithmetic and logic 
unit (ALU) consists of the base architecture regis- 
ters), any errors detected by the FPU may be report- 
ed after the ALU has executed the floating point in- 
struction that caused it. To allow identification of the 
failing numeric instruction, the 486 microprocessor 
contains two pointer registers that supply the ad- 
dress of the failing numeric instruction and the ad- 
dress of its numeric memory operand (if appropri- 
ate). 


Table 2.9. FPU Exceptions 


Exception: Invalid Operation 
Cause: Operation on a signaling NaN, unsupported format, indeterminate form (0*∞, 0/0, (+∞) + (-∞), etc.), or stack overflow/underflow (SF is also set). 
Default Action (if exception is masked): Result is a quiet NaN, integer indefinite, or BCD indefinite 

Exception: Denormalized Operand 
Cause: At least one of the operands is denormalized, i.e., it has the smallest exponent but a nonzero significand. 
Default Action (if exception is masked): Normal processing continues 

Exception: Zero Divisor 
Cause: The divisor is zero while the dividend is a noninfinite, nonzero number. 
Default Action (if exception is masked): Result is ∞ 

Exception: Overflow 
Cause: The result is too large in magnitude to fit in the specified format. 
Default Action (if exception is masked): Result is largest finite value or ∞ 

Exception: Underflow 
Cause: The true result is nonzero but too small to be represented in the specified format, and, if underflow exception is masked, denormalization causes loss of accuracy. 
Default Action (if exception is masked): Result is denormalized or zero 

Exception: Inexact Result (Precision) 
Cause: The true result is not exactly representable in the specified format (e.g., 1/3); the result is rounded according to the rounding mode. 
Default Action (if exception is masked): Normal processing continues 


The instruction and data pointers are provided for 
user-written error handlers. These registers are ac- 
cessed by the FLDENV (load environment), 
FSTENV (store environment), FSAVE (save state) 
and FRSTOR (restore state) instructions. Whenever 
the 486 microprocessor decodes a new floating 
point instruction, it saves the instruction address (including 
any prefixes that may be present), the address of 
the operand (if present) and the opcode. 


The instruction and data pointers appear in one of 
four formats depending on the operating mode of 
the 486 microprocessor (protected mode or real-ad- 
dress mode) and depending on the operand-size at- 
tribute in effect (32-bit operand or 16-bit operand). 
When the 486 microprocessor is in the virtual-86 
mode, the real address mode formats are used. The 
four formats are shown in Figures 2.10-2.13. The 
floating point instructions FLDENV, FSTENV, 
FSAVE and FRSTOR are used to transfer these val- 
ues to and from memory. Note that the value of the 
data pointer is undefined if the prior floating point 
instruction did not have a memory operand. 


NOTE: 
The operand size attribute is the D bit in a segment 
descriptor. 


32-BIT PROTECTED MODE FORMAT 


Bits 31-16                  Bits 15-0 
RESERVED                    CONTROL WORD 
RESERVED                    STATUS WORD 
RESERVED                    TAG WORD 
IP OFFSET (bits 31-0) 
00000, OPCODE 10..0         CS SELECTOR 
DATA OPERAND OFFSET (bits 31-0) 
RESERVED                    OPERAND SELECTOR 


Figure 2.10. Protected Mode FPU Instruction and Data Pointer Image in Memory, 32-Bit Format 


32-BIT REAL-ADDRESS MODE FORMAT 


Bits 31-16                  Bits 15-0 
RESERVED                    CONTROL WORD 
RESERVED                    STATUS WORD 
RESERVED                    TAG WORD 
RESERVED                    INSTRUCTION POINTER 15..0 
0000, INSTRUCTION POINTER 31..16, 0, OPCODE 10..0 
RESERVED                    OPERAND POINTER 15..0 
0000, OPERAND POINTER 31..16, 0000 00000000 


Figure 2.11. Real Mode FPU Instruction and Data Pointer Image in Memory, 32-Bit Format 


16-BIT PROTECTED MODE FORMAT (bits 15-0) 


CONTROL WORD 
STATUS WORD 
TAG WORD 
IP OFFSET 
CS SELECTOR 
OPERAND OFFSET 
OPERAND SELECTOR 


Figure 2.12. Protected Mode FPU Instruction and Data Pointer Image in Memory, 16-Bit Format 


16-BIT REAL-ADDRESS MODE AND VIRTUAL-8086 MODE FORMAT (bits 15-0) 


CONTROL WORD 
STATUS WORD 
TAG WORD 
INSTRUCTION POINTER 15..0 
IP 19..16, 0, OPCODE 10..0 
OPERAND POINTER 15..0 
OP 19..16, 0 000 0000 0000 


Figure 2.13. Real Mode FPU Instruction and Data Pointer Image in Memory, 16-Bit Format 


2.1.3.5 FPU Control Word 


The FPU provides several processing options that are selected by loading a control word from memory into 
the control register. Figure 2.14 shows the format and encoding of fields in the control word. 


Bits 15-13: RESERVED 
Bit 12: RESERVED* 
Bits 11-10: ROUNDING CONTROL 
Bits 9-8: PRECISION CONTROL 
Bits 7-6: RESERVED 
Bits 5-0: EXCEPTION MASKS: PRECISION (bit 5), UNDERFLOW (bit 4), OVERFLOW (bit 3), ZERO DIVIDE (bit 2), DENORMALIZED OPERAND (bit 1), INVALID OPERATION (bit 0) 

* "0" AFTER RESET OR FINIT; CHANGEABLE UPON LOADING THE CONTROL WORD (CW). PROGRAMS MUST IGNORE THIS BIT. 


Precision Control 
00 = 24 bits (single precision) 
01 = (reserved) 
10 = 53 bits (double precision) 
11 = 64 bits (extended precision) 

Rounding Control 
00 = Round to nearest or even 
01 = Round down (toward -∞) 
10 = Round up (toward +∞) 
11 = Chop (truncate toward zero) 


Figure 2.14. FPU Control Word 


The low-order byte of the FPU control word config- 
ures the FPU error and exception masking. Bits 0-5 
of the control word contain individual masks for each 
of the six exceptions that the FPU recognizes. 


The high-order byte of the control word configures 
the FPU operating mode, including precision and 
rounding. 


RC (Rounding Control, bits 10-11) 


The RC bits provide for directed rounding and 
true chop, as well as the unbiased round to 
nearest even mode specified in the IEEE stan- 
dard. Rounding control affects only those in- 
structions that perform rounding at the end of 
the operation (and thus can generate a preci- 
sion exception); namely, FST, FSTP, FIST, all 
arithmetic instructions (except FPREM, 
FPREM1, FXTRACT, FABS and FCHS), and all 
transcendental instructions. 


PC (Precision Control, bits 8-9) 


The PC bits can be used to set the FPU internal 
operating precision of the significand at less 
than the default of 64 bits (extended precision). 
This can be useful in providing compatibility with 
early generation arithmetic processors of small- 
er precision. PC affects only the instructions 
ADD, SUB, DIV, MUL, and SQRT. For all other 
instructions, either the precision is determined 
by the opcode or extended precision is used. 
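
The RC, PC and mask fields can be read and changed with simple shifts and masks. The Python sketch below is illustrative only: the function names are hypothetical, and the sample value 037FH (all six exceptions masked, 64-bit precision, round to nearest) is an assumed starting point.

```python
ROUNDING = {0b00: "nearest-even", 0b01: "down", 0b10: "up", 0b11: "chop"}
PRECISION = {0b00: 24, 0b01: None, 0b10: 53, 0b11: 64}  # None = reserved


def decode_control_word(cw):
    """Return the exception masks, precision and rounding mode of a control word."""
    return {
        "masks": cw & 0x3F,                          # bits 0-5
        "precision_bits": PRECISION[(cw >> 8) & 0b11],   # PC, bits 8-9
        "rounding": ROUNDING[(cw >> 10) & 0b11],         # RC, bits 10-11
    }


def with_rounding(cw, rc):
    """Return cw with the RC field replaced; all other bits are preserved."""
    return (cw & ~(0b11 << 10) & 0xFFFF) | ((rc & 0b11) << 10)


cw = 0x037F  # assumed sample value
print(decode_control_word(cw))
print(hex(with_rounding(cw, 0b11)))
```

Changing only the RC field and writing the other bits back unchanged keeps the reserved bits as they were read.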


2.1.4 DEBUG AND TEST REGISTERS 


2.1.4.1 Debug Registers 


The six programmer accessible debug registers, Fig- 
ure 2.15, provide on-chip support for debugging. De- 
bug registers DR0-3 specify the four linear break- 
points. The Debug Control Register, DR7, is used to 
set the breakpoints and the Debug Status Register, 
DR6, displays the current state of the breakpoints. 
The use of the Debug registers is described in Sec- 
tion 9. 


Debug Registers: LINEAR BREAKPOINT ADDRESS 0 through LINEAR BREAKPOINT ADDRESS 3 (DR0-DR3), debug status (DR6), debug control (DR7) 
Test Registers: cache test registers, including CACHE TEST STATUS (TR3-TR5); TLB test registers (TR6, TR7) 

TLB = Translation Lookaside Buffer 


Figure 2.15 


2.1.4.2 Test Registers 


The 486 microprocessor contains five test registers. 
The test registers are shown in Figure 2.15. TR6 and 
TR7 are used to control the testing of the translation 
lookaside buffer. TR3, TR4 and TR5 are used for 
testing the on-chip cache. The use of the test regis- 
ters is discussed in Section 8. 


2.1.5 REGISTER ACCESSIBILITY 


There are a few differences regarding the accessibil- 
ity of the registers in Real and Protected Mode. Ta- 
ble 2.10 summarizes these differences. See Section 
4, Protected Mode Architecture, for further details. 


Table 2.10. Register Usage 


Columns: Register; Use in Real Mode (Load, Store); Use in Protected Mode (Load, Store); Use in Virtual 8086 Mode (Load, Store) 

Rows legible in this copy: FPU Data Pointer, Debug Registers, Test Registers. Entries take the values Yes, No, PL = 0 or IOPL*. 


NOTES: 
PL = 0: The registers can be accessed only when the current privilege level is zero. 
*IOPL: The PUSHF and POPF instructions are made I/O Privilege Level sensitive in Virtual 86 Mode. 


2.1.6 COMPATIBILITY 


VERY IMPORTANT NOTE: 
COMPATIBILITY WITH FUTURE PROCESSORS 


In the preceding register descriptions, note cer- 
tain 486 Microprocessor register bits are Intel re- 
served. When reserved bits are called out, treat 
them as fully undefined. This is essential for 
your software compatibility with future proces- 
sors! Follow the guidelines below: 


1) Do not depend on the states of any unde- 
fined bits when testing the values of defined 
register bits. Mask them out when testing. 


2) Do not depend on the states of any unde- 
fined bits when storing them to memory or 
another register. 


3) Do not depend on the ability to retain infor- 
mation written into any undefined bits. 


4) When loading registers always load the unde- 
fined bits as zeros. 


5) However, registers which have been previ- 
ously stored may be reloaded without mask- 
ing. 


Depending upon the values of undefined regis- 
ter bits will make your software dependent upon 
the unspecified 486 Microprocessor handling of 
these bits. Depending on undefined values risks 
making your software incompatible with future 
processors that define usages for the 486 Micro- 
processor-undefined bits. AVOID ANY SOFT- 
WARE DEPENDENCE UPON THE STATE OF UN- 
DEFINED 486 MICROPROCESSOR REGISTER 
BITS. 
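
The guidelines above amount to masking on every read and writing zeros on every load. The Python sketch below illustrates this for an imaginary register; the mask value and function names are hypothetical and do not describe any particular 486 register.

```python
DEFINED_MASK = 0x0000003F   # hypothetical register: only bits 0-5 are defined


def bit_is_set(register_value, bit):
    """Test a defined bit after masking out the undefined ones (guideline 1)."""
    if not DEFINED_MASK & (1 << bit):
        raise ValueError("bit is not defined; its state must not be relied on")
    return bool(register_value & DEFINED_MASK & (1 << bit))


def value_to_load(new_defined_bits):
    """Build a value for loading: undefined bits are written as zeros (guideline 4)."""
    return new_defined_bits & DEFINED_MASK


# Undefined bits read back as ones here, but the result does not depend on them.
print(bit_is_set(0xFFFFFFC1, 0), hex(value_to_load(0xFFFF)))
```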




2.2 Instruction Set 


The 486 microprocessor instruction set can be divid- 
ed into 11 categories of operations: 


Data Transfer 
Arithmetic 
Shift/Rotate 
String Manipulation 
Bit Manipulation 
Control Transfer 
High Level Language Support 
Operating System Support 
Processor Control 
Floating Point 
Floating Point Control 


The 486 microprocessor instructions are listed in 
Section 10. Note that all floating point unit instruc- 
tion mnemonics begin with an F. 


All 486 microprocessor instructions operate on either 
0, 1, 2 or 3 operands, where an operand resides 
in a register, in the instruction itself or in memory. 
Most zero operand instructions (e.g., CLI, STI) take 
only one byte. One operand instructions generally 
are two bytes long. The average instruction is 3.2 
bytes long. Since the 486 microprocessor has a 32- 
byte instruction queue, an average of 10 instructions 
will be prefetched. The use of two operands permits 
the following types of common instructions: 


Register to Register 
Memory to Register 
Memory to Memory 
Immediate to Register 
Register to Memory 
Immediate to Memory 


The operands can be either 8, 16, or 32 bits long. As 
a general rule, when executing code written for the 
486 or 386 microprocessors (32-bit code), operands 
are 8 or 32 bits; when executing existing 80286 or 
8086 code (16-bit code), operands are 8 or 16 bits. 
Prefixes can be added to all instructions which over- 
ride the default length of the operands (i.e., use 32- 
bit operands for 16-bit code, or 16-bit operands for 
32-bit code). 


2.3 Memory Organization 


Introduction 


Memory on the 486 Microprocessor is divided up 
into 8-bit quantities (bytes), 16-bit quantities (words), 
and 32-bit quantities (dwords). Words are stored in 
two consecutive bytes in memory with the low-order 
byte at the lowest address, the high order byte at the 
high address. Dwords are stored in four consecutive 
bytes in memory with the low-order byte at the low- 
est address, the high-order byte at the highest ad- 
dress. The address of a word or dword is the byte 
address of the low-order byte. 


In addition to these basic data types, the 486 Micro- 
processor supports two larger units of memory: 
pages and segments. Memory can be divided up 
into one or more variable length segments, which 
can be swapped to disk or shared between pro- 
grams. Memory can also be organized into one or 
more 4 Kbyte pages. Finally, both segmentation and 
paging can be combined, gaining the advantages of 
both systems. The 486 Microprocessor supports 
both pages and segments in order to provide maxi- 
mum flexibility to the system designer. Segmentation 
and paging are complementary. Segmentation is 
useful for organizing memory in logical modules, and 
as such is a tool for the application programmer, 
while pages are useful for the system programmer 
for managing the physical memory of a system. 


2.3.1 ADDRESS SPACES 


The 486 Microprocessor has three distinct address 
spaces: logical, linear, and physical. A logical 
address (also known as a virtual address) consists 
of a selector and an offset. A selector is the con- 
tents of a segment register. An offset is formed by 
summing all of the addressing components (BASE, 
INDEX, DISPLACEMENT) discussed in Section 
2.5.3 Memory Addressing Modes into an effective 
address. Since each task on the 486 Microproces- 
sor has a maximum of 16K (2^14 - 1) selectors, and 
offsets can be 4 gigabytes (2^32 bytes), this gives a 
total of 2^46 bytes or 64 terabytes of logical address 
space per task. The programmer sees this virtual 
address space. 
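
The size of the logical address space follows directly from the selector count and the offset range. A short Python check of the arithmetic, with hypothetical constant names, and with the selector count rounded to a full 2^14:

```python
SELECTORS_PER_TASK = 2 ** 14      # 16K selectors (rounded to a full 2^14)
BYTES_PER_SEGMENT = 2 ** 32       # 4 gigabyte maximum offset range

logical_space = SELECTORS_PER_TASK * BYTES_PER_SEGMENT
terabytes = logical_space // 2 ** 40

print(logical_space == 2 ** 46, terabytes, "terabytes")
```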


The segmentation unit translates the logical ad- 
dress space into a 32-bit linear address space. If the 
paging unit is not enabled then the 32-bit linear ad- 
dress corresponds to the physical address. The 
paging unit translates the linear address space into 
the physical address space. The physical address 
is what appears on the address pins. 


The primary difference between Real Mode and Pro- 
tected Mode is how the segmentation unit performs 
the translation of the logical address into the linear 
address. In Real Mode, the segmentation unit shifts 
the selector left four bits and adds the result to the 
offset to form the linear address. In Protected 
Mode, every selector has a linear base address as- 
sociated with it. The linear base address is stored in 
one of two operating system tables (i.e., the Local 
Descriptor Table or Global Descriptor Table). The 
selector’s linear base address is added to the offset 
to form the final linear address. 
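
The two translation rules can be contrasted in a short Python sketch. It is a simplification under stated assumptions: a plain dictionary stands in for the descriptor tables, Real Mode offsets are taken as 16 bits, results wrap at 32 bits, and no limit or protection checks are modeled. All names and sample values are hypothetical.

```python
def real_mode_linear(selector, offset):
    """Real Mode: selector shifted left four bits plus the offset."""
    return ((selector & 0xFFFF) << 4) + (offset & 0xFFFF)


def protected_mode_linear(descriptor_table, selector, offset):
    """Protected Mode: the base comes from the descriptor the selector names."""
    base = descriptor_table[selector]["base"]
    return (base + offset) & 0xFFFFFFFF


print(hex(real_mode_linear(0x1234, 0x0010)))

# Hypothetical table: selector 0008H names a segment based at 00400000H.
table = {0x0008: {"base": 0x00400000}}
print(hex(protected_mode_linear(table, 0x0008, 0x1000)))
```

In Real Mode the selector value itself determines the base; in Protected Mode it is only an index, so the operating system can place a segment anywhere in the linear address space.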


Figure 2.16 shows the relationship between the vari- 
ous address spaces. 


2.3.2 SEGMENT REGISTER USAGE 


The main data structure used to organize memory is 
the segment. On the 486 Microprocessor, segments 
are variable sized blocks of linear addresses which 
have certain attributes associated with them. There 
are two main types of segments: code and data. 
Segments are of variable size and can be as small 
as 1 byte or as large as 4 gigabytes (2^32 bytes). 


In order to provide compact instruction encoding, 
and increase processor performance, instructions 
do not need to explicitly specify which segment reg- 
ister is used. A default segment register is automati- 
cally chosen according to the rules of Table 2.11 
(Segment Register Selection Rules). In general, data 
references use the selector contained in the DS reg- 
ister; Stack references use the SS register and In- 
struction fetches use the CS register. The contents 
of the Instruction Pointer provide the offset. Special 
segment override prefixes allow the explicit use of a 
given segment register, and override the implicit 
rules listed in Table 2.11. The override prefixes also 
allow the use of the ES, FS and GS segment regis- 
ters. 


There are no restrictions regarding the overlapping 
of the base addresses of any segments. Thus, all 6 
segments could have the base address set to zero 
and create a system with a four gigabyte linear ad- 
dress space. This creates a system where the virtual 
address space is the same as the linear address 
space. Further details of segmentation are dis- 
cussed in Section 4.1. 


Logical (virtual) address: selector and offset -> segmentation unit -> linear address -> paging unit (optional use) -> physical address -> physical memory 

Figure 2.16. Address Translation 


2.4 I/O Space 


The 486 Microprocessor has two distinct physical 
address spaces: Memory and I/O. Generally, periph- 
erals are placed in I/O space although the 486 Mi- 
croprocessor also supports memory-mapped periph- 
erals. The I/O space consists of 64 Kbytes; it can be 
divided into 64K 8-bit ports, 32K 16-bit ports, or 16K 
32-bit ports, or any combination of ports which add 
up to no more than 64 Kbytes. The 64K I/O address 
space refers to physical memory rather than linear 
address since I/O instructions do not go through the 
segmentation or paging hardware. The M/IO# pin 
acts as an additional address line thus allowing the 
system designer to easily determine which address 
space the processor is accessing. 


The I/O ports are accessed via the IN and OUT I/O 
instructions, with the port address supplied as an 
immediate 8-bit constant in the instruction or in the 
DX register. All 8- and 16-bit port addresses are zero 
extended on the upper address lines. The I/O in- 
structions cause the M/IO# pin to be driven low. 


I/O port addresses 00F8H through 00FFH are re- 
served for use by Intel. 
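
The port counts and the reserved range quoted above can be checked with a few lines of Python. The helper names are hypothetical; the sketch only restates the arithmetic of the 64 Kbyte I/O space.

```python
IO_SPACE_BYTES = 64 * 1024


def port_count(width_bits):
    """Number of non-overlapping ports of a given width in the I/O space."""
    return IO_SPACE_BYTES // (width_bits // 8)


def is_intel_reserved(port):
    """True for port addresses 00F8H through 00FFH."""
    return 0x00F8 <= port <= 0x00FF


def needs_dx(port):
    """Ports above 255 cannot be given as an 8-bit immediate; DX must be used."""
    return port > 0xFF


print(port_count(8), port_count(16), port_count(32))
print(is_intel_reserved(0x00FA), needs_dx(0x03F8))
```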




Table 2.11. Segment Register Selection Rules 


Type of Memory Reference | Implied (Default) Segment Use | Segment Override Prefixes Possible 
Code Fetch | CS | None 
Destination of PUSH, PUSHF, INT, CALL, PUSHA Instructions | SS | None 
Source of POP, POPA, POPF, IRET, RET Instructions | SS | None 
Destination of STOS, MOVS, REP STOS, REP MOVS Instructions (DI is Base Register) | ES | None 
Other Data References, with Effective Address Using Base Register of: 
[EAX] | DS | Any (CS, DS, SS, ES, FS, GS) 
[EBX] | DS | Any (CS, DS, SS, ES, FS, GS) 
[ECX] | DS | Any (CS, DS, SS, ES, FS, GS) 
[EDX] | DS | Any (CS, DS, SS, ES, FS, GS) 
[ESI] | DS | Any (CS, DS, SS, ES, FS, GS) 
[EDI] | DS | Any (CS, DS, SS, ES, FS, GS) 
[EBP] | SS | Any (CS, DS, SS, ES, FS, GS) 
[ESP] | SS | Any (CS, DS, SS, ES, FS, GS) 


2.5 Addressing Modes 


2.5.1 ADDRESSING MODES OVERVIEW 


The 486 Microprocessor provides a total of 11 ad- 
dressing modes for instructions to specify operands. 
The addressing modes are optimized to allow the 
efficient execution of high level languages such as C 
and FORTRAN, and they cover the vast majority of 
data references needed by high-level languages. 


2.5.2 REGISTER AND IMMEDIATE MODES 


Two of the addressing modes provide for instruc- 
tions that operate on register or immediate oper- 
ands: 


Register Operand Mode: The operand is located in 
one of the 8-, 16- or 32-bit general registers. 


Immediate Operand Mode: The operand is includ- 
ed in the instruction as part of the opcode. 


2.5.3 32-BIT MEMORY ADDRESSING MODES 


The remaining 9 modes provide a mechanism for 
specifying the effective address of an operand. The 
linear address consists of two components: the seg- 
ment base address and an effective address. The 
effective address is calculated by using combina- 
tions of the following four address elements: 


DISPLACEMENT: An 8- or 32-bit immediate value 
following the instruction. 


BASE: The contents of any general purpose regis- 
ter. The base registers are generally used by compil- 
ers to point to the start of the local variable area. 


INDEX: The contents of any general purpose regis- 
ter except for ESP. The index registers are used to 
access the elements of an array, or a string of char- 
acters. 


SCALE: The index register’s value can be multiplied 
by a scale factor, either 1, 2, 4 or 8. Scaled index 
mode is especially useful for accessing arrays or 
structures. 


Combinations of these 4 components make up the 9 
additional addressing modes. There is no perform- 
ance penalty for using any of these addressing com- 
binations, since the effective address calculation is 
pipelined with the execution of other instructions. 
The one exception is the simultaneous use of Base 
and Index components which requires one addition- 
al clock. 


As shown in Figure 2.17, the effective address (EA) 
of an operand is calculated according to the follow- 
ing formula. 


EA = Base Reg + (Index Reg * Scaling) + Displacement 
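
The formula can be exercised directly. The Python sketch below is a minimal model with hypothetical function names and sample values; it assumes 32-bit wraparound and performs no segment limit checks.

```python
def effective_address(base=0, index=0, scale=1, displacement=0):
    """EA = Base + (Index * Scale) + Displacement, truncated to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


def linear_address(segment_base, ea):
    """Linear address = segment base address + effective address."""
    return (segment_base + ea) & 0xFFFFFFFF


# Element 3 of an array of dwords that starts 20H bytes past the base register.
ea = effective_address(base=0x1000, index=3, scale=4, displacement=0x20)
print(hex(ea), hex(linear_address(0x00400000, ea)))
```

Omitting arguments yields the simpler modes: a displacement alone is direct addressing, a base alone is register indirect, and so on.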


Direct Mode: The operand’s offset is contained as 
part of the instruction as an 8-, 16- or 32-bit dis- 
placement. | 
EXAMPLE: INC Word PTR [500] 


Register Indirect Mode: A BASE register contains 


the address of the operand. 
EXAMPLE: MOV [ECX], EDX 


Based Mode: A BASE register’s contents is added
to a DISPLACEMENT to form the operand’s offset.
EXAMPLE: MOV ECX, [EAX + 24]


Index Mode: An INDEX register’s contents is added
to a DISPLACEMENT to form the operand’s offset.
EXAMPLE: ADD EAX, TABLE[ESI]


Scaled Index Mode: An INDEX register’s contents is
multiplied by a scaling factor, and the result is
added to a DISPLACEMENT to form the operand’s offset.
EXAMPLE: IMUL EBX, TABLE[ESI*4],7


Based Index Mode: The contents of a BASE register 
is added to the contents of an INDEX register to 
form the effective address of an operand. 
EXAMPLE: MOV EAX, [ESI] [EBX] 


Based Scaled Index Mode: The contents of an INDEX
register is multiplied by a SCALING factor and
the result is added to the contents of a BASE
register to obtain the operand’s offset.
EXAMPLE: MOV ECX, [EDX*8] [EAX]


Figure 2.17. Addressing Mode Calculations
[Diagram labels: segment register, descriptor registers
(access rights, limit, base address), base register,
index register, effective address, segment base address,
selected segment, linear address]


Based Index Mode with Displacement: The contents
of an INDEX register, the contents of a BASE
register and a DISPLACEMENT are all summed
together to form the operand’s offset.
EXAMPLE: ADD EDX, [ESI] [EBP + 00FFFFF0H]


Based Scaled Index Mode with Displacement: The
contents of an INDEX register are multiplied by a
SCALING factor, and the result is added to the
contents of a BASE register and a DISPLACEMENT to
form the operand’s offset.
EXAMPLE: MOV EAX, LOCALTABLE[EDI*4] [EBP + 80]


2.5.4 DIFFERENCES BETWEEN 16- AND 32-BIT
ADDRESSES


In order to provide software compatibility with the
80286 and the 8086, the 486 Microprocessor can
execute 16-bit instructions in Real and Protected
Modes. The processor determines the size of the
instructions it is executing by examining the D bit in
the CS segment Descriptor. If the D bit is 0 then all
operand lengths and effective addresses are assumed
to be 16 bits long. If the D bit is 1 then the
default length for operands and addresses is 32 bits.
In Real Mode the default size for operands and
addresses is 16 bits.


Regardless of the default precision of the operands
or addresses, the 486 Microprocessor is able to
execute either 16- or 32-bit instructions. This is
specified via the use of override prefixes. Two
prefixes, the Operand Size Prefix and the Address
Length Prefix, override the value of the D bit on an
individual instruction basis. These prefixes are
automatically added by Intel assemblers.


Example: The processor is executing in Real Mode
and the programmer needs to access the EAX
register. The assembler code for this might be MOV
EAX, 32-bit MEMORYOP. The ASM486 Macro Assembler
automatically determines that an Operand Size
Prefix is needed and generates it.


Example: The D bit is 0, and the programmer wishes
to use Scaled Index addressing mode to access an
array. The Address Length Prefix allows the use of
MOV DX, TABLE[ESI*2]. The assembler uses an
Address Length Prefix since, with D=0, the default
addressing mode is 16 bits.


Example: The D bit is 1, and the program wants to
store a 16-bit quantity. The Operand Length Prefix is
used to specify only a 16-bit value: MOV MEM16,
DX.


The Operand Length and Address Length Prefixes
can be applied separately or in combination to
any instruction. The Address Length Prefix does not
allow addresses over 64 Kbytes to be accessed in
Real Mode. A memory address which exceeds
FFFFH will result in a General Protection Fault. An
Address Length Prefix only allows the use of the
additional 486 Microprocessor addressing modes.
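

The interaction between the D bit and the two override prefixes can be modeled as a simple toggle. The sketch below is illustrative only; the function name is hypothetical, and the prefix byte values 66H (operand size) and 67H (address size) are the standard x86 encodings, which this section does not state and are assumed here.

```python
OPERAND_SIZE_PREFIX = 0x66  # assumed standard encoding, not given in the text
ADDRESS_SIZE_PREFIX = 0x67  # assumed standard encoding, not given in the text


def effective_sizes(d_bit, prefixes=()):
    """Return (operand_bits, address_bits) for one instruction.

    d_bit    -- D bit of the CS descriptor (0 in Real Mode)
    prefixes -- prefix bytes present on this instruction
    """
    default = 32 if d_bit else 16
    other = 16 if d_bit else 32
    operand = other if OPERAND_SIZE_PREFIX in prefixes else default
    address = other if ADDRESS_SIZE_PREFIX in prefixes else default
    return operand, address


print(effective_sizes(0))                         # 16-bit defaults
print(effective_sizes(0, [OPERAND_SIZE_PREFIX]))  # 32-bit operand, 16-bit address
print(effective_sizes(1, [OPERAND_SIZE_PREFIX]))  # 16-bit operand, 32-bit address
```

Each prefix selects the non-default size for a single instruction, so the same byte means "32-bit" in a 16-bit segment and "16-bit" in a 32-bit segment.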


When executing 32-bit code, the 486 Microprocessor
uses either 8- or 32-bit displacements, and any
register can be used as a base or index register.
When executing 16-bit code, the displacements are
either 8 or 16 bits, and the base and index registers
conform to the 80286 model. Table 2.12 illustrates
the differences.


2.6 Data Formats


2.6.1 DATA TYPES


The 486 microprocessor can support a wide variety
of data types. In the following descriptions, the
on-chip floating point unit (FPU) consists of the floating
point registers. The central processing unit (CPU)
consists of the base architecture registers.


2.6.1.1 Unsigned Data Types


The FPU does not support unsigned data types. Refer
to Table 2.13.

Byte: Unsigned 8-bit quantity
Word: Unsigned 16-bit quantity
Dword: Unsigned 32-bit quantity


The least significant bit (LSB) in a byte is bit 0, and
the most significant bit is 7.


Table 2.12. BASE and INDEX Registers for 16- and 32-Bit Addresses

                  16-Bit Addressing   32-Bit Addressing

BASE REGISTER     BX, BP              Any 32-bit GP Register
INDEX REGISTER    SI, DI              Any 32-bit GP Register
                                      Except ESP
SCALE FACTOR      none                1, 2, 4, 8
DISPLACEMENT      0, 8, 16 bits       0, 8, 32 bits


2.6.1.2 Signed Data Types 


All signed data types assume 2’s complement nota- 
tion. The signed data types contain two fields, a sign 
bit and a magnitude. The sign bit is the most signifi- 
cant bit (MSB). The number is negative if the sign bit 
is 1. If the sign bit is 0, the number is positive. The 
magnitude field consists of the remaining bits in the 
number. Refer to Table 2.13. 


8-bit Integer: Signed 8-bit quantity 

16-bit Integer: Signed 16-bit quantity 
32-bit Integer: Signed 32-bit quantity 
64-bit Integer: Signed 64-bit quantity 


The FPU only supports 16-, 32- and 64-bit integers. 
The CPU only supports 8-, 16- and 32-bit integers. 
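

As a hedged illustration of the 2’s complement convention described above, the following Python sketch interprets a raw bit pattern as a signed integer of a given width. The helper name is hypothetical and is not part of any Intel tool.

```python
def to_signed(raw, bits):
    """Interpret the low `bits` bits of `raw` as a 2's complement integer."""
    mask = (1 << bits) - 1
    raw &= mask
    sign_bit = 1 << (bits - 1)      # the most significant bit is the sign bit
    if raw & sign_bit:              # sign bit 1: the number is negative
        return raw - (1 << bits)
    return raw                      # sign bit 0: the number is positive


print(to_signed(0x7F, 8))
print(to_signed(0xFF, 8))
print(to_signed(0x8000, 16))
```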


2.6.1.3 Floating Point Data Types 


Floating point data types in the 486 microprocessor
contain three fields: sign, significand and exponent.
The sign field is one bit and is the MSB of the
floating point number. The number is negative if the sign
bit is 1. If the sign bit is 0, the number is positive. The
significand gives the significant bits of the number.
The exponent field contains the power of 2 needed
to scale the significand. Refer to Table 2.13.


Only the FPU supports floating point data types.

Single Precision Real: 23-bit significand and 8-bit exponent. 32 bits total.
Double Precision Real: 52-bit significand and 11-bit exponent. 64 bits total.
Extended Precision Real: 64-bit significand and 15-bit exponent. 80 bits total.
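

The field widths listed above for the single and double precision formats can be checked with a short Python sketch. It assumes the IEEE 754 layout used by Python's `struct` module (sign in the most significant bit, then a biased exponent, then the significand), which is taken here to match the formats in Table 2.13; the function names are hypothetical.

```python
import struct


def single_fields(value):
    """Split a single precision real into (sign, exponent, significand)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", value))
    sign = bits >> 31                 # 1 bit
    exponent = (bits >> 23) & 0xFF    # 8 bits, stored with a bias
    significand = bits & 0x7FFFFF     # 23 bits
    return sign, exponent, significand


def double_fields(value):
    """Split a double precision real into (sign, exponent, significand)."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", value))
    sign = bits >> 63                        # 1 bit
    exponent = (bits >> 52) & 0x7FF          # 11 bits, stored with a bias
    significand = bits & ((1 << 52) - 1)     # 52 bits
    return sign, exponent, significand


print(single_fields(1.0))
print(single_fields(-2.5))
print(double_fields(1.0))
```

The 80-bit extended format is not shown because `struct` has no portable code for it.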
2.6.1.4 BCD Data Types


The 486 microprocessor supports packed and unpacked
binary coded decimal (BCD) data types. A packed BCD
data type contains two digits per byte: the lower
digit is in bits 0-3 and the upper digit in bits 4-7.
An unpacked BCD data type contains 1 digit per byte
stored in bits 0-3.


The CPU supports 8-bit packed and unpacked BCD
data types. The FPU only supports 80-bit packed
BCD data types. Refer to Table 2.13.
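

A minimal sketch of the 8-bit packed BCD layout described above, written in Python for illustration. The helper names are hypothetical; only the bit positions (lower digit in bits 0-3, upper digit in bits 4-7) come from the text.

```python
def pack_bcd(value):
    """Encode a decimal value 0-99 as one packed BCD byte."""
    if not 0 <= value <= 99:
        raise ValueError("one packed BCD byte holds two decimal digits")
    upper, lower = divmod(value, 10)
    return (upper << 4) | lower      # upper digit in bits 4-7, lower in bits 0-3


def unpack_bcd(byte):
    """Decode one packed BCD byte back to its decimal value."""
    upper = (byte >> 4) & 0x0F
    lower = byte & 0x0F
    if upper > 9 or lower > 9:
        raise ValueError("not a valid BCD byte")
    return upper * 10 + lower


print(hex(pack_bcd(47)))
print(unpack_bcd(0x47))
```

An unpacked BCD byte is the special case where only bits 0-3 carry a digit.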

2.6.1.5 String Data Types 

A string data type is a contiguous sequence of bits,
bytes, words or dwords. A string may contain
between 1 byte and 4 Gbytes. Refer to Table 2.14.


String data types are only supported by the CPU.

Byte String: Contiguous sequence of bytes.
Word String: Contiguous sequence of words.
Dword String: Contiguous sequence of dwords.
Bit String: A set of contiguous bits. In the 486
microprocessor bit strings can be up to 4 gigabits long.

2.6.1.6 ASCII Data Types 


The 486 microprocessor supports ASCII (American
Standard Code for Information Interchange) strings
and can perform arithmetic operations (such as
addition and division) on ASCII data. Only the CPU
can operate on ASCII data. Refer to Table 2.14.
Table 2.13. i486™ Microprocessor Data Types

                           Supported by      Supported
Data Format                Base Registers    by FPU

Byte                       x
Word                       x
Dword                      x
8-Bit Integer              x
16-Bit Integer             x                 x
32-Bit Integer             x                 x
64-Bit Integer                               x
8-Bit Unpacked BCD         x
8-Bit Packed BCD           x
80-Bit Packed BCD                            x
Single Precision Real                        x
Double Precision Real                        x
Extended Precision Real                      x

[Bit-layout diagram: a Byte has range 0-255 and 8 bits of
precision; integers are two’s complement with the sign bit in
the most significant bit; unpacked BCD holds one BCD digit per
byte and packed BCD two BCD digits per byte; the real formats
hold a sign bit, a biased exponent and a significand.]
Table 2.14. String and ASCII Data Types

String Data Types

Byte String:    bytes N ... 1, 0 at addresses A+N ... A+1, A
Word String:    words N ... 1, 0; word N occupies addresses
                A+2N+1 and A+2N
Dword String:   dwords N ... 1, 0; dword N occupies addresses
                A+4N+3 through A+4N
Bit String:     bit offsets -2,147,483,648 through +2,147,483,647,
                covering byte addresses A-268,435,456 through
                A+268,435,455

ASCII Data Types

ASCII Character
2.6.1.7 Pointer Data Types 


A pointer data type contains a value that gives the
address of a piece of data. The 486 microprocessor
supports two types of pointers. Refer to Table 2.15.

48-bit Pointer: 16-bit selector and 32-bit offset
32-bit Pointer: 32-bit offset


Table 2.15. Pointer Data Types

Data Format       Fields

48-Bit Pointer    Selector (16 bits), Offset (32 bits)
32-Bit Pointer    Offset (32 bits)
2.6.2 LITTLE ENDIAN vs BIG ENDIAN
DATA FORMATS


The 486 microprocessor, as well as all other members
of the 86 architecture, uses the “little-endian”
method for storing data types that are larger than
one byte. Words are stored in two consecutive bytes
in memory with the low-order byte at the lowest
address and the high-order byte at the high address.
Dwords are stored in four consecutive bytes in
memory with the low-order byte at the lowest address
and the high-order byte at the highest address. The
address of a word or dword data item is the byte
address of the low-order byte.
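

The byte ordering described above can be illustrated with a short Python sketch. The function names are hypothetical; `bswap32` models the kind of four-byte reversal that converts a dword between the two orderings and is not a description of how the processor implements it.

```python
def store_little_endian(value, size):
    """Return the bytes of `value` as laid out in memory, lowest address first."""
    return value.to_bytes(size, "little")


def bswap32(value):
    """Reverse the four bytes of a 32-bit value."""
    return int.from_bytes(value.to_bytes(4, "little"), "big")


dword = 0x12345678
print(store_little_endian(dword, 4).hex())   # low-order byte 78 comes first
print(hex(bswap32(dword)))
```

Reading the little-endian bytes of `bswap32(dword)` from lowest to highest address gives the same byte sequence as the big-endian layout of `dword`.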


Figure 2.18 illustrates the differences between the
big-endian and little-endian formats for dwords. The
32 bits of data are shown with the low-order bit
numbered bit 0 and the high-order bit numbered 31.
Big-endian data is stored with the high-order bits at the
lowest addressed byte. Little-endian data is stored
with the high-order bits in the highest addressed
byte.


The 486 microprocessor has two instructions which
can convert 16- or 32-bit data between the two byte
orderings. BSWAP (byte swap) handles four byte
values and XCHG (exchange) handles two byte values.


Figure 2.18. Big vs Little Endian Memory Format
[Diagram panel: Dword in Big-Endian Memory Format]


2.7 Interrupts


2.7.1 INTERRUPTS AND EXCEPTIONS


Interrupts and exceptions alter the normal program
flow, in order to handle external events, to report
errors or exceptional conditions. The difference
between interrupts and exceptions is that interrupts are
used to handle asynchronous external events while
exceptions handle instruction faults. Although a
program can generate a software interrupt via an INT N
instruction, the processor treats software interrupts
as exceptions.


Hardware interrupts occur as the result of an external
event and are classified into two types: maskable
or non-maskable. Interrupts are serviced after the
execution of the current instruction. After the
interrupt handler is finished servicing the interrupt,
execution proceeds with the instruction immediately
after the interrupted instruction. Sections 2.7.3 and
2.7.4 discuss the differences between Maskable and
Non-Maskable interrupts.


Exceptions are classified as faults, traps, or aborts
depending on the way they are reported, and whether
or not restart of the instruction causing the exception
is supported. Faults are exceptions that are detected
and serviced before the execution of the
faulting instruction. A fault would occur in a virtual
memory system, when the processor referenced a
page or a segment which was not present. The
operating system would fetch the page or segment from
disk, and then the 486 Microprocessor would restart
the instruction. Traps are exceptions that are reported
immediately after the execution of the instruction
which caused the problem. User defined interrupts
are examples of traps. Aborts are exceptions which
do not permit the precise location of the instruction
causing the exception to be determined. Aborts are
used to report severe errors, such as a hardware
error, or illegal values in system tables.


Thus, when an interrupt service routine has been 
completed, execution proceeds from the instruction 
immediately following the interrupted instruction. On 
the other hand, the return address from an excep- 
tion fault routine will always point at the instruction 
causing the exception and include any leading in- 
struction prefixes. Table 2.16 summarizes the possi- 
ble interrupts for the 486 Microprocessor and shows 
where the return address points. 


The 486 Microprocessor has the ability to handle up
to 256 different interrupts/exceptions. In order to
service the interrupts, a table with up to 256 interrupt
vectors must be defined. The interrupt vectors are
simply pointers to the appropriate interrupt service
routine. In Real Mode (see Section 3.1), the vectors
are 4 byte quantities, a Code Segment plus a 16-bit
offset; in Protected Mode, the interrupt vectors are 8
byte quantities, which are put in an Interrupt
Descriptor Table (see Section 4.3.3.4). Of the 256 possible
interrupts, 32 are reserved for use by Intel; the
remaining 224 are free to be used by the system
designer.
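

The table sizes implied by these figures can be worked out with a small Python sketch. It is illustrative only: the function names are hypothetical, and the Real Mode table is assumed to start at address 0, as Section 3.3 describes for the reserved interrupt vector area.

```python
NUM_VECTORS = 256
REAL_MODE_ENTRY_SIZE = 4        # bytes per vector in Real Mode
PROTECTED_MODE_ENTRY_SIZE = 8   # bytes per vector in Protected Mode
INTEL_RESERVED = 32


def real_mode_vector_address(n):
    """Address of vector n, assuming the table starts at address 0."""
    if not 0 <= n < NUM_VECTORS:
        raise ValueError("vector number must be 0-255")
    return n * REAL_MODE_ENTRY_SIZE


def table_size(entry_size):
    """Size in bytes of a full 256-entry table."""
    return NUM_VECTORS * entry_size


print(hex(real_mode_vector_address(2)))        # NMI uses vector 2
print(hex(table_size(REAL_MODE_ENTRY_SIZE)))   # Real Mode table size
print(table_size(PROTECTED_MODE_ENTRY_SIZE))   # Protected Mode table size
print(NUM_VECTORS - INTEL_RESERVED)            # vectors left for the designer
```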


2.7.2 INTERRUPT PROCESSING 


When an interrupt occurs the following actions
happen. First, the current program address and the
Flags are saved on the stack to allow resumption of
the interrupted program. Next, an 8-bit vector is
supplied to the 486 Microprocessor which identifies the
appropriate entry in the interrupt table. The table
contains the starting address of the interrupt service
routine. Then, the user supplied interrupt service
routine is executed. Finally, when an IRET instruction
is executed the old processor state is restored
and program execution resumes at the appropriate
instruction.


The 8-bit interrupt vector is supplied to the 486 Mi- 
croprocessor in several different ways: exceptions 
supply the interrupt vector internally; software INT 
instructions contain or imply the vector; maskable 
hardware interrupts supply the 8-bit vector via the 
interrupt acknowledge bus sequence. Non-Maska- 
ble hardware interrupts are assigned to interrupt 
vector 2. 
2.7.3 MASKABLE INTERRUPT


Maskable interrupts are the most common way used
by the 486 Microprocessor to respond to asynchronous
external hardware events. A hardware interrupt
occurs when the INTR input is pulled high and the
Interrupt Flag bit (IF) is enabled. The processor only
responds to interrupts between instructions (REPeat
String instructions have an “interrupt window”
between memory moves, which allows interrupts
during long string moves). When an interrupt occurs the
processor reads an 8-bit vector supplied by the
hardware which identifies the source of the interrupt
(one of 224 user defined interrupts). The exact
nature of the interrupt sequence is discussed in
Section 7.2.10.


Table 2.16. Interrupt Vector Assignments

[Column headings: Interrupt Number; Instruction Which Can
Cause Exception; Return Address Points to Faulting Instruction]

[Legible entries: Intel Reserved; Segment Register Instructions;
Stack References; Alignment Check Interrupt, caused by
Unaligned Memory Access, FAULT]

*Some debug exceptions may report both traps on the previous instruction, and faults on the next instruction.
The IF bit in the EFLAGS register is reset when an
interrupt is being serviced. This effectively disables
servicing additional interrupts during an interrupt
service routine. However, the IF may be set explicitly
by the interrupt handler, to allow the nesting of
interrupts. When an IRET instruction is executed the
original state of the IF is restored.


2.7.4 NON-MASKABLE INTERRUPT


Non-maskable interrupts provide a method of servic- 
ing very high priority interrupts. A common example 
of the use of a non-maskable interrupt (NMI) would 
be to activate a power failure routine. When the NMI 
input is pulled high it causes an interrupt with an 
internally supplied vector value of 2. Unlike a normal 
hardware interrupt, no interrupt acknowledgment se- 
quence is performed for an NMI. 


While executing the NMI servicing procedure, the
486 Microprocessor will not service further NMI
requests until an interrupt return (IRET) instruction is
executed or the processor is reset. If NMI occurs
while currently servicing an NMI, its presence will be
saved for servicing after executing the first IRET
instruction. The IF bit is cleared at the beginning of an
NMI interrupt to inhibit further INTR interrupts.


2.7.5 SOFTWARE INTERRUPTS


A third type of interrupt/exception for the 486
Microprocessor is the software interrupt. An INT n
instruction causes the processor to execute the interrupt
service routine pointed to by the nth vector in the
interrupt table.


A special case of the two byte software interrupt INT
n is the one byte INT 3, or breakpoint interrupt. By
inserting this one byte instruction in a program, the
user can set breakpoints in his program as a
debugging tool.


A final type of software interrupt is the single step
interrupt. It is discussed in Section 9.2.


2.7.6 INTERRUPT AND EXCEPTION PRIORITIES


Interrupts are externally-generated events. Maskable
Interrupts (on the INTR input) and Non-Maskable
Interrupts (on the NMI input) are recognized at
instruction boundaries. When NMI and maskable
INTR are both recognized at the same instruction
boundary, the 486 Microprocessor invokes the NMI
service routine first. If, after the NMI service routine
has been invoked, maskable interrupts are still
enabled, then the 486 Microprocessor will invoke the
appropriate interrupt service routine.


Table 2.17a. i486™ Microprocessor Priority for
Invoking Service Routines in Case of
Simultaneous External Interrupts

1. NMI
2. INTR


Exceptions are internally-generated events.
Exceptions are detected by the 486 Microprocessor if, in
the course of executing an instruction, the 486
Microprocessor detects a problematic condition. The
486 Microprocessor then immediately invokes the
appropriate exception service routine. The state of
the 486 Microprocessor is such that the instruction
causing the exception can be restarted. If the
exception service routine has taken care of the
problematic condition, the instruction will execute without
causing the same exception.


It is possible for a single instruction to generate
several exceptions (for example, transferring a single
operand could generate two page faults if the
operand location spans two “not present” pages).
However, only one exception is generated upon each
attempt to execute the instruction. Each exception
service routine should correct its corresponding
exception, and restart the instruction. In this manner,
exceptions are serviced until the instruction
executes successfully.


As the 486 Microprocessor executes instructions, it
follows a consistent cycle in checking for
exceptions, as shown in Table 2.17b. This cycle is
repeated as each instruction is executed, and occurs in
parallel with instruction decoding and execution.
Table 2.17b. Sequence of Exception Checking


Consider the case of the 486 Microprocessor
having just completed an instruction. It then
performs the following checks before reaching the
point where the next instruction is completed:


1. Check for Exception 1 Traps from the instruction
   just completed (single-step via Trap Flag,
   or Data Breakpoints set in the Debug Registers).

2. Check for Exception 1 Faults in the next
   instruction (Instruction Execution Breakpoint set
   in the Debug Registers for the next instruction).

3. Check for external NMI and INTR.

4. Check for Segmentation Faults that prevented
   fetching the entire next instruction (exceptions
   11 or 13).

5. Check for Page Faults that prevented fetching
   the entire next instruction (exception 14).

6. Check for Faults decoding the next instruction
   (exception 6 if illegal opcode; exception 6 if in
   Real Mode or in Virtual 8086 Mode and
   attempting to execute an instruction for
   Protected Mode only (see Section 4.6.4); or exception
   13 if instruction is longer than 15 bytes, or
   privilege violation in Protected Mode (i.e., not at
   IOPL or at CPL=0)).

7. If WAIT opcode, check if TS=1 and MP=1
   (exception 7 if both are 1).

8. If opcode for Floating Point Unit, check if
   EM=1 or TS=1 (exception 7 if either is 1).

9. If opcode for Floating Point Unit (FPU), check
   FPU error status (exception 16 if error status is
   asserted).

10. Check in the following order for each memory
    reference required by the instruction:

    a. Check for Segmentation Faults that prevent
       transferring the entire memory quantity
       (exceptions 11, 12, 13).

    b. Check for Page Faults that prevent
       transferring the entire memory quantity
       (exception 14).


NOTE:
The order stated supports the concept of the
paging mechanism being “underneath” the
segmentation mechanism. Therefore, for any given
code or data reference in memory, segmentation
exceptions are generated before paging
exceptions are generated.


2.7.7 INSTRUCTION RESTART


The 486 Microprocessor fully supports restarting all 
instructions after faults. If an exception is detected in 
the instruction to be executed (exception categories 
4 through 10 in Table 2.17b), the 486 Microproces- 
sor invokes the appropriate exception service rou- 
tine. The 486 Microprocessor is in a state that per- 
mits restart of the instruction, for all cases but those 
in Table 2.17c. Note that all such cases are easily 
avoided by proper design of the operating system. 


Table 2.17c. Conditions Preventing 
Instruction Restart 


An instruction causes a task switch to a task
whose Task State Segment is partially “not
present”. (An entirely “not present” TSS is
restartable.) Partially present TSS’s can be avoided
either by keeping the TSS’s of such tasks
present in memory, or by aligning TSS segments
to reside entirely within a single 4K page (for TSS
segments of 4 Kbytes or less).


NOTE: 
These conditions are avoided by using the oper- 
ating system designs mentioned in this table. 


2.7.8 DOUBLE FAULT 


A Double Fault (exception 8) results when the proc- 
essor attempts to invoke an exception service rou- 
tine for the segment exceptions (10, 11, 12 or 13), 
but in the process of doing so, detects an exception 
other than a Page Fault (exception 14). 


A Double Fault (exception 8) will also be generated
when the processor attempts to invoke the Page
Fault (exception 14) service routine, and detects an
exception other than a second Page Fault. In any
functional system, the entire Page Fault service
routine must remain “present” in memory.


When a Double Fault occurs, the 486 Microprocessor
invokes the exception service routine for
exception 8.


2.7.9 FLOATING POINT INTERRUPT VECTORS 


Several interrupt vectors of the 486 microprocessor
are used to report exceptional conditions while
executing numeric programs in either real or protected
mode. Table 2.18 shows these interrupts and their
causes.


Table 2.18. Interrupt Vectors Used by FPU

Interrupt
Number     Cause of Interrupt

7          A Floating Point instruction was encountered when EM or TS of the
           486™ processor control register zero (CR0) was set. EM = 1
           indicates that software emulation of the instruction is required.
           When TS is set, either a Floating Point or WAIT instruction causes
           interrupt 7. This indicates that the current FPU context may not
           belong to the current task.

13         The first word or doubleword of a numeric operand is not entirely
           within the limit of its segment. The return address pushed onto the
           stack of the exception handler points at the Floating Point
           instruction that caused the exception, including any prefixes. The
           FPU has not executed this instruction; the instruction pointer and
           data pointer register refer to a previous, correctly executed
           instruction.

16         The previous numerics instruction caused an unmasked exception. The
           address of the faulty instruction and the address of its operand
           are stored in the instruction pointer and data pointer registers.
           Only Floating Point and WAIT instructions can cause this interrupt.
           The 486™ processor return address pushed onto the stack of the
           exception handler points to a WAIT or Floating Point instruction
           (including prefixes). This instruction can be restarted after
           clearing the exception condition in the FPU. The FNINIT, FNCLEX,
           FNSTSW, FNSTENV, and FNSAVE instructions cannot cause this
           interrupt.


3.0 REAL MODE ARCHITECTURE


3.1 Real Mode Introduction


When the processor is reset or powered up it is
initialized in Real Mode. Real Mode has the same base
architecture as the 8086, but allows access to the
32-bit register set of the 486 Microprocessor. The
addressing mechanism, memory size, and interrupt
handling are all identical to the Real Mode on the
80286.


All of the 486 Microprocessor instructions are
available in Real Mode (except those instructions listed
in Section 4.6.4). The default operand size in Real
Mode is 16 bits, just like the 8086. In order to use the
32-bit registers and addressing modes, override
prefixes must be used. In addition, the segment
size on the 486 Microprocessor in Real Mode is
64 Kbytes so 32-bit effective addresses must have a
value less than 0000FFFFH. The primary purpose of
Real Mode is to set up the processor for Protected
Mode Operation.


Figure 3.1. Real Address Mode Addressing
[Diagram labels: offset (bits 15-0), segment selector,
segment base, max limit fixed at 64K in Real Mode,
selected segment, memory operand]


The LOCK prefix on the 486 Microprocessor, even in
Real Mode, is more restrictive than on the 80286.
This is due to the addition of paging on the 486
Microprocessor in Protected Mode and Virtual 8086
Mode. Paging makes it impossible to guarantee that
repeated string instructions can be LOCKed. The
486 Microprocessor can’t require that all pages
holding the string be physically present in memory.
Hence, a Page Fault (exception 14) might have to be
taken during the repeated string instruction.
Therefore the LOCK prefix can’t be supported during
repeated string instructions.


These are the only instruction forms where the
LOCK prefix is legal on the 486 Microprocessor:

                                     Operands
Opcode                               (Dest, Source)

BIT Test and
SET/RESET/COMPLEMENT                 Mem, Reg/immed
XCHG                                 Reg, Mem
XCHG                                 Mem, Reg
ADD, OR, ADC, SBB,
AND, SUB, XOR                        Mem, Reg/immed
NOT, NEG, INC, DEC                   Mem
CMPXCHG, XADD                        Mem, Reg


An exception 6 will be generated if a LOCK prefix is
placed before any instruction form or opcode not
listed above. The LOCK prefix allows indivisible
read/modify/write operations on memory operands
using the instructions above. For example, even the
ADD Reg, Mem is not LOCKable, because the Mem
operand is not the destination (and therefore no
memory read/modify/write operation is being performed).


Since, on the 486 Microprocessor, repeated string
instructions are not LOCKable, it is not possible to
LOCK the bus for a long period of time. Therefore,
the LOCK prefix is not IOPL-sensitive on the 486
Microprocessor. The LOCK prefix can be used at
any privilege level, but only on the instruction forms
listed above.


3.2 Memory Addressing 


In Real Mode the maximum memory size is limited to 
1 megabyte. Thus, only address lines A2-A19 are 
active. (Exception, after RESET address lines A20- 
A31 are high during CS-relative memory cycles until 
an intersegment jump or call is executed (see Sec- 
tion 6.5)). 


Since paging is not allowed in Real Mode the linear
addresses are the same as physical addresses.
Physical addresses are formed in Real Mode by
adding the contents of the appropriate segment
register, shifted left by four bits, to an effective
address. This addition results in a physical address
from 00000000H to 0010FFEFH. This is compatible
with 80286 Real Mode. Since segment registers are
shifted left by 4 bits, Real Mode segments always
start on 16 byte boundaries.
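

The Real Mode address calculation above can be sketched in Python as follows. The function name is hypothetical and the sketch models only the arithmetic described in the text, not the bus behavior.

```python
def real_mode_physical(segment, offset):
    """Physical address = (segment register shifted left 4 bits) + offset."""
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment register is 16 bits")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("Real Mode offsets may not exceed FFFFH")
    return (segment << 4) + offset


print(hex(real_mode_physical(0x1234, 0x0010)))
print(hex(real_mode_physical(0xFFFF, 0xFFFF)))   # highest reachable address

# Segments start on 16 byte boundaries, so different segment:offset
# pairs can name the same physical byte.
print(real_mode_physical(0x1000, 0x0010) == real_mode_physical(0x1001, 0x0000))
```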


All segments in Real Mode are exactly 64 Kbytes
long, and may be read, written, or executed. The 486
Microprocessor will generate an exception 13 if a
data operand or instruction fetch occurs past the
end of a segment (i.e., if an operand has an offset
greater than FFFFH, for example a word with a low
byte at FFFFH and the high byte at 0000H).


Segments may be overlapped in Real Mode. Thus, if
a particular segment does not use all 64 Kbytes,
another segment can be overlaid on top of the
unused portion of the previous segment. This allows
the programmer to minimize the amount of physical
memory needed for a program.


3.3 Reserved Locations 


There are two fixed areas in memory which are
reserved in Real address mode: the system initialization
area and the interrupt table area. Locations 00000H
through 003FFH are reserved for interrupt vectors.
Each one of the 256 possible interrupts has a 4-byte
jump vector reserved for it. Locations FFFFFFF0H
through FFFFFFFFH are reserved for system
initialization.


3.4 Interrupts 


Many of the exceptions shown in Table 2.16 and
discussed in Section 2.7 are not applicable to Real
Mode operation; in particular, exceptions 10, 11, 14
and 17 will not happen in Real Mode. Other exceptions
have slightly different meanings in Real Mode; Table
3.1 identifies these exceptions.


3.5 Shutdown and Halt 


The HLT instruction stops program execution and 
prevents the processor from using the local bus until 
restarted. Either NMI, INTR with interrupts enabled 
(IF =1), or RESET will force the 486 Microprocessor 
out of halt. If interrupted, the saved CS:IP will point 
to the next instruction after the HLT. 


As in the case in protected mode, the shutdown will 
occur when a severe error is detected that prevents 
further processing. In Real Mode, shutdown can oc- 
Cur under two conditions: 


An interrupt or an exception occur (exceptions 8 or 
13) and the interrupt vector is larger than the Inter- 
rupt Descriptor Table (i.e., there is not an interrupt 
handler for the interrupt). 


A CALL, INT or PUSH instruction attempts to wrap around the
stack segment when SP is not even (i.e., pushing a value on
the stack when SP = 0001, resulting in a stack segment
greater than FFFFH).


An NMI input can bring the processor out of shutdown if the
Interrupt Descriptor Table limit is large enough to contain
the NMI interrupt vector (at least 0017H) and the stack has
enough room to contain the vector and flag information
(i.e., SP is greater than 0005H). If these conditions are
not met, the i486 CPU is unable to execute the NMI and
executes another shutdown cycle. In this case, the
processor remains in the shutdown and can only exit via the
RESET input.


Table 3.1. Exceptions with Different Meanings in Real Mode (see Table 2.16)

Function                         Interrupt  Related Instructions               Return Address
                                 Number                                        Location
Interrupt table limit too small  8          INT Vector is not within table     Before Instruction
                                            limit
CS, DS, ES, FS, GS Segment       13         Word memory reference beyond       Before Instruction
overrun exception                           offset = FFFFH. An attempt to
                                            execute past the end of CS segment
SS Segment overrun exception     12         Stack Reference beyond             Before Instruction
                                            offset = FFFFH


4.0 PROTECTED MODE ARCHITECTURE


4.1 Introduction


The complete capabilities of the 486 Microprocessor are
unlocked when the processor operates in Protected Virtual
Address Mode (Protected Mode). Protected Mode vastly
increases the linear address space to four gigabytes
(2^32 bytes) and allows the running of virtual memory
programs of almost unlimited size (64 terabytes or
2^46 bytes). In addition, Protected Mode allows the 486
Microprocessor to run all of the existing 8086, 80286 and
386 microprocessor software, while providing a
sophisticated memory management and a hardware-assisted
protection mechanism. Protected Mode allows the use of
additional instructions especially optimized for
supporting multitasking operating systems. The base
architecture of the 486 Microprocessor remains the same;
the registers, instructions, and addressing modes described
in the previous sections are retained. The main difference
between Protected Mode and Real Mode from a programmer’s
view is the increased address space and a different
addressing mechanism.


4.2 Addressing Mechanism


Like Real Mode, Protected Mode uses two components to form
the logical address: a 16-bit selector is used to determine
the linear base address of a segment, and the base address
is added to a 32-bit effective address to form a 32-bit
linear address. The linear address is then either used as
the 32-bit physical address, or if paging is enabled the
paging mechanism maps the 32-bit linear address into a
32-bit physical address.


The difference between the two modes lies in calculating
the base address. In Protected Mode the selector is used to
specify an index into an operating system defined table
(see Figure 4.1). The table contains the 32-bit base
address of a given segment. The physical address is formed
by adding the base address obtained from the table to the
offset.


Paging provides an additional memory management mechanism
which operates only in Protected Mode. Paging provides a
means of managing the very large segments of the 486
Microprocessor. As such, paging operates beneath
segmentation. The paging mechanism translates the protected
linear address which comes from the segmentation unit into
a physical address. Figure 4.2 shows the complete 486
Microprocessor addressing mechanism with paging enabled.
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Figure 4.1. Protected Mode Addressing 
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Figure 4.2. Paging and Segmentation 
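

The table-based address formation described in Section 4.2 can be sketched as follows. This is only an illustration: `SEGMENT_BASES` is a hypothetical stand-in for an operating-system-defined descriptor table, its contents are made up, and limit checks, protection checks and paging are deliberately left out.

```python
# Hypothetical descriptor table: maps a table index to a 32-bit segment base.
SEGMENT_BASES = {
    1: 0x00100000,
    2: 0x00200000,
}


def linear_address(index, offset):
    """Form a 32-bit linear address from a table index and a 32-bit offset.

    The base comes from the table rather than from the selector value itself.
    The mask keeps the result within 32 bits (an assumption of this sketch).
    """
    base = SEGMENT_BASES[index]
    return (base + offset) & 0xFFFFFFFF
```

With paging disabled this linear address would be used directly as the physical address; with paging enabled it would be passed on to the paging mechanism.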


4.3 Segmentation


4.3.1 SEGMENTATION INTRODUCTION 


Segmentation is one method of memory manage- 
ment. Segmentation provides the basis for protec- 
tion. Segments are used to encapsulate regions of 
memory which have common attributes. For exam- 
ple, all of the code of a given program could be con- 
tained in a segment, or an operating system table 
may reside in a segment. All information about a 
segment is stored in an 8 byte data structure called 
a descriptor. All of the descriptors in a system are 
contained in tables recognized by hardware. 
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4.3.2 TERMINOLOGY 


The following terms are used throughout the discus- 
sion of descriptors, privilege levels and protection: 


PL: Privilege Level—One of the four hierarchical 
privilege levels. Level 0 is the most privileged level 
and level 3 is the least privileged. More privileged 
levels are numerically smaller than less privileged 
levels. 


RPL: Requestor Privilege Level—The privilege level of the
original supplier of the selector. RPL is determined by
the two least significant bits of a selector.


DPL: Descriptor Privilege Level—This is the least
privileged level at which a task may access that
descriptor (and the segment associated with that
descriptor). Descriptor Privilege Level is determined by
bits 6:5 in the Access Right Byte of a descriptor.


CPL: Current Privilege Level—The privilege level at which
a task is currently executing, which equals the privilege
level of the code segment being executed. CPL can also be
determined by examining the lowest 2 bits of the CS
register, except for conforming code segments.


EPL: Effective Privilege Level—The effective privilege
level is the least privileged of the RPL and DPL. Since
smaller privilege level values indicate greater privilege,
EPL is the numerical maximum of RPL and DPL.


Task: One instance of the execution of a program. Tasks
are also referred to as processes.


4.3.3 DESCRIPTOR TABLES 


4.3.3.1 Descriptor Tables Introduction 


The descriptor tables define all of the segments which are
used in a 486 Microprocessor system. There are three types
of tables on the 486 Microprocessor which hold descriptors:
the Global Descriptor Table, Local Descriptor Table, and
the Interrupt Descriptor Table. All of the tables are
variable length memory arrays. They can range in size
between 8 bytes and 64 Kbytes. Each table can hold up to
8192 8-byte descriptors. The upper 13 bits of a selector
are used as an index into the descriptor table. The tables
have registers associated with them which hold the 32-bit
linear base address, and the 16-bit limit of each table.
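

The size figures above follow from simple arithmetic, sketched here. The sketch assumes the 16-bit limit holds the offset of the last valid byte in the table (so a full 64-Kbyte table has a limit of FFFFH); the function names are illustrative only.

```python
DESCRIPTOR_SIZE = 8  # bytes per descriptor


def descriptor_count(limit):
    """Number of whole descriptors that fit in a table with the given limit.

    Assumes `limit` is the offset of the last valid byte of the table.
    """
    return (limit + 1) // DESCRIPTOR_SIZE


def index_in_table(index, limit):
    """True if all 8 bytes of descriptor `index` lie within the table."""
    return index * DESCRIPTOR_SIZE + (DESCRIPTOR_SIZE - 1) <= limit
```

A 13-bit index can address 8192 entries, which matches the capacity of a 64-Kbyte table of 8-byte descriptors.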


Each of the tables has a register associated with it: the
GDTR, LDTR, and the IDTR (see Figure 4.3). The LGDT, LLDT,
and LIDT instructions load the base and limit of the
Global, Local, and Interrupt Descriptor Tables,
respectively, into the appropriate register. The SGDT,
SLDT, and SIDT instructions store the base and limit
values. These tables are manipulated by the operating
system. Therefore, the load descriptor table instructions
are privileged instructions.
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Figure 4.3. Descriptor Table Registers 


4.3.3.2 Global Descriptor Table 


The Global Descriptor Table (GDT) contains de- 
scriptors which are possibly available to all of the 
tasks in a system. The GDT can contain any type of 
segment descriptor except for descriptors which are 
used for servicing interrupts (i.e., interrupt and trap 
descriptors). Every 486 Microprocessor system con- 
tains a GDT. Generally the GDT contains code and 
data segments used by the operating systems and 
task state segments, and descriptors for the LDTs in 
a system.


The first slot of the Global Descriptor Table corre- 
sponds to the null selector and is not used. The null 
selector defines a null pointer value. 


4.3.3.3 Local Descriptor Table 


LDTs contain descriptors which are associated with a given
task. Generally, operating systems are designed so that
each task has a separate LDT. The LDT may contain only
code, data, stack, task gate, and call gate descriptors.
LDTs provide a mechanism for isolating a given task’s code
and data segments from the rest of the operating system,
while the GDT contains descriptors for segments which are
common to all tasks. A segment cannot be accessed by a task
if its segment descriptor does not exist in either the
current LDT or the GDT. This provides both isolation and
protection for a task’s segments, while still allowing
global data to be shared among tasks.


Unlike the 6 byte GDT or IDT registers which contain 
a base address and limit, the visible portion of the 
LDT register contains only a 16-bit selector. This se- 
lector refers to a Local Descriptor Table descriptor in 
the GDT. 


4.3.3.4 Interrupt Descriptor Table 


The third table needed for 486 Microprocessor systems is
the Interrupt Descriptor Table. (See Figure 4.4.) The IDT
contains the descriptors which point to the location of up
to 256 interrupt service routines. The IDT may contain only
task gates, interrupt gates, and trap gates. The IDT should
be at least 256 bytes in size in order to hold the
descriptors for the 32 Intel Reserved Interrupts. Every
interrupt used by a system must have an entry in the IDT.
The IDT entries are referenced via INT instructions,
external interrupt vectors, and exceptions. (See Section
2.7 Interrupts.)


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4.3.4 DESCRIPTORS 


4.3.4.1 Descriptor Attribute Bits 


The object to which the segment selector points is called
a descriptor. Descriptors are eight byte quantities which
contain attributes about a given region of linear address
space (i.e., a segment). These attributes include the
32-bit base linear address of the segment, the 20-bit
length and granularity of the segment, the protection
level, read, write or execute privileges, the default size
of the operands (16-bit or 32-bit), and the type of
segment. All of the attribute information about a segment
is contained in 12 bits in the segment descriptor. Figure
4.5 shows the general format of a descriptor. All segments
on the 486 Microprocessor have three attribute fields in
common: the P bit, the DPL field, and the S bit. The
Present (P) bit is 1 if the segment is loaded in physical
memory; if P=0 then any attempt to access this segment
causes a not present exception (exception 11). The
Descriptor Privilege Level (DPL) is a two-bit field which
specifies the protection level 0-3 associated with a
segment.


The 486 Microprocessor has two main categories of
segments: system segments and non-system segments (for
code and data). The segment S bit in the segment descriptor
determines if a given segment is a system segment or a
code or data segment. If the S bit is 1 then the segment is
either a code or data segment; if it is 0 then the segment
is a system segment.
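

The three common attribute fields can be pulled out of the Access Rights Byte with a few shifts and masks. The sketch below places DPL in bits 6:5 and S in bit 4 as stated elsewhere in this chapter, and assumes P occupies bit 7; the function name and the dictionary keys are illustrative.

```python
def decode_access_rights(byte):
    """Split an Access Rights Byte into its common fields.

    P is assumed to be bit 7; DPL is bits 6:5; S is bit 4. The low four
    bits hold the segment type information, whose meaning depends on S.
    """
    return {
        "P": (byte >> 7) & 1,
        "DPL": (byte >> 5) & 0b11,
        "S": (byte >> 4) & 1,
        "low4": byte & 0x0F,
    }


example = decode_access_rights(0x9A)
```

For the sample value 9AH this gives a present segment at DPL 0 with S=1, i.e. a code or data segment rather than a system segment.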


4.3.4.2 i486™ CPU Code, Data Descriptors (S=1)


Figure 4.6 shows the general format of a code and data
descriptor and Table 4.1 illustrates how the bits in the
Access Rights Byte are interpreted.
[Figure: general 8-byte descriptor layout, byte addresses 0 and +4, with SEGMENT BASE 15...0 and SEGMENT LIMIT 15...0 in the first four bytes]

BASE   Base Address of the segment
LIMIT  The length of the segment
P      Present Bit: 1 = Present, 0 = Not Present
DPL    Descriptor Privilege Level 0-3
S      Segment Descriptor: 0 = System Descriptor, 1 = Code or Data Segment Descriptor
TYPE   Type of Segment
A      Accessed Bit
G      Granularity Bit: 1 = Segment length is page granular, 0 = Segment length is byte granular
D      Default Operation Size (recognized in code segment descriptors only):
       1 = 32-bit segment, 0 = 16-bit segment
0      Bit must be zero (0) for compatibility with future processors
AVL    Available field for user or OS


NOTE:
In a maximum-size segment (i.e., a segment with G=1 and segment limit 19...0 = FFFFFH), the lowest 12 bits of the
segment base should be zero (i.e., segment base 11...0 = 000H).


Figure 4.5. Segment Descriptors
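

To make the descriptor fields concrete, the sketch below unpacks an 8-byte descriptor into base, limit and attribute bits. The byte positions used here are an assumption of the sketch, since the figure itself is not legible in this copy: limit 15...0 in bytes 0-1, base 15...0 in bytes 2-3, base 23...16 in byte 4, the Access Rights Byte in byte 5, G/D/0/AVL plus limit 19...16 in byte 6, and base 31...24 in byte 7.

```python
def parse_descriptor(raw):
    """Unpack an 8-byte segment descriptor (assumed little-endian layout)."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    limit = raw[0] | (raw[1] << 8) | ((raw[6] & 0x0F) << 16)   # 20-bit limit
    base = raw[2] | (raw[3] << 8) | (raw[4] << 16) | (raw[7] << 24)
    return {
        "base": base,
        "limit": limit,
        "access": raw[5],            # Access Rights Byte
        "G": (raw[6] >> 7) & 1,      # granularity
        "D": (raw[6] >> 6) & 1,      # default operation size
        "AVL": (raw[6] >> 4) & 1,    # available to user or OS
    }


# Illustrative bytes only: base 0, limit FFFFFH, G=1, D=1.
flat = parse_descriptor(bytes([0xFF, 0xFF, 0x00, 0x00, 0x00, 0x9A, 0xCF, 0x00]))
```

The sample bytes are made up for the example and describe a page-granular segment covering the whole linear address space.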


[Figure: code and data descriptor layout showing SEGMENT BASE 15...0, SEGMENT LIMIT 15...0, BASE 31...24 and the ACCESS RIGHTS BYTE]

D/B  1 = Default Instruction Attributes are 32-Bits
     0 = Default Instruction Attributes are 16-Bits
AVL  Available field for user or OS
G    Granularity Bit: 1 = Segment length is page granular
                      0 = Segment length is byte granular
0    Bit must be zero (0) for compatibility with future processors


Figure 4.6. Segment Descriptors
Table 4.1. Access Rights Byte Definition for Code and Data Descriptors

Bit
Position  Name                         Function
7         Present (P)                  P = 1   Segment is mapped into physical memory.
                                       P = 0   No mapping to physical memory exists, base and limit are
                                               not used.
6-5       Descriptor Privilege         Segment privilege attribute used in privilege tests.
          Level (DPL)
4         Segment Descriptor (S)       S = 1   Code or Data (includes stacks) segment descriptor.
                                       S = 0   System Segment Descriptor or Gate Descriptor.

If Data Segment (S = 1, E = 0):
3         Executable (E)               E = 0   Descriptor type is data segment.
2         Expansion Direction (ED)     ED = 0  Expand up segment, offsets must be ≤ limit.
                                       ED = 1  Expand down segment, offsets must be > limit.
1         Writeable (W)                W = 0   Data segment may not be written into.
                                       W = 1   Data segment may be written into.

If Code Segment (S = 1, E = 1):
3         Executable (E)               E = 1   Descriptor type is code segment.
2         Conforming (C)               C = 1   Code segment may only be executed when CPL ≥ DPL and
                                               CPL remains unchanged.
1         Readable (R)                 R = 0   Code segment may not be read.
                                       R = 1   Code segment may be read.

0         Accessed (A)                 A = 0   Segment has not been accessed.
                                       A = 1   Segment selector has been loaded into segment register
                                               or used by selector test instructions.


Code and data segments have several descriptor fields in
common. The accessed A bit is set whenever the processor
accesses a descriptor. The A bit is used by operating
systems to keep usage statistics on a given segment. The G
bit, or granularity bit, specifies if a segment length is
byte-granular or page-granular. 486 Microprocessor segments
can be one megabyte long with byte granularity (G=0) or
four gigabytes with page granularity (G=1), (i.e., 2^20
pages, each page 4 Kbytes in length). The granularity is
totally unrelated to paging. A 486 Microprocessor system
can consist of segments with byte granularity and page
granularity, whether or not paging is enabled.
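

The effect of the G bit on segment size can be sketched numerically. The sketch assumes that with page granularity the 20-bit limit is scaled by 4 Kbytes and the low 12 bits of the resulting offset are treated as ones, so the last page is fully included; the function name is illustrative.

```python
def last_valid_offset(limit20, g):
    """Highest valid byte offset for a 20-bit limit and granularity bit g."""
    if not 0 <= limit20 <= 0xFFFFF:
        raise ValueError("limit must fit in 20 bits")
    if g:
        # Page granular: limit counts 4-Kbyte pages (assumed scaling).
        return (limit20 << 12) | 0xFFF
    # Byte granular: limit is used as is.
    return limit20


max_byte_granular = last_valid_offset(0xFFFFF, 0) + 1   # size in bytes
max_page_granular = last_valid_offset(0xFFFFF, 1) + 1   # size in bytes
```

The two maximum sizes come out as one megabyte and four gigabytes, matching the figures quoted above.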


The executable E bit tells if a segment is a code or 
data segment. A code segment (E= 1, S= 1) may be 
execute-only or execute/read as determined by the 
Read R bit. Code segments are execute only if 
R=0, and execute/read if R=1. Code segments 
may never be written into. 


NOTE: 
Code segments may be modified via aliases. Alias- 
es are writeable data segments which occupy the 
same range of linear address space as the code 
segment. 
The D bit indicates the default length for operands and
effective addresses. If D=1 then 32-bit operands and
32-bit addressing modes are assumed. If D=0 then 16-bit
operands and 16-bit addressing modes are assumed.
Therefore all existing 80286 code segments will execute on
the 486 Microprocessor assuming the D bit is set to 0.


Another attribute of code segments is determined by 
the conforming C bit. Conforming segments, C= 1, 
can be executed and shared by programs at differ- 
ent privilege levels. (See Section 4.4 Protection.) 


Segments identified as data segments (E=0, S=1) are used
for two types of 486 Microprocessor segments: stack and
data segments. The expansion direction (ED) bit specifies
if a segment expands downward (stack) or upward (data). If
a segment is a stack segment all offsets must be greater
than the segment limit. On a data segment all offsets must
be less than or equal to the limit. In other words, stack
segments start at the base linear address plus the maximum
segment limit and grow down to the base linear address
plus the limit. On the other hand, data segments start at
the base linear address and expand to the base linear
address plus limit.


The write W bit controls the ability to write into a
segment. Data segments are read-only if W=0. The stack
segment must have W=1.


The B bit controls the size of the stack pointer regis- 
ter. If B= 1, then PUSHes, POPs, and CALLs all use 
the 32-bit ESP register for stack references and as- 
sume an upper limit of FFFFFFFFH. If B=0, stack 
instructions all use the 16-bit SP register and as- 
sume an upper limit of FFFFH. 
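

The offset rules for expand-up and expand-down segments, together with the upper bound selected by the B bit, can be sketched as a single check. This is a simplified model that tests one byte offset at a time and ignores operand width; the function and parameter names are illustrative.

```python
def offset_is_valid(offset, limit, expand_down=False, b=0):
    """Check one byte offset against a segment limit.

    Expand-up (data): valid offsets are 0..limit.
    Expand-down (stack): valid offsets are limit+1..upper bound, where the
    upper bound is FFFFFFFFH when b is 1 and FFFFH when b is 0.
    """
    if expand_down:
        upper = 0xFFFFFFFF if b else 0xFFFF
        return limit < offset <= upper
    return 0 <= offset <= limit
```

Note how the same limit value marks the top of an expand-up segment but the excluded bottom of an expand-down one.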


4.3.4.3 System Descriptor Formats 


System segments describe information about operating
system tables, tasks, and gates. Figure 4.7 shows the
general format of system segment descriptors, and the
various types of system segments. 486 Microprocessor
system descriptors contain a 32-bit base linear address and
a 20-bit segment limit. 80286 system descriptors have a
24-bit base address and a 16-bit segment limit. 80286
system descriptors are identified by the upper 16 bits
being all zero.


4.3.4.4 LDT Descriptors (S=0, TYPE = 2) 


LDT descriptors (S=0, TYPE=2) contain information about
Local Descriptor Tables. LDTs contain a table of segment
descriptors, unique to a particular task. Since the
instruction to load the LDTR is only available at
privilege level 0, the DPL field is ignored. LDT
descriptors are only allowed in the Global Descriptor
Table (GDT).


[Figure: system segment descriptor layout showing SEGMENT BASE 15...0; TYPE field values 0-7]

Type  Defines
0     Invalid
1     Available 80286 TSS
2     LDT
3     Busy 80286 TSS
4     80286 Call Gate
5     Task Gate (for 80286 or 486™ CPU Task)
6     80286 Interrupt Gate
7     80286 Trap Gate
4.3.4.5 TSS Descriptors (S=0, TYPE = 1, 3, 9, B)


A Task State Segment (TSS) descriptor contains information
about the location, size, and privilege level of a Task
State Segment (TSS). A TSS in turn is a special fixed
format segment which contains all the state information
for a task and a linkage field to permit nesting tasks. The
TYPE field is used to indicate whether the task is
currently BUSY (i.e., on a chain of active tasks) or the
TSS is available. The TYPE field also indicates if the
segment contains a 80286 or a 486 Microprocessor TSS. The
Task Register (TR) contains the selector which points to
the current Task State Segment.


4.3.4.6 Gate Descriptors (S=0, TYPE = 4-7, C, F)


Gates are used to control access to entry points within
the target code segment. The various types of gate
descriptors are call gates, task gates, interrupt gates,
and trap gates. Gates provide a level of indirection
between the source and destination of the control
transfer. This indirection allows the processor to
automatically perform protection checks. It also allows
system designers to control entry points to the operating
system. Call gates are used to change privilege levels
(see Section 4.4 Protection), task gates are used to
perform a task switch, and interrupt and trap gates are
used to specify interrupt service routines.


[Figure: system segment descriptor layout showing SEGMENT LIMIT 15...0; TYPE field values 8-F]

Type  Defines
8     Invalid
9     Available 486™ CPU TSS
A     Undefined (Intel Reserved)
B     Busy 486™ CPU TSS
C     486™ CPU Call Gate
D     Undefined (Intel Reserved)
E     486™ CPU Interrupt Gate
F     486™ CPU Trap Gate
Figure 4.8 shows the format of the four types of gate
descriptors. Call gates are primarily used to transfer
program control to a more privileged level. The call gate
descriptor consists of three fields: the access byte, a
long pointer (selector and offset) which points to the
start of a routine, and a word count which specifies how
many parameters are to be copied from the caller’s stack to
the stack of the called routine. The word count field is
only used by call gates when there is a change in the
privilege level; other types of gates ignore the word
count field.


Interrupt and trap gates use the destination selector 
and destination offset fields of the gate descriptor as 
a pointer to the start of the interrupt or trap handler 
routines. The difference between interrupt gates and 
trap gates is that the interrupt gate disables inter- 
rupts (resets the IF bit) while the trap gate does not. 


Task gates are used to switch tasks. Task gates may only
refer to a task state segment (see Section 4.4.6 Task
Switching). Therefore only the destination selector
portion of a task gate descriptor is used, and the
destination offset is ignored.


Exception 13 is generated when a destination selec- 
tor does not refer to a correct descriptor type, i.e., a 
code segment for an interrupt, trap or call gate, a 
TSS for a task gate. 
The access byte format is the same for all gate
descriptors. P=1 indicates that the gate contents are
valid. P=0 indicates the contents are not valid and causes
exception 11 if referenced. DPL is the descriptor privilege
level and specifies when this descriptor may be used by a
task (see Section 4.4 Protection). The S field, bit 4 of
the access rights byte, must be 0 to indicate a system
control descriptor. The type field specifies the
descriptor type as indicated in Figure 4.8.


4.3.4.7 Differences Between i486™ Microprocessor and 80286 Descriptors


In order to provide operating system compatibility between
the 80286 and 486 Microprocessor, the 486 Microprocessor
supports all of the 80286 segment descriptors. Figure 4.9
shows the general format of an 80286 system segment
descriptor. The only differences between 80286 and 486
Microprocessor descriptor formats are the values of the
type fields and the limit and base address fields, which
have been expanded for the 486 Microprocessor. The 80286
system segment descriptors contained a 24-bit base address
and 16-bit limit, while the 486 Microprocessor system
segment descriptors have a 32-bit base address, a 20-bit
limit field, and a granularity bit.


[Figure: gate descriptor layout showing SELECTOR, OFFSET 15...0 and OFFSET 31...16]

Gate Descriptor Fields

Name         Value            Description
TYPE         4                80286 call gate
             5                Task gate (for 80286 or 486™ CPU task)
             6                80286 interrupt gate
             7                80286 trap gate
             C                486™ CPU call gate
             E                486™ CPU interrupt gate
             F                486™ CPU trap gate
P            0                Descriptor contents are not valid
             1                Descriptor contents are valid
DPL                           Least privileged level at which a task may access the gate.
WORD COUNT   0-31             The number of parameters to copy from caller’s stack to the called
                              procedure’s stack. The parameters are 32-bit quantities for 486™ CPU
                              gates, and 16-bit quantities for 80286 gates.
DESTINATION  16-bit selector  Selector to the target code segment, or
SELECTOR                      selector to the target task state segment for task gate
DESTINATION  offset           Entry point within the target code segment
OFFSET       16-bit 80286
             32-bit 486™ CPU


Figure 4.8. Gate Descriptor Formats
By supporting 80286 system segments the 486 Microprocessor
is able to execute 80286 application programs on a 486
Microprocessor operating system. This is possible because
the processor automatically understands which descriptors
are 80286-style descriptors and which descriptors are 486
Microprocessor-style descriptors. In particular, if the
upper word of a descriptor is zero, then that descriptor is
a 80286-style descriptor.


The only other differences between 80286-style descriptors
and 486 Microprocessor descriptors are the interpretation
of the word count field of call gates and the B bit. The
word count field specifies the number of 16-bit quantities
to copy for 80286 call gates and 32-bit quantities for 486
Microprocessor call gates. The B bit controls the size of
PUSHes when using a call gate; if B=0 PUSHes are 16 bits,
if B=1 PUSHes are 32 bits.


4.3.4.8 Selector Fields


A selector in Protected Mode has three fields: Local or
Global Descriptor Table Indicator (TI), Descriptor Entry
Index (Index), and Requestor (the selector’s) Privilege
Level (RPL) as shown in Figure 4.10. The TI bit selects one
of two memory-based tables of descriptors (the Global
Descriptor Table or the Local Descriptor Table). The Index
selects one of 8K descriptors in the appropriate descriptor
table. The RPL bits allow high speed testing of the
selector’s privilege attributes.
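

The three selector fields can be separated with shifts and masks, as in the sketch below. It relies on the statements elsewhere in this chapter that the Index is the upper 13 bits and the RPL the two least significant bits, and it assumes the remaining bit (bit 2) is the table indicator, with 0 meaning GDT and 1 meaning LDT. The function name is illustrative.

```python
def decode_selector(selector):
    """Split a 16-bit selector into Index, TI and RPL fields."""
    selector &= 0xFFFF
    return {
        "index": selector >> 3,        # upper 13 bits
        "ti": (selector >> 2) & 1,     # assumed: 0 = GDT, 1 = LDT
        "rpl": selector & 0b11,        # two least significant bits
    }


fields = decode_selector(0x002B)
byte_offset = fields["index"] * 8      # offset of the 8-byte descriptor
```

Multiplying the index by the 8-byte descriptor size gives the descriptor's byte offset within the selected table.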


4.3.4.9 Segment Descriptor Cache


In addition to the selector value, every segment register
has a segment descriptor cache register associated with
it. Whenever a segment register’s contents are changed, the
8-byte descriptor associated with that selector is
automatically loaded (cached) on the chip. Once loaded, all
references to that segment use the cached descriptor
information instead of reaccessing the descriptor. The
contents of the descriptor cache are not visible to the
programmer. Since descriptor caches only change when a
segment register is changed, programs which modify the
descriptor tables must reload the appropriate segment
registers after changing a descriptor’s value.


[Figure: 80286 descriptor layout showing SEGMENT BASE 15...0 and SEGMENT LIMIT 15...0, with the upper word marked Intel Reserved]

BASE   Base Address of the segment
LIMIT  The length of the segment
DPL    Descriptor Privilege Level 0-3
S      System Descriptor: 0 = System, 1 = User
P      Present Bit: 1 = Present, 0 = Not Present
TYPE   Type of Segment


Figure 4.9. 80286 Code and Data Segment Descriptors
[Figure: a SELECTOR held in a SEGMENT REGISTER, with its low-order bits 4 3 2 1 0 marked, picking a DESCRIPTOR NUMBER in a descriptor table such as the GLOBAL DESCRIPTOR TABLE]


Figure 4.10. Example Descriptor Selection


4.3.4.10 Segment Descriptor Register Settings 


The contents of the segment descriptor cache vary
depending on the mode the 486 Microprocessor is operating
in. When operating in Real Address Mode, the segment base,
limit, and other attributes within the segment cache
registers are defined as shown in Figure 4.11. For
compatibility with the 8086 architecture, the base is set
to sixteen times the current selector value, the limit is
fixed at 0000FFFFH, and the attributes are fixed so as to
indicate the segment is present and fully usable. In Real
Address Mode, the internal “privilege level” is always
fixed to the highest level, level 0, so I/O and other
privileged opcodes may be executed.
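

The Real Address Mode cache settings amount to a simple address calculation, sketched below: the base is sixteen times the selector and offsets are bounded by the fixed limit of 0000FFFFH. The function name is illustrative, and the sketch raises an error where the processor would signal a segment overrun.

```python
REAL_MODE_LIMIT = 0xFFFF  # fixed limit in Real Address Mode


def real_mode_linear(selector, offset):
    """Linear address for selector:offset in Real Address Mode."""
    if not 0 <= offset <= REAL_MODE_LIMIT:
        raise ValueError("offset exceeds the fixed limit of FFFFH")
    base = (selector & 0xFFFF) * 16   # base = 16 x selector value
    return base + offset


addr = real_mode_linear(0x1234, 0x0010)
```

Different selector:offset pairs can therefore name the same linear address, since the bases overlap every sixteen bytes.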


[Figure: SEGMENT DESCRIPTOR CACHE REGISTER CONTENTS in Real Address Mode. Columns: 32-BIT BASE (updated during selector load into segment register), 32-BIT LIMIT (fixed), OTHER ATTRIBUTES (fixed).]

*Except the 32-bit CS base is initialized to FFFFF000H after reset until first intersegment control transfer (i.e., intersegment CALL, or
intersegment JMP, or INT). (See Figure 4.13 Example.)

Key: Y = yes
     N = no
     0 = privilege level 0
     1 = privilege level 1
     2 = privilege level 2
     3 = privilege level 3
     U = expand up
     D = expand down
     B = byte granularity
     P = page granularity
     W = push/pop 16-bit words
     F = push/pop 32-bit dwords
     - = does not apply to that segment cache register


Figure 4.11. Segment Descriptor Caches for Real Address Mode
(Segment Limit and Attributes are Fixed)


When operating in Protected Mode, the segment base, limit,
and other attributes within the segment cache registers
are defined as shown in Figure 4.12. In Protected Mode,
each of these fields is defined according to the contents
of the segment descriptor indexed by the selector value
loaded into the segment register.


[Figure: SEGMENT DESCRIPTOR CACHE REGISTER CONTENTS in Protected Mode. Columns: 32-BIT BASE, 32-BIT LIMIT and OTHER ATTRIBUTES, each updated during selector load into segment register (base per segment descriptor, limit per segment descriptor). Attributes: conforming privilege, stack size, executable, writeable, readable, expansion direction, granularity, accessed, privilege level.]

Key: Y = fixed yes
     N = fixed no
     d = per segment descriptor
     p = per segment descriptor; descriptor must indicate “present” to avoid exception 11
         (exception 12 in case of SS)
     r = per segment descriptor, but descriptor must indicate “readable” to avoid exception 13
         (special case for SS)
     w = per segment descriptor, but descriptor must indicate “writable” to avoid exception 13
         (special case for SS)
     - = does not apply to that segment cache register


Figure 4.12. Segment Descriptor Caches for Protected Mode (Loaded per Descriptor)


When operating in a Virtual 8086 Mode within the Protected
Mode, the segment base, limit, and other attributes within
the segment cache registers are defined as shown in Figure
4.13. For compatibility with the 8086 architecture, the
base is set to sixteen times the current selector value,
the limit is fixed at 0000FFFFH, and the attributes are
fixed so as to indicate the segment is present and fully
usable. The virtual program executes at lowest privilege
level, level 3, to allow trapping of all IOPL-sensitive
instructions and level-0-only instructions.
[Figure: SEGMENT DESCRIPTOR CACHE REGISTER CONTENTS in Virtual 8086 Mode. Columns: 32-BIT BASE (updated during selector load into segment register; 16x current selector), 32-BIT LIMIT (fixed at 0000FFFFH), OTHER ATTRIBUTES (fixed). Attributes: conforming privilege, stack size, executable, writeable, readable, expansion direction, granularity, accessed, privilege level.]

Key: Y = yes
     N = no
     0 = privilege level 0
     1 = privilege level 1
     2 = privilege level 2
     3 = privilege level 3
     U = expand up
     D = expand down
     B = byte granularity
     P = page granularity
     W = push/pop 16-bit words
     F = push/pop 32-bit dwords
     - = does not apply to that segment cache register


Figure 4.13. Segment Descriptor Caches for Virtual 8086 Mode within Protected Mode
(Segment Limit and Attributes are Fixed)


4.4 Protection


4.4.1 PROTECTION CONCEPTS


The 486 Microprocessor has four levels of protection which
are optimized to support the needs of a multi-tasking
operating system to isolate and protect user programs from
each other and the operating system. The privilege levels
control the use of privileged instructions, I/O
instructions, and access to segments and segment
descriptors. Unlike traditional microprocessor-based
systems where this protection is achieved only through the
use of complex external hardware and software, the 486
Microprocessor provides the protection as part of its
integrated Memory Management Unit. The 486 Microprocessor
offers an additional type of protection on a page basis,
when paging is enabled (see Section 4.5.3 Page Level
Protection).


The four-level hierarchical privilege system is
illustrated in Figure 4.14. It is an extension of the
user/supervisor privilege mode commonly used by
minicomputers and, in fact, the user/supervisor mode is
fully supported by the 486 Microprocessor paging
mechanism. The privilege levels (PL) are numbered 0 through
3. Level 0 is the most privileged or trusted level.


Figure 4.14. Four-Level Hierarchical Protection


4.4.2 RULES OF PRIVILEGE 


The 486 Microprocessor controls access to both data and
procedures between levels of a task, according to the
following rules.


• Data stored in a segment with privilege level p can be
accessed only by code executing at a privilege level at
least as privileged as p.


• A code segment/procedure with privilege level p can only
be called by a task executing at the same or a lesser
privilege level than p.
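

The two rules can be written as numeric comparisons, remembering that a smaller number means more privilege. The sketch below is a literal reading of the rules only; the function names are hypothetical and it does not model gates, conforming segments or RPL.

```python
def may_access_data(cpl, data_pl):
    """Rule 1: code must be at least as privileged as the data segment."""
    return cpl <= data_pl


def may_call(cpl, code_pl):
    """Rule 2: caller must be at the same or a lesser privilege level."""
    return cpl >= code_pl
```

So level 0 code may read level 3 data, and level 3 code may call a level 1 procedure, but not the other way round in either case.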


4.4.3 PRIVILEGE LEVELS 


4.4.3.1 Task Privilege 


At any point in time, a task on the 486 Microproces- 
sor always executes at one of the four privilege lev- 
els. The Current Privilege Level (CPL) specifies the 
task’s privilege level. A task’s CPL may only be 
changed by control transfers through gate descrip- 
tors to a code segment with a different privilege lev- 
el. (See Section 4.4.4 Privilege Level Transfers) 
Thus, an application program running at PL = 3 may 
call an operating system routine at PL = 1 (via a 
gate) which would cause the task’s CPL to be set to 
1 until the operating system routine was finished. 


4.4.3.2 Selector Privilege (RPL) 


The privilege level of a selector is specified by the RPL field. The
RPL is the two least significant bits of the selector. The selector’s
RPL is only used to establish a less trusted privilege level than the
current privilege level for the use of a segment. This level is
called the task’s effective privilege level (EPL). The EPL is defined
as being the least privileged (i.e. numerically larger) level of a
task’s CPL and a selector’s RPL. Thus, if a selector’s RPL = 0 then
the CPL always specifies the privilege level for making an access
using the selector. On the other hand, if RPL = 3 then a selector can
only access segments at level 3 regardless of the task’s CPL. The RPL
is most commonly used to verify that pointers passed to an operating
system procedure do not access data that is of higher privilege than
the procedure that originated the pointer. Since the originator of a
selector can specify any RPL value, the Adjust RPL (ARPL) instruction
is provided to force the RPL bits to the originator’s CPL.
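

As a small worked illustration of the EPL definition, the sketch below extracts the RPL from the two low bits of a selector and takes the numerically larger of CPL and RPL. The helper names are hypothetical.

```python
def rpl_of(selector: int) -> int:
    """The RPL is the two least significant bits of the selector."""
    return selector & 0b11


def effective_privilege_level(cpl: int, selector: int) -> int:
    """EPL is the least privileged (numerically larger) of CPL and the selector's RPL."""
    return max(cpl, rpl_of(selector))


if __name__ == "__main__":
    # RPL = 0: the CPL alone decides.
    print(effective_privilege_level(cpl=1, selector=0x0008))
    # RPL = 3: the access is treated as level 3 even from CPL 0.
    print(effective_privilege_level(cpl=0, selector=0x000B))
```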


4.4.3.3 I/O Privilege and I/O Permission Bitmap


The I/O privilege level (IOPL, a 2-bit field in the EFLAGS register)
defines the least privileged level at which I/O instructions can be
unconditionally performed. I/O instructions can be unconditionally
performed when CPL ≤ IOPL. (The I/O instructions are IN, OUT, INS,
OUTS, REP INS, and REP OUTS.) When CPL > IOPL, and the current task
is associated with a 286 TSS, attempted I/O instructions cause an
exception 13 fault. When CPL > IOPL, and the current task is
associated with a 486 Microprocessor TSS, the I/O Permission Bitmap
(part of a 486 Microprocessor TSS) is consulted on whether I/O to the
port is allowed, or an exception 13 fault is to be generated instead.
For diagrams of the I/O Permission Bitmap, refer to Figures 4.15a and
4.15b. For further information on how the I/O Permission Bitmap is
used in Protected Mode or in Virtual 8086 Mode, refer to Section
4.6.4 Protection and I/O Permission Bitmap.
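

The decision described above can be sketched as a short function. This is an illustration under stated assumptions, not the processor's exact algorithm: the unconditional case is taken as CPL numerically no greater than IOPL, a clear bitmap bit is taken to mean the port is accessible, ports beyond the end of the bitmap are treated as denied, and only single-byte port accesses are modelled. The name `io_allowed` and its parameters are hypothetical.

```python
def io_allowed(cpl: int, iopl: int, port: int,
               tss_is_486: bool, io_bitmap: bytes) -> bool:
    """Return True if the I/O access proceeds, False if exception 13 would be raised."""
    if cpl <= iopl:
        return True               # unconditional I/O at this privilege level
    if not tss_is_486:
        return False              # 286 TSS has no bitmap: exception 13
    byte_index, bit_index = divmod(port, 8)
    if byte_index >= len(io_bitmap):
        return False              # assumption: ports past the bitmap are denied
    # Assumption: bit = 0 grants access, bit = 1 denies it.
    return (io_bitmap[byte_index] >> bit_index) & 1 == 0


if __name__ == "__main__":
    bitmap = bytes([0b11111011])  # only port 2 is open in this tiny bitmap
    print(io_allowed(3, 0, 2, True, bitmap))
    print(io_allowed(3, 0, 3, True, bitmap))
```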


The I/O privilege level (IOPL) also affects whether several other
instructions can be executed or cause an exception 13 fault instead.
These instructions are called “IOPL-sensitive” instructions and they
are CLI and STI. (Note that the LOCK prefix is not IOPL-sensitive on
the 486 Microprocessor.)


The IOPL also affects whether the IF (interrupt enable flag) bit can
be changed by loading a value into the EFLAGS register. When
CPL ≤ IOPL, the IF bit can be changed by loading a new value into
the EFLAGS register. When CPL > IOPL, the IF bit cannot be changed by
a new value POP’ed into (or otherwise loaded into) the EFLAGS
register; the IF bit merely remains unchanged and no exception is
generated.
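

The silent preservation of IF can be sketched as a bit-mask operation. The sketch models only the IF behaviour described here (IF is bit 9 of EFLAGS); other protected EFLAGS fields are deliberately ignored, and the function name `load_eflags` is hypothetical.

```python
IF_MASK = 1 << 9  # interrupt enable flag, bit 9 of EFLAGS


def load_eflags(old: int, new: int, cpl: int, iopl: int) -> int:
    """Model a POPF-style load: IF follows the new value only when CPL <= IOPL."""
    if cpl <= iopl:
        return new
    # Not privileged enough: keep the old IF bit, take everything else, no fault.
    return (new & ~IF_MASK) | (old & IF_MASK)


if __name__ == "__main__":
    print(hex(load_eflags(old=0x200, new=0x000, cpl=3, iopl=0)))  # IF stays set
    print(hex(load_eflags(old=0x200, new=0x000, cpl=0, iopl=0)))  # IF is cleared
```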


Table 4.2. Pointer Test Instructions

Instruction | Operands            | Function
ARPL        | Selector, Register  | Adjust Requested Privilege Level: adjusts the
            |                     | RPL of the selector to the numeric maximum of
            |                     | current selector RPL value and the RPL value
            |                     | in the register. Set zero flag if selector
            |                     | RPL was changed.
VERR        | Selector            | VERify for Read: sets the zero flag if the
            |                     | segment referred to by the selector can be
            |                     | read.
VERW        | Selector            | VERify for Write: sets the zero flag if the
            |                     | segment referred to by the selector can be
            |                     | written.
LSL         | Register, Selector  | Load Segment Limit: reads the segment limit
            |                     | into the register if privilege rules and
            |                     | descriptor type allow. Set zero flag if
            |                     | successful.
LAR         | Register, Selector  | Load Access Rights: reads the descriptor
            |                     | access rights byte into the register if
            |                     | privilege rules allow. Set zero flag if
            |                     | successful.


4.4.3.4 Privilege Validation


The 486 Microprocessor provides several instructions to speed pointer
testing and help maintain system integrity by verifying that the
selector value refers to an appropriate segment. Table 4.2 summarizes
the selector validation procedures available for the 486
Microprocessor.

This pointer verification prevents the common problem of an
application at PL = 3 calling an operating system routine at PL = 0
and passing the operating system routine a “bad” pointer which
corrupts a data structure belonging to the operating system. If the
operating system routine uses the ARPL instruction to ensure that the
RPL of the selector has no greater privilege than that of the caller,
then this problem can be avoided.
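

The ARPL adjustment in Table 4.2 amounts to raising the selector's RPL to the numeric maximum of the two RPL fields and reporting whether anything changed. A minimal sketch, with the hypothetical helper `arpl` returning the adjusted selector and the resulting zero flag:

```python
def arpl(selector: int, register: int) -> tuple[int, bool]:
    """Return (adjusted selector, zero flag) following the Table 4.2 description."""
    new_rpl = max(selector & 0b11, register & 0b11)
    adjusted = (selector & ~0b11) | new_rpl
    zero_flag = adjusted != selector  # set only if the selector's RPL was changed
    return adjusted, zero_flag


if __name__ == "__main__":
    # A level 3 caller (register holds its CS, RPL = 3) passed a selector with RPL = 0.
    adjusted, zf = arpl(selector=0x0010, register=0x0023)
    print(hex(adjusted), zf)
```

In an operating system routine, the register operand would typically hold the caller's code segment selector, so the pointer can never be used with more privilege than the caller has.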


4.4.3.5 Descriptor Access 


There are basically two types of segment accesses: 
those involving code segments such as control 
transfers, and those involving data accesses. Deter- 
mining the ability of a task to access a segment in- 
volves the type of segment to be accessed, the in- 
struction used, the type of descriptor used and CPL, 
RPL, and DPL as described above. 


Any time an instruction loads data segment registers 
(DS, ES, FS, GS) the 486 Microprocessor makes 
protection validation checks. Selectors loaded in the 
DS, ES, FS, GS registers must refer only to data 
segments or readable code segments. The data ac- 
cess rules are specified in Section 4.4.2 Rules of 
Privilege. The only exception to those rules is read- 
able conforming code segments which can be ac- 
cessed at any privilege level. 


Finally the privilege validation checks are performed. The CPL is
compared to the EPL and if the EPL is more privileged than the CPL an
exception 13 (general protection fault) is generated.

The rules regarding the stack segment are slightly different than
those involving data segments. Instructions that load selectors into
SS must refer to data segment descriptors for writeable data
segments. The DPL and RPL must equal the CPL. All other descriptor
types or a privilege level violation will cause exception 13. A stack
not present fault causes exception 12. Note that an exception 11 is
used for a not-present code or data segment.
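

The stack segment rules can be summarised in a few lines. This is a sketch only: the order in which the type, privilege and present checks are made is an assumption, and the function name and parameters are hypothetical.

```python
from typing import Optional


def check_ss_load(cpl: int, rpl: int, dpl: int,
                  is_writable_data: bool, present: bool) -> Optional[int]:
    """Return the exception number raised by loading SS, or None if the load succeeds."""
    if not is_writable_data:
        return 13                 # wrong descriptor type
    if rpl != cpl or dpl != cpl:
        return 13                 # DPL and RPL must both equal CPL
    if not present:
        return 12                 # stack segment not present
    return None


if __name__ == "__main__":
    print(check_ss_load(cpl=3, rpl=3, dpl=3, is_writable_data=True, present=True))
    print(check_ss_load(cpl=3, rpl=3, dpl=0, is_writable_data=True, present=True))
```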


4.4.4 PRIVILEGE LEVEL TRANSFERS


Inter-segment control transfers occur when a selector is loaded in
the CS register. For a typical system most of these transfers are
simply the result of a call or a jump to another routine. There are
five types of control transfers which are summarized in Table 4.3.
Many of these transfers result in a privilege level transfer.
Changing privilege levels is done only via control transfers, by
using gates, task switches, and interrupt or trap gates.


Table 4.3. Descriptor Types Used for Control Transfer

Control Transfer Types               | Operation Types          | Descriptor Referenced  | Descriptor Table
Intersegment within the same         | JMP, CALL, RET, IRET*    | Code Segment           | GDT/LDT
privilege level                      |                          |                        |
Intersegment to the same or higher   | CALL                     | Call Gate              | GDT/LDT
privilege level; Interrupt within    | Interrupt Instruction,   | Trap or Interrupt Gate | IDT
task may change CPL                  | Exception, External      |                        |
                                     | Interrupt                |                        |
Intersegment to a lower privilege    | RET, IRET*               | Code Segment           | GDT/LDT
level (changes task CPL)             |                          |                        |
Task Switch                          | CALL, JMP                | Task State Segment     | GDT
                                     | CALL, JMP                | Task Gate              | GDT/LDT
                                     | IRET**, Interrupt        | Task Gate              | IDT
                                     | Instruction, Exception,  |                        |
                                     | External Interrupt       |                        |

*NT (Nested Task bit of flag register) = 0
**NT (Nested Task bit of flag register) = 1


Control transfers can only occur if the operation which loaded the
selector references the correct descriptor type. Any violation of
these descriptor usage rules will cause an exception 13 (e.g. JMP
through a call gate, or IRET from a normal subroutine call).

In order to provide further system security, all control transfers
are also subject to the privilege rules.

The privilege rules require that:

— Privilege level transitions can only occur via gates.
— JMPs can be made to a non-conforming code segment with the same
  privilege or to a conforming code segment with greater or equal
  privilege.
— CALLs can be made to a non-conforming code segment with the same
  privilege or via a gate to a more privileged level.
— Interrupts handled within the task obey the same privilege rules
  as CALLs.
— Conforming Code segments are accessible by privilege levels which
  are the same or less privileged than the conforming-code segment’s
  DPL.
— Both the requested privilege level (RPL) in the selector pointing
  to the gate and the task’s CPL must be of equal or greater
  privilege than the gate’s DPL.
— The code segment selected in the gate must be the same or more
  privileged than the task’s CPL.
— Return instructions that do not switch tasks can only return
  control to a code segment with same or less privilege.
— Task switches can be performed by a CALL, JMP, or INT which
  references either a task gate or task state segment whose DPL is
  less privileged or the same privilege as the old task’s CPL.

Any control transfer that changes CPL within a task causes a change
of stacks as a result of the privilege level change. The initial
values of SS:ESP for privilege levels 0, 1, and 2 are retained in the
task state segment (see Section 4.4.6 Task Switching). During a JMP
or CALL control transfer, the new stack pointer is loaded into the SS
and ESP registers and the previous stack pointer is pushed onto the
new stack.

When RETurning to the original privilege level, use of the
lower-privileged stack is restored as part of the RET or IRET
instruction operation. For subroutine calls that pass parameters on
the stack and cross privilege levels, a fixed number of words (as
specified in the gate’s word count field) are copied from the
previous stack to the current stack. The inter-segment RET
instruction with a stack adjustment value will correctly restore the
previous stack pointer upon return.


Figure 4.15a. i486™ Microprocessor TSS and TSS Registers


I/O Ports Accessible: 2 → 9, 12, 13, 15, 20 → 24, 27, 33, 34, 40,
41, 48, 50, 52, 53, 58 → 60, 62, 63, 96 → 127


Figure 4.15b. Sample I/O Permission Bit Map


4.4.5 CALL GATES 


Gates provide protected, indirect CALLs. One of the 
major uses of gates is to provide a secure method of 
privilege transfers within a task. Since the operating 
system defines all of the gates in a system, it can 
ensure that all gates only allow entry into a few trust- 
ed procedures (such as those which allocate memo- 
ry, or perform I/O). 


Gate descriptors follow the data access rules of privilege; that is,
gates can be accessed by a task if the EPL is equal to or more
privileged than the gate descriptor’s DPL. Gates follow the control
transfer rules of privilege and therefore may only transfer control
to a more privileged level.


Call Gates are accessed via a CALL instruction and 
are syntactically identical to calling a normal subrou- 
tine. When an inter-level 486 Microprocessor call 
gate is activated, the following actions occur. 


1. Load CS:EIP from the gate and check for validity
2. SS is pushed zero-extended to 32 bits
3. ESP is pushed
4. Copy Word Count 32-bit parameters from the old stack to the new
   stack
5. Push Return address on stack


The procedure is identical for 80286 Call gates, ex- 
cept that 16-bit parameters are copied and 16-bit 
registers are pushed. 
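

The stack contents produced by an inter-level call can be sketched with a plain list standing in for the new stack. This is an illustration under stated assumptions: the old stack is given with its topmost entry first, parameters are copied so that their relative order is preserved, and the return address is modelled as the caller's CS followed by EIP. The function `call_through_gate` is hypothetical.

```python
def call_through_gate(old_stack: list[int], old_ss: int, old_esp: int,
                      word_count: int, ret_cs: int, ret_eip: int) -> list[int]:
    """Return the new stack as a list in push order (last element is the new top)."""
    new_stack: list[int] = []
    new_stack.append(old_ss & 0xFFFF)    # step 2: SS, zero-extended to 32 bits
    new_stack.append(old_esp)            # step 3: ESP
    params = old_stack[:word_count]      # topmost word_count entries of the old stack
    for value in reversed(params):       # step 4: deepest parameter is pushed first
        new_stack.append(value)
    new_stack.append(ret_cs & 0xFFFF)    # step 5: return address (CS, then EIP)
    new_stack.append(ret_eip)
    return new_stack


if __name__ == "__main__":
    stack = call_through_gate([0xA, 0xB, 0xC], old_ss=0x23, old_esp=0x1000,
                              word_count=2, ret_cs=0x1B, ret_eip=0x400)
    print([hex(v) for v in stack])
```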


Interrupt Gates and Trap gates work in a similar 
fashion as the call gates, except there is no copying 
of parameters. The only difference between Trap 
and Interrupt gates is that control transfers through 
an Interrupt gate disable further interrupts (i.e. the IF 
bit is set to 0), and Trap gates leave the interrupt 
status unchanged. 


4.4.6 TASK SWITCHING 


A very important attribute of any multi-tasking/multi-user operating
system is its ability to rapidly switch between tasks or processes.
The 486 Microprocessor directly supports this operation by providing
a task switch instruction in hardware. The 486 Microprocessor task
switch operation saves the entire state of the machine (all of the
registers, address space, and a link to the previous task), loads a
new execution state, performs protection checks, and commences
execution in the new task, in about 10 microseconds. Like transfer of
control via gates, the task switch operation is invoked by executing
an inter-segment JMP or CALL instruction which refers to a Task State
Segment (TSS), or a task gate descriptor in the GDT or LDT. An INT n
instruction, exception, trap, or external interrupt may also invoke
the task switch operation if there is a task gate descriptor in the
associated IDT descriptor slot.


The TSS descriptor points to a segment (see Figure 4.15) containing
the entire 486 Microprocessor execution state while a task gate
descriptor contains a TSS selector. The 486 Microprocessor supports
both 80286 and 486 Microprocessor style TSSs. Figure 4.16 shows an
80286 TSS. The limit of a 486 Microprocessor TSS must be greater than
0064H (002BH for an 80286 TSS), and can be as large as 4 Gigabytes.
In the additional TSS space, the operating system is free to store
additional information such as the reason the task is inactive, time
the task has spent running, and open files belonging to the task.


Each task must have a TSS associated with it. The 
current TSS is identified by a special register in the 
486 Microprocessor called the Task State Segment 
Register (TR). This register contains a selector refer- 
ring to the task state segment descriptor that de- 
fines the current TSS. A hidden base and limit regis- 
ter associated with TR are loaded whenever TR is 
loaded with a new selector. Returning from a task is 
accomplished by the IRET instruction. When IRET is 
executed, control is returned to the task which was 
interrupted. The current executing task’s state is 
saved in the TSS and the old task state is restored 
from its TSS. 


Several bits in the flag register and machine status word (CR0) give
information about the state of a task which are useful to the
operating system. The Nested Task (NT) bit (bit 14 in EFLAGS)
controls the function of the IRET instruction. If NT = 0, the IRET
instruction performs the regular return; when NT = 1, IRET performs a
task switch operation back to the previous task. The NT bit is set or
reset in the following fashion:

When a CALL or INT instruction initiates a task switch, the new TSS
will be marked busy and the back link field of the new TSS set to the
old TSS selector. The NT bit of the new task is set by CALL or INT
initiated task switches. An interrupt that does not cause a task
switch will clear NT. (The NT bit will be restored after execution of
the interrupt handler.) NT may also be set or cleared by POPF or IRET
instructions.

The 486 Microprocessor task state segment is marked busy by changing
the descriptor type field from TYPE 9H to TYPE BH. An 80286 TSS is
marked busy by changing the descriptor type field from TYPE 1 to
TYPE 3. Use of a selector that references a busy task state segment
causes an exception 13.

The Virtual Mode (VM) bit 17 is used to indicate if a task is a
virtual 8086 task. If VM = 1, then the task will use the Real Mode
addressing mechanism. The virtual 8086 environment is only entered
and exited via a task switch (see Section 4.6 Virtual Mode).

The FPU’s state is not automatically saved when a task switch occurs,
because the incoming task may not use the FPU. The Task Switched (TS)
bit (bit 3 in CR0) helps deal with the FPU’s state in a multi-tasking
environment. Whenever the 486 Microprocessor switches tasks, it sets
the TS bit. The 486 Microprocessor detects the first use of a
processor extension instruction after a task switch and causes the
processor extension not available exception 7. The exception handler
for exception 7 may then decide whether to save the state of the FPU.
A processor extension not present exception (7) will occur when
attempting to execute a Floating Point or WAIT instruction if the
Task Switched and Monitor coprocessor extension bits are both set
(i.e. TS = 1 and MP = 1).

The T bit in the 486 Microprocessor TSS indicates that the processor
should generate a debug exception when switching to a task. If T = 1
then upon entry to a new task a debug exception 1 will be generated.


Figure 4.16. 80286 TSS


4.4.7 INITIALIZATION AND TRANSITION TO PROTECTED MODE


Since the 486 Microprocessor begins executing in Real Mode
immediately after RESET, it is necessary to initialize the system
tables and registers with the appropriate values.

The GDT and IDT registers must refer to a valid GDT and IDT. The IDT
should be at least 256 bytes long, and the GDT must contain
descriptors for the initial code and data segments. Figure 4.17 shows
the tables and Figure 4.18 the descriptors needed for a simple
Protected Mode 486 Microprocessor system. It has a single code and
single data/stack segment, each four gigabytes long, and a single
privilege level PL = 0.


The actual method of enabling Protected Mode is to load CR0 with the
PE bit set, via the MOV CR0, R/M instruction. This puts the 486
Microprocessor in Protected Mode.


After enabling Protected Mode, the next instruction 
should execute an intersegment JMP to load the CS 
register and flush the instruction decode queue. The 
final step is to load all of the data segment registers 
with the initial selector values. 
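

To make the simple single-level system concrete, the sketch below builds a three-entry GDT image (null, code, data) and computes a CR0 value with PE set. It rests on assumptions not stated in this section: the 8-byte segment descriptor layout described earlier in the data book, a null descriptor in entry 0, access bytes 9AH (present, DPL 0, readable code) and 92H (present, DPL 0, writeable data), page granularity with a 32-bit default size, and PE as bit 0 of CR0. All names are hypothetical, and the code only prepares values; it does not load any register.

```python
import struct


def encode_descriptor(base: int, limit: int, access: int, flags: int) -> bytes:
    """Pack a segment descriptor into its 8-byte little-endian memory image."""
    value = (
        (limit & 0xFFFF)                      # limit bits 15..0
        | ((base & 0xFFFFFF) << 16)           # base bits 23..0
        | ((access & 0xFF) << 40)             # access rights byte
        | (((limit >> 16) & 0xF) << 48)       # limit bits 19..16
        | ((flags & 0xF) << 52)               # G, D and the remaining flag bits
        | (((base >> 24) & 0xFF) << 56)       # base bits 31..24
    )
    return struct.pack("<Q", value)


NULL_DESCRIPTOR = bytes(8)
# Flat 4 Gbyte segments: limit FFFFFH pages of 4 Kbytes, base 0, PL = 0.
CODE_DESCRIPTOR = encode_descriptor(0, 0xFFFFF, 0x9A, 0xC)
DATA_DESCRIPTOR = encode_descriptor(0, 0xFFFFF, 0x92, 0xC)
GDT_IMAGE = NULL_DESCRIPTOR + CODE_DESCRIPTOR + DATA_DESCRIPTOR

CR0_PE = 1 << 0


def enable_protected_mode(cr0: int) -> int:
    """Return the CR0 value that would be written to enter Protected Mode."""
    return cr0 | CR0_PE


if __name__ == "__main__":
    print(CODE_DESCRIPTOR.hex(), DATA_DESCRIPTOR.hex(), len(GDT_IMAGE))
```

With this layout the code and data selectors would be 0008H and 0010H; the intersegment JMP that follows the CR0 write would use the first and the data segment registers the second.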


An alternate approach to entering Protected Mode, which is especially
appropriate for multi-tasking operating systems, is to use the
built-in task switch to load all of the registers. In this case the
GDT would contain two TSS descriptors in addition to the code and
data descriptors needed for the first task. The first JMP instruction
in Protected Mode would jump to the TSS, causing a task switch and
loading all of the registers with the values stored in the TSS. The
Task State Segment Register should be initialized to point to a valid
TSS descriptor since a task switch saves the state of the current
task in a task state segment.
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Figure 4.17. Simple Protected System 
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Figure 4.18. GDT Descriptors for Simple System 


4.4.8 TOOLS FOR BUILDING PROTECTED 
SYSTEMS 


In order to simplify the design of a protected multi-tasking system,
Intel provides a tool which allows the system designer an easy method
of constructing the data structures needed for a Protected Mode 486
Microprocessor system. This tool is the builder BLD-386™. BLD-386
lets the operating system writer specify all of the segment
descriptors discussed in the previous sections (LDTs, IDTs, GDTs,
Gates, and TSSs) in a high-level language.
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4.5 Paging 


4.5.1 PAGING CONCEPTS 


Paging is another type of memory management useful for virtual memory
multitasking operating systems. Unlike segmentation, which
modularizes programs and data into variable length segments, paging
divides programs into multiple uniform size pages. Pages bear no
direct relation to the logical structure of a program. While segment
selectors can be considered the logical “name” of a program module or
data structure, a page most likely corresponds to only a portion of a
module or data structure.

By taking advantage of the locality of reference displayed by most
programs, only a small number of pages from each active task need be
in memory at any one moment.


4.5.2 PAGING ORGANIZATION 


4.5.2.1 Page Mechanism 


The 486 Microprocessor uses two levels of tables to translate the
linear address (from the segmentation unit) into a physical address.
There are three components to the paging mechanism of the 486
Microprocessor: the page directory, the page tables, and the page
itself (page frame). All memory-resident elements of the 486
Microprocessor paging mechanism are the same size, namely, 4 Kbytes.
A uniform size for all of the elements simplifies memory allocation
and reallocation schemes, since there is no problem with memory
fragmentation. Figure 4.19 shows how the paging mechanism works.


4.5.2.2 Page Descriptor Base Register


CR2 is the Page Fault Linear Address register. It holds the 32-bit
linear address which caused the last page fault detected.

CR3 is the Page Directory Physical Base Address Register. It contains
the physical starting address of the Page Directory. The lower 12
bits of CR3 are always zero to ensure that the Page Directory is
always page aligned. Loading it via a MOV CR3, reg instruction causes
the Page Table Entry cache to be flushed, as will a task switch
through a TSS which changes the value of CR3. (See 4.5.5 Translation
Lookaside Buffer).


4.5.2.3 Page Directory 


The Page Directory is 4 Kbytes long and allows up to 1024 Page
Directory Entries. Each Page Directory Entry contains the address of
the next level of tables, the Page Tables, and information about the
page table. The contents of a Page Directory Entry are shown in
Figure 4.20. The upper 10 bits of the linear address (A22-A31) are
used as an index to select the correct Page Directory Entry.


TWO LEVEL PAGING SCHEME 
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Figure 4.20. Page Directory Entry (Points to Page Table) 
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PAGE FRAME ADDRESS 31..12 
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Figure 4.21. Page Table Entry (Points to Page) 


4.5.2.4 Page Tables 


Each Page Table is 4 Kbytes and holds up to 1024 
Page Table Entries. Page Table Entries contain the 
starting address of the page frame and statistical 
information about the page (see Figure 4.21). Ad- 
dress bits A12-A21 are used as an index to select 
one of the 1024 Page Table Entries. The 20 upper- 
bit page frame address is concatenated with the 
lower 12 bits of the linear address to form the physi- 
cal address. Page tables can be shared between 
tasks and swapped to disks. 
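

The two-level lookup can be sketched as follows. The sketch assumes 4-byte entries (1024 entries in a 4 Kbyte table), treats bit 0 of each entry as the Present bit, and reads memory through a caller-supplied function; it does not model the TLB or the Accessed and Dirty updates. The names `split_linear`, `translate` and `read_dword` are hypothetical.

```python
from typing import Callable


def split_linear(linear: int) -> tuple[int, int, int]:
    """Split a 32-bit linear address into directory index, table index and offset."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def translate(linear: int, cr3: int, read_dword: Callable[[int], int]) -> int:
    """Walk the page directory and page table and return the physical address."""
    dir_index, table_index, offset = split_linear(linear)
    pde = read_dword((cr3 & 0xFFFFF000) + 4 * dir_index)
    if not pde & 1:
        raise LookupError("page fault: page table not present")
    pte = read_dword((pde & 0xFFFFF000) + 4 * table_index)
    if not pte & 1:
        raise LookupError("page fault: page not present")
    # The 20-bit page frame address is concatenated with the low 12 bits.
    return (pte & 0xFFFFF000) | offset


if __name__ == "__main__":
    # Toy memory: directory at 1000H, one page table at 2000H, one page at 5000H.
    memory = {0x1000 + 4 * 1: 0x2001, 0x2000 + 4 * 2: 0x5001}
    linear = (1 << 22) | (2 << 12) | 0x034
    print(hex(translate(linear, cr3=0x1000, read_dword=memory.__getitem__)))
```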


4.5.2.5 Page Directory/Table Entries 


The lower 12 bits of the Page Table Entries and Page Directory
Entries contain statistical information about pages and page tables
respectively. The P (Present) bit 0 indicates if a Page Directory or
Page Table entry can be used in address translation. If P = 1 the
entry can be used for address translation; if P = 0 the entry cannot
be used for translation, and all of the other bits are available for
use by the software. For example, the remaining 31 bits could be used
to indicate where on the disk the page is stored.


The A (Accessed) bit 5 is set by the 486 Microprocessor for both
types of entries before a read or write access occurs to an address
covered by the entry. The D (Dirty) bit 6 is set to 1 before a write
to an address covered by that page table entry occurs. The D bit is
undefined for Page Directory Entries. When the P, A and D bits are
updated by the 486 Microprocessor, the processor generates a
Read-Modify-Write cycle which locks the bus and prevents conflicts
with other processors or peripherals. Software which modifies these
bits should use the LOCK prefix to ensure the integrity of the page
tables in multi-master systems.


The 3 bits marked OS Reserved in Figure 4.20 and 
Figure 4.21 (bits 9-11) are software definable. OSs 
are free to use these bits for whatever purpose they 
wish. An example use of the OS Reserved bits 
would be to store information about page aging. By 
keeping track of how long a page has been in mem- 
ory since being accessed, an operating system can 
implement a page replacement algorithm like Least 
Recently Used. 
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The (User/Supervisor) U/S bit 2 and the (Read/ 
Write) R/W bit 1 are used to provide protection attri- 
butes for individual pages. 
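

The bit positions named in this section can be collected into a small decoder. This is a sketch only; the function `decode_entry` and its dictionary keys are hypothetical, and only the fields described in this section (P, R/W, U/S, A, D, the OS Reserved bits 9-11 and the frame address in bits 31..12) are decoded.

```python
def decode_entry(entry: int) -> dict:
    """Decode the fields of a Page Directory Entry or Page Table Entry."""
    return {
        "present": bool(entry & (1 << 0)),      # P, bit 0
        "writable": bool(entry & (1 << 1)),     # R/W, bit 1
        "user": bool(entry & (1 << 2)),         # U/S, bit 2
        "accessed": bool(entry & (1 << 5)),     # A, bit 5
        "dirty": bool(entry & (1 << 6)),        # D, bit 6 (page table entries only)
        "os_reserved": (entry >> 9) & 0x7,      # bits 9-11, software definable
        "frame_address": entry & 0xFFFFF000,    # bits 31..12
    }


if __name__ == "__main__":
    print(decode_entry(0x00005E67))
```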


4.5.3 PAGE LEVEL PROTECTION 
(R/W, U/S BITS) 


The 486 microprocessor provides a set of protection 
attributes for paging systems. The paging mecha- 
nism distinguishes between two levels of protection: 
User which corresponds to level 3 of the segmenta- 
tion based protection, and supervisor which encom- 
passes all of the other protection levels (0, 1, 2). 


The R/W and U/S bits are used in conjunction with the WP bit in
control register 0 (CR0). The 386 microprocessor does not contain the
WP bit. The WP bit has been added to the 486 microprocessor to
protect read-only pages from supervisor write accesses. The 386
microprocessor allows a read-only page to be written from protection
levels 0, 1 or 2. WP = 0 is the 386 microprocessor compatible mode.
When WP = 0 the supervisor can write to a read-only page as defined
by the U/S and R/W bits. When WP = 1 supervisor access to a read-only
page (R/W = 0) will cause a page fault (exception 14).


Table 4.4 shows the effect of the WP, U/S and R/W bits on accessing
memory. When WP = 0, the supervisor can write to pages regardless of
the state of the R/W bit. When WP = 1 and R/W = 0 the supervisor
cannot write to a read-only page. A user attempt to access a
supervisor only page (U/S = 0), or write to a read-only page, will
cause a page fault (exception 14).


The R/W and U/S bits provide protection from user 
access on a page by page basis since the bits are 
contained in the Page Table Entry and the Page Di- 
rectory Table. The U/S and R/W bits in the first level 
Page Directory Table apply to all entries in the page 
table pointed to by that directory entry. The U/S and 
R/W bits in the second level Page Table Entry apply 
only to the page described by that entry. The most 
restrictive of the U/S and R/W bits from the Page 
Directory Table and the Page Table Entry are used 
to address a page. 


Example: If the U/S and R/W bits for the Page Directory entry were 10
(user read/execute) and the U/S and R/W bits for the Page Table Entry
were 01 (no user access at all), the access rights for the page would
be 01, the numerically smaller of the two.


Note that a given segment can be easily made read-only for level 0, 1
or 2 via use of segmented protection mechanisms. (Section 4.4
Protection).
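

The page-level rules of this section can be sketched in two small functions: one that combines the directory and table bits as in the example (taking the numerically smaller two-bit U/S, R/W value), and one that decides whether an access is permitted for a given WP setting. The function names are hypothetical and the sketch follows the text of this section rather than any bus-level behaviour.

```python
def combine(pde_bits: int, pte_bits: int) -> int:
    """Combine two-bit (U/S, R/W) values: the numerically smaller one applies."""
    return min(pde_bits & 0b11, pte_bits & 0b11)


def access_allowed(us: int, rw: int, wp: int, user_mode: bool, write: bool) -> bool:
    """Return True if the access is permitted, False if it would page fault."""
    if user_mode:
        if us == 0:
            return False            # supervisor-only page
        return rw == 1 or not write # user may write only when R/W = 1
    if write and rw == 0 and wp == 1:
        return False                # WP = 1 protects read-only pages from the supervisor
    return True


if __name__ == "__main__":
    bits = combine(0b10, 0b01)
    us, rw = bits >> 1, bits & 1
    print(bin(bits), access_allowed(us, rw, wp=0, user_mode=True, write=False))
```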


4.5.4 PAGE CACHEABILITY
(PWT AND PCD BITS)


PWT (page write through) and PCD (page cache dis- 
able) are two new bits defined in entries in both lev- 
els of the page table structure, the Page Directory 
Table and the Page Table Entry. PCD and PWT con- 
trol page cacheability and write policy. 


PWT controls write policy. PWT = 1 defines a write-through policy for
the current page. PWT = 0 allows the possibility of write-back. PWT
is ignored internally because the 486 microprocessor has a
write-through cache. PWT can be used to control the write policy of a
second level cache.


PCD controls cacheability. PCD = 0 enables caching in the on-chip
cache. PCD alone does not enable caching; it must be conditioned by
the KEN# (cache enable) input signal and the state of the CD (cache
disable) and NW (no write-through) bits in control register 0 (CR0).
When PCD = 1, caching is disabled regardless of the state of KEN#, CD
and NW. (See Section 5.0, On-Chip Cache).


The state of the PCD and PWT bits is driven out on the PCD and PWT
pins during a memory access.

The PWT and PCD bits for a bus cycle are obtained either from control
register 3 (CR3), the Page Directory Entry or the Page Table Entry,
depending on the type of cycle run. However, when paging is disabled
(PG = 0 in CR0) or for cycles which bypass paging (i.e., I/O
(input/output) references, INTR (interrupt request) and HALT cycles),
the PCD and PWT bits of CR3 are ignored. The i486 CPU assumes
PCD = 0 and PWT = 0 and drives these values on the PCD and PWT pins.

When paging is enabled (PG = 1 in CR0), the bits from the page table
entry are cached in the translation lookaside buffer (TLB), and are
driven any time the page mapped by the TLB entry is referenced.

For normal memory cycles run with paging enabled, the PWT and PCD
bits are taken from the Page Table Entry. During TLB refresh cycles
when the Page Directory and Page Table entries are read, the PWT and
PCD bits must be obtained elsewhere. The bits are taken from CR3 when
a Page Directory Entry is being read. The bits are taken from the
Page Directory Entry when the Page Table Entry is being updated.

The PCD and PWT bits in CR3 are initialized to zero at reset, but can
be set to any value by level 0 software.
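

The choice of where the PWT and PCD pin values come from can be summarised as a lookup on the cycle type. A minimal sketch; the cycle labels and the function name `pwt_pcd_source` are hypothetical names chosen for this illustration.

```python
def pwt_pcd_source(paging_enabled: bool, cycle: str) -> str:
    """Return where PWT/PCD are taken from for a given bus cycle type.

    cycle is one of: "memory", "pde_access", "pte_access", "io", "intr", "halt".
    """
    if not paging_enabled or cycle in ("io", "intr", "halt"):
        return "forced to 0"           # CR3 bits ignored, PCD = PWT = 0 driven
    if cycle == "pde_access":
        return "CR3"                   # reading a Page Directory Entry
    if cycle == "pte_access":
        return "Page Directory Entry"  # accessing a Page Table Entry
    if cycle == "memory":
        return "Page Table Entry"      # normal memory cycle, via the TLB
    raise ValueError(f"unknown cycle type: {cycle}")


if __name__ == "__main__":
    for kind in ("memory", "pde_access", "pte_access", "io"):
        print(kind, "->", pwt_pcd_source(True, kind))
```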


4.5.5 TRANSLATION LOOKASIDE BUFFER 


The 486 Microprocessor paging hardware is designed to support demand
paged virtual memory systems. However, performance would degrade
substantially if the processor was required to access two levels of
tables for every memory reference. To solve this problem, the 486
Microprocessor keeps a cache of the most recently accessed pages;
this cache is called the Translation Lookaside Buffer (TLB). The TLB
is a four-way set associative 32-entry page table cache. It
automatically keeps the most commonly used Page Table Entries in the
processor. The 32-entry TLB, coupled with a 4K page size, results in
coverage of 128 Kbytes of memory addresses. For many common
multi-tasking systems, the TLB will have a hit rate of about 98%.
This means that the processor will only have to access the two-level
page structure on 2% of all memory references. Figure 4.22
illustrates how the TLB complements the 486 Microprocessor’s paging
mechanism.
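

The coverage and hit-rate figures quoted above can be checked with a few lines of arithmetic. This is a rough sketch: it counts only the two table reads of a TLB miss and ignores any additional locked cycles used to update the Accessed or Dirty bits. The names are hypothetical.

```python
TLB_ENTRIES = 32
PAGE_SIZE = 4 * 1024  # bytes

# 32 entries x 4 Kbytes per page = 128 Kbytes of addresses mapped at once.
TLB_COVERAGE = TLB_ENTRIES * PAGE_SIZE


def expected_table_reads(hit_rate: float, levels: int = 2) -> float:
    """Average number of extra table reads per memory reference."""
    return (1.0 - hit_rate) * levels


if __name__ == "__main__":
    print(TLB_COVERAGE // 1024, "Kbytes covered")
    print(round(expected_table_reads(0.98), 4), "extra reads per reference")
```

At the quoted 98% hit rate this comes to about 0.04 extra table reads per reference, compared with 2 if every reference had to walk both levels.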


Reading a new entry into the TLB (TLB refresh) is a two step process
handled by the 486 microprocessor hardware. The sequence of data
cycles to perform a TLB refresh is:

1.  Read the correct Page Directory Entry, as pointed to by the page
    base register and the upper 10 bits of the linear address. The
    page base register is in control register 3.

1a. Optionally perform a locked read/write to set the accessed bit in
    the directory entry. The directory entry will actually get read
    twice if the 486 microprocessor needs to set any of the bits in
    the entry. If the page directory entry changes between the first
    and second reads, the data returned for the second read will be
    used.

2.  Read the correct entry in the Page Table and place the entry in
    the TLB.

2a. Optionally perform a locked read/write to set the accessed and/or
    dirty bit in the page table entry. Again, note that the page
    table entry will actually get read twice if the 486
    microprocessor needs to set any of the bits in the entry. Like
    the directory entry, if the data changes between the first and
    second read the data returned for the second read will be used.


Table 4.4. Page Level Protection Attributes

U/S | R/W | WP | User Access        | Supervisor Access
 0  |  0  | 0  | None               | Read/Write/Execute
 0  |  1  | 0  | None               | Read/Write/Execute
 1  |  0  | 0  | Read/Execute       | Read/Write/Execute
 1  |  1  | 0  | Read/Write/Execute | Read/Write/Execute
 0  |  0  | 1  | None               | Read/Execute
 0  |  1  | 1  | None               | Read/Write/Execute
 1  |  0  | 1  | Read/Execute       | Read/Execute
 1  |  1  | 1  | Read/Write/Execute | Read/Write/Execute


Note that the directory entry must always be read 
into the processor, since directory entries are never 
placed in the paging TLB. Page faults can be sig- 
naled from either the page directory read or the 
page table read. Page directory and page table en- 
tries may be placed in the 486 on-chip cache just 
like normal data. 


4.5.6 PAGING OPERATION 
Figure 4.22. Translation Lookaside Buffer


The paging hardware operates in the following fashion.
The paging unit hardware receives a 32-bit linear
address from the segmentation unit. The upper
20 linear address bits are compared with all 32 entries
in the TLB to determine if there is a match. If
there is a match (i.e., a TLB hit), then the 32-bit
physical address is calculated and will be placed on
the address bus.


However, if the page table entry is not in the TLB,
the 486 Microprocessor will read the appropriate
Page Directory Entry. If P = 1 on the Page Directory
Entry, indicating that the page table is in memory,
then the 486 Microprocessor will read the appropriate
Page Table Entry and set the Access bit. If P =
1 on the Page Table Entry, indicating that the page is
in memory, the 486 Microprocessor will update the
Access and Dirty bits as needed and fetch the operand.
The upper 20 bits of the linear address, read
from the page table, will be stored in the TLB for
future accesses. However, if P = 0 for either the
Page Directory Entry or the Page Table Entry, then
the processor will generate a page fault, an Exception 14.
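

The lookup described above can be modeled in software. The following Python sketch is illustrative only and is not taken from the 486 Microprocessor documentation: the names translate, split_linear and PageFault are hypothetical, the TLB is modeled as an unbounded dictionary rather than a 32-entry four-way set associative cache, and the Access and Dirty bit updates and protection checks are omitted. It assumes the P bit is bit 0 of each entry and that the upper 20 bits of each entry hold a 4K-aligned base address.

```python
P_BIT = 0x1               # present bit, assumed to be bit 0 of an entry
FRAME_MASK = 0xFFFFF000   # upper 20 bits of an entry: 4K-aligned base


class PageFault(Exception):
    """Models Exception 14; 'linear' is the value CR2 would hold."""

    def __init__(self, linear):
        super().__init__(f"page fault at linear address {linear:#010x}")
        self.linear = linear


def split_linear(linear):
    """Split a 32-bit linear address into (directory index, table index, offset)."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def translate(linear, page_directory, page_tables, tlb):
    """Translate a linear address to a physical address.

    page_directory: dict of directory index -> entry (hypothetical model)
    page_tables:    dict of table base address -> dict of index -> entry
    tlb:            dict of linear page number -> physical frame base
    """
    page, offset = linear >> 12, linear & 0xFFF
    if page in tlb:                          # TLB hit: no table access needed
        return tlb[page] | offset
    dir_idx, tbl_idx, _ = split_linear(linear)
    pde = page_directory.get(dir_idx, 0)
    if not pde & P_BIT:                      # page table not in memory
        raise PageFault(linear)
    pte = page_tables[pde & FRAME_MASK].get(tbl_idx, 0)
    if not pte & P_BIT:                      # page not in memory
        raise PageFault(linear)
    tlb[page] = pte & FRAME_MASK             # keep the translation for future accesses
    return (pte & FRAME_MASK) | offset
```

In this model a second reference to the same page is satisfied from tlb without touching page_directory or page_tables, which is the effect the TLB is intended to have.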


The processor will also generate an exception 14 
page fault, if the memory reference violated the 
page protection attributes (i.e., U/S or R/W) (e.g., 
trying to write to a read-only page). CR2 will hold the 
linear address which caused the page fault. If a sec- 
ond page fault occurs, while the processor is at- 
tempting to enter the service routine for the first, 
then the processor will invoke the page fault (excep- 
tion 14) handler a second time, rather than the dou- 
ble fault (exception 8) handler. Since Exception 14 is 
classified as a fault, CS: EIP will point to the instruc- 
tion causing the page fault. The 16-bit error code 
pushed as part of the page fault handler will contain 
status bits which indicate the cause of the page 
fault.


The 16-bit error code is used by the operating sys- 
tem to determine how to handle the page fault. Fig- 
ure 4.23a shows the format of the page-fault error 
code and the interpretation of the bits. 


NOTE: 
Even though the bits in the error code (U/S, W/R,
and P) have names similar to the bits in the Page
Directory/Table Entries, the interpretation of the error
code bits is different. Figure 4.23b indicates
what type of access caused the page fault.


 15                                    3    2     1    0
 U  U  U  U  U  U  U  U  U  U  U  U  U   U/S   W/R    P

Figure 4.23a. Page Fault Error Code Format


U/S: The U/S bit indicates whether the access 
causing the fault occurred when the processor was 
executing in User Mode (U/S = 1) or in Supervisor 
mode (U/S = 0). 


W/R: The W/R bit indicates whether the access 
causing the fault was a Read (W/R = 0) or a Write 
(W/R = 1). 


P: The P bit indicates whether a page fault was
caused by a not-present page (P = 0), or by a page
level protection violation (P = 1).


U: UNDEFINED 


U/S   W/R   Access Type
 0     0    Supervisor* Read
 0     1    Supervisor Write
 1     0    User Read
 1     1    User Write

*Descriptor table access will fault with U/S = 0, even if the program
is executing at level 3.


Figure 4.23b. Type of Access Causing Page Fault
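

As an illustration of how an operating system might interpret the error code, the following Python sketch decodes the three defined bits. The function name decode_page_fault_error and the returned strings are hypothetical; the sketch assumes P is bit 0, W/R is bit 1 and U/S is bit 2, and it ignores the undefined bits 15 through 3.

```python
def decode_page_fault_error(code):
    """Decode the defined bits of the 16-bit page fault error code.

    Bits 15-3 are undefined and are ignored here.
    """
    return {
        # P = 0: not-present page; P = 1: page level protection violation
        "cause": "protection violation" if code & 0x1 else "not-present page",
        # W/R = 0: read; W/R = 1: write
        "access": "write" if code & 0x2 else "read",
        # U/S = 0: supervisor mode; U/S = 1: user mode
        "mode": "user" if code & 0x4 else "supervisor",
    }
```

For example, an error code with only bit 1 set would describe a supervisor write to a not-present page.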


4.5.7 OPERATING SYSTEM RESPONSIBILITIES 


The 486 Microprocessor takes care of the page ad- 
dress translation process, relieving the burden from 
an operating system in a demand-paged system. 
The operating system is responsible for setting up
the initial page tables, and handling any page faults.
The operating system also is required to invalidate 
(i.e., flush) the TLB when any changes are made to 
any of the page table entries. The operating system 
must reload CR3 to cause the TLB to be flushed. 


Setting up the tables is simply a matter of loading 
CR3 with the address of the Page Directory, and 
allocating space for the Page Directory and the 
Page Tables. The primary responsibility of the oper- 
ating system is to implement a swapping policy and 
handle all of the page faults. 


A final concern of the operating system is to ensure
that the TLB cache matches the information in the
paging tables. In particular, any time the operating
system sets the P present bit of page table entry to
zero, the TLB must be flushed. Operating systems
may want to take advantage of the fact that CR3 is
stored as part of a TSS, to give every task or group
of tasks its own set of page tables.


4.6 Virtual 8086 Environment


4.6.1 EXECUTING 8086 PROGRAMS


The 486 Microprocessor allows the execution of
8086 application programs in both Real Mode and in
the Virtual 8086 Mode (Virtual Mode). Of the two
methods, Virtual 8086 Mode offers the system designer
the most flexibility. The Virtual 8086 Mode allows
the execution of 8086 applications, while still
allowing the system designer to take full advantage
of the 486 Microprocessor protection mechanism. In
particular, the 486 Microprocessor allows the simultaneous
execution of 8086 operating systems and
its applications, and a 486 Microprocessor operating
system and both 80286 and 486 Microprocessor applications.
Thus, in a multi-user 486 Microprocessor
computer, one person could be running an MS-DOS
spreadsheet, another person using MS-DOS, and a
third person could be running multiple Unix utilities
and applications. Each person in this scenario would
believe that he had the computer completely to himself.
Figure 4.24 illustrates this concept.


4.6.2 VIRTUAL 8086 MODE ADDRESSING
MECHANISM


One of the major differences between 486 Microprocessor
Real and Protected modes is how the
segment selectors are interpreted. When the processor
is executing in Virtual 8086 Mode the segment
registers are used in an identical fashion to Real
Mode. The contents of the segment register is shifted
left 4 bits and added to the offset to form the
segment base linear address.


The 486 Microprocessor allows the operating system
to specify which programs use the 8086 style
address mechanism, and which programs use Protected
Mode addressing, on a per task basis.
Through the use of paging, the one megabyte address
space of the Virtual Mode task can be mapped
to anywhere in the 4 gigabyte linear address space
of the 486 Microprocessor. Like Real Mode, Virtual
Mode effective addresses (i.e., segment offsets) that
exceed 64 Kbyte will cause an exception 13. However,
these restrictions should not prove to be important,
because most tasks running in Virtual 8086
Mode will simply be existing 8086 application programs.
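

The 8086 style address calculation used in Virtual 8086 Mode (segment register shifted left 4 bits, then added to the offset) can be written out as a short Python sketch. The function name v86_linear_address is hypothetical, and the ValueError raised for an offset above 64 Kbyte merely stands in for the exception 13 mentioned in the text.

```python
def v86_linear_address(segment, offset):
    """Form a linear address the way Real Mode and Virtual 8086 Mode do."""
    if not 0 <= segment <= 0xFFFF:
        raise ValueError("segment register holds a 16-bit value")
    if not 0 <= offset <= 0xFFFF:
        # The processor would raise exception 13 for an offset beyond 64 Kbyte.
        raise ValueError("effective address exceeds 64 Kbyte")
    return (segment << 4) + offset
```

For instance, segment 1234H with offset 0010H gives linear address 12350H.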


4.6.3 PAGING IN VIRTUAL MODE 


The paging hardware allows the concurrent running
of multiple Virtual Mode tasks, and provides protection
and operating system isolation. Although it is
not strictly necessary to have the paging hardware
enabled to run Virtual Mode tasks, it is needed in
order to run multiple Virtual Mode tasks or to relocate
the address space of a Virtual Mode task to
physical address space greater than one megabyte.


The paging hardware allows the 20-bit linear address
produced by a Virtual Mode program to be
divided into up to 256 pages. Each one of the pages
can be located anywhere within the maximum 4 gigabyte
physical address space of the 486 Microprocessor.
In addition, since CR3 (the Page Directory
Base Register) is loaded by a task switch, each Virtual
Mode task can use a different mapping scheme
to map pages to different physical locations.


Finally, the paging hardware allows the sharing of
the 8086 operating system code between multiple
8086 applications. Figure 4.24 shows how the 486
Microprocessor paging hardware enables multiple
8086 programs to run under a virtual memory demand
paged system.


4.6.4 PROTECTION AND I/O PERMISSION 
BITMAP 


All Virtual 8086 Mode programs execute at privilege 
level 3, the level of least privilege. As such, Virtual 
8086 Mode programs are subject to all of the protec- 
tion checks defined in Protected Mode. (This is dif- 
ferent from Real Mode which implicitly is executing 
at privilege level 0, the level of greatest privilege.) 
Thus, an attempt to execute a privileged instruction 
when in Virtual 8086 Mode will cause an exception 
13 fault. 


The following are privileged instructions, which may 
be executed only at Privilege Level 0. Therefore, at- 
tempting to execute these instructions in Virtual 
8086 Mode (or anytime CPL > 0) causes an excep- 
tion 13 fault: 


LIDT;    MOV DRn,reg;    MOV reg,DRn;
LGDT;    MOV TRn,reg;    MOV reg,TRn;
LMSW;    MOV CRn,reg;    MOV reg,CRn.
CLTS;
HLT;


Figure 4.24. Virtual 8086 Environment Memory Management


Several instructions, particularly those applying to 
the multitasking model and protection model, are 
available only in Protected Mode. Therefore, at- 
tempting to execute the following instructions in 
Real Mode or in Virtual 8086 Mode generates an
exception 6 fault:


LTR;     STR;
LLDT;    SLDT;
LAR;     VERR;
LSL;     VERW;
ARPL.


The instructions which are IOPL-sensitive in Protected
Mode are:


IN;          STI;
OUT;         CLI
INS;
OUTS;
REP INS;
REP OUTS;


In Virtual 8086 Mode, a slightly different set of instructions
are made IOPL-sensitive. The following instructions
are IOPL-sensitive in Virtual 8086 Mode:


INT n;    STI;
PUSHF;    CLI;
POPF;     IRET


The PUSHF, POPF, and IRET instructions are IOPL-sensitive
in Virtual 8086 Mode only. This provision
allows the IF flag (interrupt enable flag) to be virtualized
to the Virtual 8086 Mode program. The INT n
software interrupt instruction is also IOPL-sensitive
in Virtual 8086 Mode. Note, however, that the INT 3
(opcode 0CCH), INTO, and BOUND instructions are
not IOPL-sensitive in Virtual 8086 mode (they aren’t
IOPL sensitive in Protected Mode either).


Note that the I/O instructions (IN, OUT, INS, OUTS,
REP INS, and REP OUTS) are not IOPL-sensitive in
Virtual 8086 mode. Rather, the I/O instructions become
automatically sensitive to the I/O Permission
Bitmap contained in the 486 Microprocessor Task
State Segment. The I/O Permission Bitmap, automatically
used by the 486 Microprocessor in Virtual
8086 Mode, is illustrated by Figures 4.15a and
4.15b.


The I/O Permission Bitmap can be viewed as a 0-
64 Kbit bit string, which begins in memory at offset
Bit_Map_Offset in the current TSS. Bit_Map_Offset
must be ≤ DFFFH so the entire bit map and
the byte FFH which follows the bit map are all at
offsets ≤ FFFFH from the TSS base. The 16-bit
pointer Bit_Map_Offset (15:0) is found in the word
beginning at offset 66H (102 decimal) from the TSS
base, as shown in Figure 4.15a.


Each bit in the I/O Permission Bitmap corresponds
to a single byte-wide I/O port, as illustrated in Figure
4.15a. If a bit is 0, I/O to the corresponding byte-wide
port can occur without generating an exception.
Otherwise the I/O instruction causes an exception
13 fault. Since every byte-wide I/O port must be
protectable, all bits corresponding to a word-wide or
dword-wide port must be 0 for the word-wide or
dword-wide I/O to be permitted. If all the referenced
bits are 0, the I/O will be allowed. If any referenced
bits are 1, the attempted I/O will cause an exception
13 fault.


Due to the use of a pointer to the base of the I/O
Permission Bitmap, the bitmap may be located anywhere
within the TSS, or may be ignored completely
by pointing the Bit_Map_Offset (15:0) beyond the
limit of the TSS segment. In the same manner, only
a small portion of the 64K I/O space need have an
associated map bit, by adjusting the TSS limit to
truncate the bitmap. This eliminates the commitment
of 8K of memory when a complete bitmap is not
required, while allowing the fully general case if desired.


EXAMPLE OF BITMAP FOR I/O PORTS 0-255:
Setting the TSS limit to {Bit_Map_Offset + 31
+ 1**} [** see note below] will allow a 32-byte bitmap
for the I/O ports #0-255, plus a terminator
byte of all 1’s [** see note below]. This allows the
I/O bitmap to control I/O Permission to I/O ports 0-255
while causing an exception 13 fault on attempted
I/O to any I/O port 256 through 65,535.


**IMPORTANT IMPLEMENTATION NOTE: Beyond
the last byte of I/O mapping information in the I/O
Permission Bitmap must be a byte containing all 1’s.
The byte of all 1’s must be within the limit of the 486
Microprocessor TSS segment (see Figure 4.15a).


4.6.5 INTERRUPT HANDLING


In order to fully support the emulation of an 8086
machine, interrupts in Virtual 8086 Mode are handled
in a unique fashion. When running in Virtual
Mode all interrupts and exceptions involve a privilege
change back to the host 486 Microprocessor
operating system. The 486 Microprocessor operating
system determines if the interrupt comes from a
Protected Mode application or from a Virtual Mode
program by examining the VM bit in the EFLAGS
image stored on the stack.


When a Virtual Mode program is interrupted and execution
passes to the interrupt routine at level 0, the
VM bit is cleared. However, the VM bit is still set in
the EFLAG image on the stack.


The 486 Microprocessor operating system in turn
handles the exception or interrupt and then returns
control to the 8086 program. The 486 Microprocessor
operating system may choose to let the 8086
operating system handle the interrupt or it may emulate
the function of the interrupt handler. For example,
many 8086 operating system calls are accessed
by PUSHing parameters on the stack, and then executing
an INT n instruction. If the IOPL is set to 0
then all INT n instructions will be intercepted by the
486 Microprocessor operating system. The 486 Microprocessor
operating system could emulate the
8086 operating system’s call. Figure 4.25 shows
how the 486 Microprocessor operating system could
intercept an 8086 operating system’s call to “Open
a File”.


A 486 Microprocessor operating system can provide
a Virtual 8086 Environment which is totally transparent
to the application software via intercepting and
then emulating 8086 operating system’s calls, and
intercepting IN and OUT instructions.
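

Interception of IN and OUT instructions relies on the I/O Permission Bitmap check described in Section 4.6.4. The following Python sketch models that check and the TSS limit calculation from the port 0-255 example. It is a simplified illustration: the names io_permitted and tss_limit_for_ports are hypothetical, the bitmap argument is assumed to hold only the map bytes that lie within the TSS limit (without the terminator byte), and port N is assumed to correspond to bit (N mod 8) of byte (N div 8).

```python
def io_permitted(bitmap, port, width=1):
    """Return True if I/O of 'width' bytes at 'port' is allowed.

    Every byte-wide port touched by the access must have a 0 bit.
    Ports beyond the (possibly truncated) bitmap are treated as faulting.
    """
    for p in range(port, port + width):
        byte_index, bit_index = divmod(p, 8)
        if byte_index >= len(bitmap):
            return False                      # no map bit: exception 13
        if (bitmap[byte_index] >> bit_index) & 1:
            return False                      # bit is 1: exception 13
    return True


def tss_limit_for_ports(bit_map_offset, port_count):
    """TSS limit that covers a bitmap for ports 0..port_count-1
    plus the terminator byte of all 1's."""
    map_bytes = (port_count + 7) // 8
    return bit_map_offset + (map_bytes - 1) + 1
```

With a 32-byte bitmap, tss_limit_for_ports(offset, 256) evaluates to offset + 31 + 1, matching the example in Section 4.6.4, and any access to port 256 or above is reported as not permitted.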
4.6.6 ENTERING AND LEAVING VIRTUAL
8086 MODE


Virtual 8086 mode is entered by executing an IRET
instruction (at CPL=0), or Task Switch (at any CPL)
to a 486 Microprocessor task whose 486 Microprocessor
TSS has a FLAGS image containing a 1 in the
VM bit position while the processor is executing in
Protected Mode. That is, one way to enter Virtual
8086 mode is to switch to a task with a 486 Microprocessor
TSS that has a 1 in the VM bit in the
EFLAGS image. The other way is to execute a 32-bit
IRET instruction at privilege level 0, where the stack
has a 1 in the VM bit in the EFLAGS image. POPF
does not affect the VM bit, even if the processor is in
Protected Mode or level 0, and so cannot be used to
enter Virtual 8086 Mode. PUSHF always pushes a 0
in the VM bit, even if the processor is in Virtual 8086
Mode, so that a program cannot tell if it is executing
in REAL mode, or in Virtual 8086 mode.


The VM bit can be set by executing an IRET instruction
only at privilege level 0, or by any instruction or
interrupt which causes a task switch in Protected
Mode (with VM=1 in the new FLAGS image), and
can be cleared only by an interrupt or exception in
Virtual 8086 Mode. IRET and POPF instructions executed
in REAL mode or Virtual 8086 mode will not
change the value in the VM bit.


The transition out of virtual 8086 mode to 486 Microprocessor
protected mode occurs only on receipt of
an interrupt or exception (such as due to a sensitive
instruction). In Virtual 8086 mode, all interrupts and
exceptions vector through the protected mode IDT,
and enter an interrupt handler in protected 486
Microprocessor mode. That is, as part of interrupt
processing, the VM bit is cleared.


Because the matching IRET must occur from level 0, 
if an Interrupt or Trap Gate is used to field an inter- 
rupt or exception out of Virtual 8086 mode, the Gate 
must perform an inter-level interrupt only to level 0. 
Interrupt or Trap Gates through conforming seg- 
ments, or through segments with DPL> 0, will raise a 
GP fault with the CS selector as the error code. 


4.6.6.1 Task Switches To/From Virtual
8086 Mode


Tasks which can execute in virtual 8086 mode must
be described by a TSS with the new 486 Microprocessor
format (TYPE 9 or 11 descriptor).


A task switch out of virtual 8086 mode will operate 
exactly the same as any other task switch out of a 
task with a 486 Microprocessor TSS. All of the pro- 
grammer visible state, including the FLAGS register 
with the VM bit set to 1, is stored in the TSS. 
The segment registers in the TSS will contain 8086
segment base values rather than selectors. 


A task switch into a task described by a 486 Micro- 
processor TSS will have an additional check to de- 
termine if the incoming task should be resumed in 
virtual 8086 mode. Tasks described by 80286 format 
TSSs cannot be resumed in virtual 8086 mode, so 
no check is required there (the FLAGS image in 
80286 format TSS has only the low order 16 FLAGS 
bits). Before loading the segment register images 
from a 486 Microprocessor TSS, the FLAGS image 
is loaded, so that the segment registers are loaded 
from the TSS image as 8086 segment base values. 
The task is now ready to resume in virtual 8086 exe- 
cution mode. 


4.6.6.2 Transitions Through Trap and Interrupt 
Gates, and IRET 


A task switch is one way to enter or exit virtual 8086 
mode. The other method is to exit through a Trap or 
Interrupt gate, as part of handling an interrupt, and 
to enter as part of executing an IRET instruction. 
The transition out must use a 486 Microprocessor
Interrupt Gate (Type 14), or 486 Microprocessor Trap
Gate (Type 15), which must point to a non-conforming
level 0 segment (DPL=0) in order to permit
the trap handler to IRET back to the Virtual 8086
program. The Gate must point to a non-conforming
level 0 segment to perform a level switch to level 0
so that the matching IRET can change the VM bit.
486 Microprocessor gates must be used, since 
80286 gates save only the low 16 bits of the FLAGS 
register, so that the VM bit will not be saved on tran- 
sitions through the 80286 gates. Also, the 16-bit 
IRET (presumably) used to terminate the 80286 in- 
terrupt handler will pop only the lower 16 bits from 
FLAGS, and will not affect the VM bit. The action 
taken for a 486 Microprocessor Trap or Interrupt 
gate if an interrupt occurs while the task is executing 
in virtual 8086 mode is given by the following se- 
quence. 


(1) Save the FLAGS register in a temp to push later.
    Turn off the VM and TF bits, and if the interrupt is
    serviced by an Interrupt Gate, turn off IF also.


(2) Interrupt and Trap gates must perform a level
    switch from 3 (where the VM86 program executes)
    to level 0 (so IRET can return). This process
    involves a stack switch to the stack given in
    the TSS for privilege level 0. Save the Virtual
    8086 Mode SS and ESP registers to push in a
    later step. The segment register load of SS will
    be done as a Protected Mode segment load,
    since the VM bit was turned off above.


(3) Push the 8086 segment register values onto the
    new stack, in the order: GS, FS, DS, ES. These
    are pushed as 32-bit quantities, with undefined
    values in the upper 16 bits. Then load these 4
    registers with null selectors (0).


(4) Push the old 8086 stack pointer onto the new
    stack by pushing the SS register (as 32-bits, high
    bits undefined), then pushing the 32-bit ESP register
    saved above.


(5) Push the 32-bit FLAGS register saved in step 1.


(6) Push the old 8086 instruction pointer onto the
    new stack by pushing the CS register (as 32-bits,
    high bits undefined), then pushing the 32-bit EIP
    register.


(7) Load up the new CS:EIP value from the interrupt
    gate, and begin execution of the interrupt routine
    in protected 486 Microprocessor mode.


Figure 4.25. Virtual 8086 Environment Interrupt and Call Handling


The transition out of virtual 8086 mode performs a 
level change and stack switch, in addition to chang- 
ing back to protected mode. In addition, all of the 
8086 segment register images are stored on the 
stack (behind the SS:ESP image), and then loaded 
with null (0) selectors before entering the interrupt 
handler. This will permit the handler to safely save 
and restore the DS, ES, FS, and GS registers as 
80286 selectors. This is needed so that interrupt 
handlers which don’t care about the mode of the 
interrupted program can use the same prolog and 
epilog code for state saving (i.e., push all registers in 
prolog, pop all in epilog) regardless of whether or not 
a “native” mode or Virtual 8086 mode program was
interrupted. Restoring null selectors to these regis- 
ters before executing the IRET will not cause a trap 
in the interrupt handler. Interrupt routines which ex- 
pect values in the segment registers, or return val- 
ues in segment registers will have to obtain/return 
values from the 8086 register images pushed onto 
the new stack. They will need to know the mode of 
the interrupted program in order to know where to 
find/return segment registers, and also to know how 
to interpret segment register values. 
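

To show where a level 0 handler would find the saved 8086 register images, the following Python sketch derives the stack layout from the push order in steps (3) through (6). It is an illustration under stated assumptions: the names PUSH_ORDER, frame_offsets and interrupted_in_v86 are hypothetical, each item is taken to occupy 32 bits, no error code is assumed to have been pushed after EIP (an exception that pushes one would shift every offset by 4), and the VM flag is assumed to be bit 17 of EFLAGS.

```python
# Order in which the processor pushes items when leaving Virtual 8086 Mode.
PUSH_ORDER = ["GS", "FS", "DS", "ES", "SS", "ESP", "EFLAGS", "CS", "EIP"]

VM_BIT = 1 << 17   # assumed position of the VM flag in EFLAGS


def frame_offsets():
    """Byte offset of each saved item from ESP on entry to the handler.

    The last item pushed sits at the lowest address, so it is at ESP+0.
    """
    return {name: 4 * i for i, name in enumerate(reversed(PUSH_ORDER))}


def interrupted_in_v86(eflags_image):
    """True if the EFLAGS image on the stack has the VM bit set."""
    return bool(eflags_image & VM_BIT)
```

Under these assumptions frame_offsets() places the EFLAGS image at ESP+8 and the ES, DS, FS and GS images at ESP+20, ESP+24, ESP+28 and ESP+32.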


The IRET instruction will perform the inverse of the
above sequence. Only the extended 486 Microprocessor
IRET instruction (operand size = 32) can be
used, and must be executed at level 0 to change the
VM bit to 1.


(1) If the NT bit in the FLAGS register is on, an inter-task
    return is performed. The current state is
    stored in the current TSS, and the link field in the
    current TSS is used to locate the TSS for the
    interrupted task which is to be resumed.

    Otherwise, continue with the following sequence.


(2) Read the FLAGS image from SS:8[ESP] into the
    FLAGS register. This will set VM to the value active
    in the interrupted routine.


(3) Pop off the instruction pointer CS:EIP. EIP is
    popped first, then a 32-bit word is popped which
    contains the CS value in the lower 16 bits. If
    VM=0, this CS load is done as a protected
    mode segment load. If VM=1, this will be done
    as an 8086 segment load.


(4) Increment the ESP register by 4 to bypass the
    FLAGS image which was “popped” in step 2.


(5) If VM=1, load segment registers ES, DS, FS,
    and GS from memory locations SS:[ESP+8],
    SS:[ESP+12], SS:[ESP+16], and
    SS:[ESP+20], respectively, where the new value
    of ESP stored in step 4 is used. Since VM=1,
    these are done as 8086 segment register loads.

    Else if VM=0, check that the selectors in ES,
    DS, FS, and GS are valid in the interrupted routine.
    Null out invalid selectors to trap if an attempt
    is made to access through them.


(6) If (RPL(CS) > CPL), pop the stack pointer
    SS:ESP from the stack. The ESP register is
    popped first, followed by 32-bits containing SS in
    the lower 16 bits. If VM=0, SS is loaded as a
    protected mode segment register load. If VM=1,
    an 8086 segment register load is used.


(7) Resume execution of the interrupted routine. The
    VM bit in the FLAGS register (restored from the
    interrupt routine’s stack image in step 2) determines
    whether the processor resumes the interrupted
    routine in Protected mode or Virtual 8086
    mode.


5.0 ON-CHIP CACHE 


To meet its performance goals the 486 microprocessor
contains an eight Kbyte cache. The cache is
software transparent to maintain binary compatibility
with previous generations of the x86 architecture.


The on-chip cache has been designed for maximum 
flexibility and performance. The cache has several 
operating modes offering flexibility during program 
execution and debugging. Memory areas can be de- 
fined as non-cacheable by software and external 
hardware. Protocols for cache line invalidations and 
replacement are implemented in hardware, easing 
system design. 


5.1 Cache Organization 


The on-chip cache is a unified code and data cache. 
The cache is used for both instruction and data ac- 
cesses and acts on physical addresses. 


The cache organization is 4-way set associative and 
each line is 16 bytes wide. The eight Kbytes of 
cache memory are logically organized as 128 sets, 
each containing four lines. 


The cache memory is physically split into four 
2-Kbyte blocks each containing 128 lines (see Fig- 
ure 5.1). Associated with each 2-Kbyte block are 
128 21-bit tags. There is a valid bit for each line in 
the cache. Each line in the cache is either valid or 
not valid. There are no provisions for partially valid 
lines. 
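

The figures above (16-byte lines, 128 sets, 21-bit tags) account for all 32 bits of a physical address: 4 offset bits, 7 set index bits and 21 tag bits. The following Python sketch shows one conventional way to split an address along those lines. The function name split_physical is hypothetical, and the assignment of address bits 10-4 to the set index and bits 31-11 to the tag is an assumption consistent with the sizes given, not a statement taken from this section.

```python
LINE_BYTES = 16   # bytes per cache line
SETS = 128        # number of sets
WAYS = 4          # lines per set (4-way set associative)

OFFSET_BITS = 4   # log2(LINE_BYTES)
INDEX_BITS = 7    # log2(SETS)
TAG_BITS = 32 - OFFSET_BITS - INDEX_BITS   # 21, matching the tag width


def split_physical(addr):
    """Split a 32-bit physical address into (tag, set index, byte offset)."""
    offset = addr & (LINE_BYTES - 1)
    set_index = (addr >> OFFSET_BITS) & (SETS - 1)
    tag = addr >> (OFFSET_BITS + INDEX_BITS)
    return tag, set_index, offset
```

The product WAYS * SETS * LINE_BYTES is 8192, the eight Kbytes of cache memory.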


Figure 5.1. On-Chip Cache Physical Organization


The write strategy of on-chip cache is write-through.
All writes will drive an external write bus cycle in 
addition to writing the information to the internal 
cache if the write was a cache hit. A write to an 
address not contained in the internal cache will only 
be written to external memory. Cache allocations 
are not made on write misses. 


5.2 Cache Control 


Control of the cache is provided by the CD and NW
bits in CR0. CD enables and disables the cache. NW
controls memory write-through and invalidates.


The CD and NW bits define four operating modes of
the on-chip cache as given in Table 5.1. These
modes provide flexibility in how the on-chip cache is
used.

Table 5.1. Cache Operating Modes


CD   NW   Operating Mode
1    1    Cache fills disabled, write-through and
          invalidates disabled
1    0    Cache fills disabled, write-through and
          invalidates enabled
0    1    INVALID. If CR0 is loaded with this
          configuration of bits, a GP fault with
          error code of 0 is raised.
0    0    Cache fills enabled, write-through and
          invalidates enabled

CD=1, NW=1


The cache is completely disabled by setting
CD=1 and NW=1 and then flushing the
cache. This mode may be useful for debugging
programs where it is important to see
all memory cycles at the pins. Writes which
hit in the cache will not appear on the external
bus.


It is possible to use the on-chip cache as
fast static RAM by “pre-loading” certain
memory areas into the cache and then setting
CD=1 and NW=1. Pre-loading can be
done by careful choice of memory references
with the cache turned on or by use of
the testability functions (see Section 8.2).
When the cache is turned off the memory
mapped by the cache is “frozen” into the
cache since fills and invalidates are disabled.


CD=1, NW=0


Cache fills are disabled but write-throughs
and invalidates are enabled. This mode is
the same as if the KEN# pin was strapped
HIGH disabling cache fills. Write-throughs
and invalidates may still occur to keep the
cache valid. This mode is useful if the software
must disable the cache for a short period
of time, and then re-enable it without
flushing the original contents.


CD=0, NW=1


INVALID. If CR0 is loaded with this bit configuration,
a General Protection fault with
error code of 0 is raised. Note that this
mode would imply a non-transparent write-back
cache. A future processor may define
this combination of bits to implement a
write-back cache.


CD=0, NW=0


This is the normal operating mode.


Completely disabling the cache is a two step process.
First CD and NW must be set to 1 and then the
cache must be flushed. If the cache is not flushed,
cache hits on reads will still occur and data will be
read from the cache.
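

The four CD/NW combinations can be summarized in a small Python sketch. The function name cache_mode and the dictionary keys are hypothetical, and the ValueError merely stands in for the General Protection fault raised by the invalid combination.

```python
def cache_mode(cd, nw):
    """Describe the on-chip cache operating mode selected by CD and NW."""
    if cd == 0 and nw == 1:
        # Loading CR0 with this combination raises a GP fault, error code 0.
        raise ValueError("CD=0, NW=1 is invalid")
    return {
        "fills_enabled": cd == 0,
        "write_through_and_invalidates_enabled": nw == 0,
    }
```

For example, cache_mode(1, 0) reports fills disabled with write-throughs and invalidates still enabled, the mode suggested for disabling the cache briefly without flushing it.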


5.3 Cache Line Fills


Any area of memory can be cached in the 486 mi- 
croprocessor. Non-cacheable portions of memory 
can be defined by the external system or by soft- 
ware. The external system can inform the 486 micro- 
processor that a memory address is non-cacheable 
by returning the KEN# pin inactive during a memory
access (refer to Section 7.2.3). Software can prevent
certain pages from being cached by setting the
PCD bit in the page table entry.


A read request can be generated from program op- 
eration or by an instruction pre-fetch. The data will 
be supplied from the on-chip cache if a cache hit 
occurs on the read address. If the address is not in 
the cache, a read request for the data is generated 
on the external bus. 


If the read request is to a cacheable portion of memory,
the 486 microprocessor initiates a cache line fill.
During a line fill a 16-byte line is read into the 486
microprocessor.


Cache fills will only be generated for read misses. 
Write misses will never cause a line in the internal 
cache to be allocated. If a cache hit occurs on a 
write, the line will be updated. 
Cache line fills can be performed over 8- and 16-bit
busses using the dynamic bus sizing feature. Refer 
to Section 7.1.3 for a description of dynamic bus 
sizing. 


Refer to Section 7.2.3 for further information on 
cacheable cycles. 


5.4 Cache Line Invalidations 


The 486 microprocessor contains both a hardware 
and software mechanism for invalidating lines in its 
internal cache. Cache line invalidations are needed 
to keep the 486 microprocessor’s cache contents 
consistent with external memory. 


Refer to Section 7.2.8 for further information on 
cache line invalidations. 


5.5 Cache Replacement 


When a line needs to be placed in its internal cache 
the 486 microprocessor first checks to see if there is 
a non-valid line in the set that can be replaced. If all 
four lines in the set are valid, a pseudo
least-recently-used mechanism is used to determine which line
should be replaced. 


A valid bit is associated with each line in the cache. 
When a line needs to be placed in a set, the four
valid bits are checked to see if there is a non-valid
line that can be replaced. If a non-valid line is found, 
that line is marked for replacement. 


The four lines in the set are labeled l0, l1, l2 and l3.
The order in which the valid bits are checked during
an invalidation is l0, l1, l2 and l3. All valid bits are
cleared when the processor is reset or when the 
cache is flushed. 


Replacement in the cache is handled by a pseudo 
least recently used (LRU) mechanism when all four 
lines in a set are valid. Three bits, B0, B1 and B2,
are defined for each of the 128 sets in the cache. 
These bits are called the LRU bits. The LRU bits are 
updated for every hit or replace in the cache. 


If the most recent access to the set was to l0 or l1,
B0 is set to 1. B0 is set to 0 if the most recent access
was to l2 or l3. If the most recent access to
l0:l1 was to l0, B1 is set to 1, else B1 is set to 0. If
the most recent access to l2:l3 was to l2, B2 is set to
1, else B2 is set to 0.


The pseudo LRU mechanism works in the following 
manner. When a line must be replaced, the cache 
will first select which of l0:l1 and l2:l3 was least recently
used. Then the cache will determine which of
the two lines was least recently used and mark it for 
replacement. This decision tree is shown in Figure 
5.2. When the processor is reset or when the cache 
is flushed all 128 sets of three LRU bits are set to 0. 
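

The valid-bit check and the three LRU bits described above can be sketched for a single set as follows. The class and method names are hypothetical, and lines l0 through l3 are represented as ways 0 through 3; this is an illustration of the rules in the text, not a description of the actual circuit.

```python
class SetSketch:
    """One four-line set with valid bits and pseudo-LRU bits B0, B1, B2."""

    def __init__(self):
        self.valid = [False] * 4       # cleared on reset or flush
        self.b0 = self.b1 = self.b2 = 0

    def touch(self, way):
        """Update the LRU bits for a hit or replace of line `way` (0..3)."""
        if way in (0, 1):
            self.b0 = 1                       # most recent access in l0:l1
            self.b1 = 1 if way == 0 else 0
        else:
            self.b0 = 0                       # most recent access in l2:l3
            self.b2 = 1 if way == 2 else 0

    def victim(self):
        """Pick the line to replace."""
        for way in range(4):                  # non-valid lines first, l0..l3
            if not self.valid[way]:
                return way
        if self.b0 == 1:                      # l2:l3 is the older pair
            return 3 if self.b2 == 1 else 2
        return 1 if self.b1 == 1 else 0       # l0:l1 is the older pair

    def fill(self):
        way = self.victim()
        self.valid[way] = True
        self.touch(way)
        return way
```

Filling an empty set uses l0, l1, l2 and l3 in order. A fifth fill replaces l0, and the next victim is then l2 rather than l1, which shows where the three-bit scheme departs from true LRU.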


[Figure: decision tree. All four lines in the set valid? If no, replace a non-valid line; if yes, the LRU bits select the line to replace.]

Figure 5.2. On-Chip Cache Replacement Strategy


5.6 Page Cacheability


Two bits for cache control, PWT and PCD, are defined
in the page table and page directory entries.
The state of these bits is driven out on the PWT
and PCD pins during memory access cycles.


The PWT bit controls write policy for second level 
caches used with the 486 microprocessor. Setting 
PWT=1 defines a write-through policy for the cur- 
rent page while PWT=0 allows the possibility of 
write-back. The state of PWT is ignored internally by 
the 486 microprocessor since the on-chip cache is
write-through.


The PCD bit controls cacheability on a page-by-page
basis. The PCD bit is internally ANDed with the
KEN# signal to control cacheability on a cycle-by-cycle
basis (see Figure 5.3). PCD=0 enables caching
while PCD=1 forbids it. Note that cache fills are
enabled when PCD=0 AND KEN#=0. This logical
AND is implemented physically with a NOR gate.


The state of the PCD bit in the page table entry is 
driven on the PCD pin when a page in external mem- 
ory is accessed. The state of the PCD pin informs 
the external system of the cacheability of the requested
information. The external system then returns
KEN# telling the 486 microprocessor if the
area is cacheable. The 486 microprocessor initiates 
a cache line fill if PCD and KEN# indicate that the 
requested information is cacheable. 


Figure 5.3. Page Cacheability


The PCD bit is masked with the CD (cache disable)
bit in control register 0 to determine the state of the
PCD pin. If CD=1 the 486 microprocessor forces
the PCD pin HIGH. If CD=0 the PCD pin is driven
with the value from the page table entry/directory. See
Figure 5.3.
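

The two gating steps can be written as small truth functions. This sketch assumes that the CD-masked PCD value is the one combined with KEN#, as Figure 5.3 suggests; the function names are hypothetical and signal levels are given as 0 (LOW) or 1 (HIGH).

```python
def pcd_pin(cd, pcd_bit):
    """Level on the PCD pin: forced HIGH when CD=1, else the page's PCD bit."""
    return 1 if cd else pcd_bit


def line_fill_enabled(pcd, ken_n):
    """NOR of PCD and KEN#: a fill is allowed only when both are 0."""
    return int(not (pcd or ken_n))
```

For example, `line_fill_enabled(pcd_pin(0, 0), 0)` evaluates to 1, while setting CD, setting the page's PCD bit, or returning KEN# inactive (1) each gives 0.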


The PWT and PCD bits for a bus cycle are obtained 
from either CR3, the page directory or page table 
entry. These bits are assumed to be zero during real 
mode, whenever paging is disabled, or for cycles 
that bypass paging, (I/O references, interrupt ac- 
knowledge and Halt cycles), the PWT and PCD bits 
are taken from CR3. These bits are initialized to 0 on 
reset, but can be set to any value by level 0 soft- 
ware. 


When paging is enabled, the bits from the page table 
entry are cached in the TLB, and are driven any time 
the page mapped by the TLB entry is referenced. 
For normal memory cycles, PWT and PCD are taken 
from the page table entry. During TLB refresh cycles 
where the page table and directory entries are read, 
the PWT and PCD bits must be obtained elsewhere. 
During page table updates the bits are obtained from 
the page directory. When the page directory is up- 
dated the bits are obtained from CR3. 


5.7 Cache Flushing 


The on-chip cache can be flushed by external hard- 
ware or by software instructions. Flushing the cache 
clears all valid bits for all lines in the cache. The 
cache is flushed when external hardware asserts the 
FLUSH# pin.


The flush pin needs to be asserted for one clock if 
driven synchronously or for two clocks if driven 
asynchronously. The flush input is asynchronous but 
setup and hold times must be met. The flush pin 
should be deasserted after the cache flush is com- 
plete. Failure to deassert the pin will cause execu- 
tion to stop as the processor will be repeatedly flush- 
ing the cache. If external hardware activates flush in 
response to an I/O write, flush must be asserted for 
at least two clocks prior to ready being returned for 
the I/O write. This ensures that the flush completes
before the CPU begins execution of the instruction 
following the OUT instruction. 


Flush is recognized during HOLD just like EADS#. 


The instructions INVD and WBINVD cause the on-chip
cache to be flushed. External caches connected to
the 486 microprocessor are signalled to flush their 
contents when these instructions are executed. 


WBINVD will cause an external write-back cache to 
write back dirty lines before flushing its contents. 
The external cache is signalled using the bus cycle 
definition pins and the byte enables (refer to Section
6.2.5 for the bus cycle definition pins and Section
7.2.11 for special bus cycles). Refer to the 486 microprocessor
programmer’s reference manual for detailed
instruction definitions.


The results of the INVD and WBINVD instructions 
are identical for the operation of the 486 microproc- 
essor’s on-chip cache since the cache is write- 
through. Note that the INVD and WBINVD instruc- 
tions are machine dependent. Future members of 
the 486 microprocessor family may change the
definition of these instructions.


5.8 Caching Translation Lookaside 
Buffer Entries 


The 486 microprocessor contains an integrated pag- 
ing unit with a translation lookaside buffer (TLB). The 
TLB contains 32 entries. The TLB has been en- 
hanced over the 386 microprocessor’s TLB by up- 
grading the replacement strategy to a pseudo-LRU 
(least recently used) algorithm. The pseudo-LRU re- 
placement algorithm is the same as that used in the 
on-chip cache. 


The paging TLB operation is automatic whenever 
paging is enabled. The TLB contains the most re- 
cently used page table entries. A page table entry 
translates the linear address pointing to a particular 
page to the physical address where the page is 
stored in memory (refer to Section 4.5, Paging).


The paging unit will look up the linear address in the 
TLB in response to an internal bus request. The cor- 
responding physical address is passed on to the on- 
chip cache or the external bus (in the event of a 
cache miss) when the linear address is present in 
the TLB. 


The paging unit will access the page tables in exter- 
nal memory if the linear address is not in the TLB. 
The required page table entry will be read into the 
TLB and then the cache or bus cycle for the actual 
data will take place. The process of reading a new 
page table entry into the TLB is called a TLB refresh. 


A TLB refresh is a two step process: The paging unit 
must first read the page directory entry which points 
to the appropriate page table. The page table entry 
to be stored in the TLB is then read from the page 
table. Control register 3 (CR3) points to the base of 
the page directory table. 
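

The two-step refresh can be sketched as two dependent memory reads. The 10/10/12-bit split of the linear address and the 4-byte entry size are taken from the standard 386/486 paging format with 4-Kbyte pages rather than from this section, and `read_dword` is a hypothetical callback standing in for a 32-bit memory read. Present bits, protection checks and faults are ignored.

```python
def tlb_refresh(read_dword, cr3, linear):
    """Return the page table entry that maps `linear`."""
    dir_index = (linear >> 22) & 0x3FF      # selects the page directory entry
    table_index = (linear >> 12) & 0x3FF    # selects the page table entry
    # Step 1: read the page directory entry; CR3 holds the directory base.
    pde = read_dword((cr3 & 0xFFFFF000) + 4 * dir_index)
    # Step 2: read the page table entry from the table the PDE points to.
    pte = read_dword((pde & 0xFFFFF000) + 4 * table_index)
    return pte


def translate(pte, linear):
    """Combine the page frame address with the offset within the page."""
    return (pte & 0xFFFFF000) | (linear & 0xFFF)
```

With a directory at 1000H, a page table at 2000H and a page frame at 5000H, linear address 00403ABCH would translate to 00005ABCH.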


The 486 microprocessor will allow page directory
and page table entries (returned during TLB refreshes)
to be stored in the on-chip cache. Setting the
PCD bits in CR3 and the page directory entry to 1
will prevent the page directory and page table entries
from being stored in the on-chip cache (see
Section 5.6, Page Cacheability).


6.0 HARDWARE INTERFACE


6.1 Introduction 


The 486 microprocessor bus has been designed to 
be similar to the 386 microprocessor bus whenever 
possible. Several new features have been added to 
the 486 microprocessor bus resulting in increased 
performance and functionality. New features include 
a 1X clock, a burst bus mechanism for high-speed 
internal cache fills, a cache line invalidation mechanism,
enhanced bus arbitration capabilities, a BS8#
bus sizing mechanism and parity support.


The 486 microprocessor is driven by a 1X clock as
opposed to a 2X clock in the 386 microprocessor. A
25 MHz 486 microprocessor uses a 25 MHz clock in
contrast to a 25 MHz 386 microprocessor which requires
a 50 MHz clock. A 1X clock allows simpler
system design by cutting in half the clock speed required
in the external system.


Like the 386 microprocessor, the 486 microprocessor
has separate parallel busses for data and addresses.
The bidirectional data bus is 32 bits in width.
The address bus consists of two components: 30
address lines (A2-A31) and 4 byte enable lines
(BE0#-BE3#). The address bus addresses external
memory in the same manner as the 386 microprocessor:
the address lines form the upper 30 bits
of the address and the byte enables select individual
bytes within a 4 byte location. The address lines are
bidirectional for use in cache line invalidations.


The 486 microprocessor’s burst bus mechanism en- 
ables high-speed cache fills from external memory. 
Burst cycles can strobe data into the processor at a 
rate of one item every clock. Non-burst cycles have 
a maximum rate of one item every two clocks. Burst 
cycles are not limited to cache fills: all bus cycles 
requiring more than a single data cycle can be burst- 
ed. 
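

As a rough illustration of the difference, the clock counts for a transfer of several data items can be compared. This arithmetic assumes zero wait states and assumes that the first item of a burst takes two clocks like any other bus cycle; both function names are hypothetical.

```python
def clocks_non_burst(items):
    """Best case without bursting: one item every two clocks."""
    return 2 * items


def clocks_burst(items):
    """Best case with bursting: two clocks for the first item, one per item after."""
    return 2 + (items - 1) if items else 0
```

Under these assumptions a 16-byte line read as four 32-bit items takes 8 clocks without bursting and 5 clocks as a burst.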


The 486 microprocessor has a bus hold feature simi- 
lar to that of the 386 microprocessor. During bus 
hold, the 486 microprocessor relinquishes control of 
the local bus by floating its address, data and control 
busses. 


The 486 microprocessor has an address hold feature
in addition to bus hold. During address hold only
the address bus is floated; the data and control busses
can remain active. Address hold is used for
cache line invalidations.


Ahead is a brief description of the 486 microprocessor
input and output signals arranged by functional
groups. Before beginning the signal descriptions a
few terms need to be defined. The # symbol at the
end of a signal name indicates the active, or asserted,
state occurs when the signal is at a low voltage.
When a # is not present after the signal name, the
signal is active at the high voltage level. The term
“ready” is used to indicate that the cycle is terminated
with RDY# or BRDY#.


Figure 6.1. Functional Signal Groupings


Sections 6 and 7 will discuss bus cycles and data
cycles. A bus cycle is at least two clocks long; it
begins with ADS# active in the first clock and ends
with ready active in the last clock. Data is transferred to or from
the 486 microprocessor during a data cycle. A bus
cycle contains one or more data cycles.


6.2 Signal Descriptions 


6.2.1 CLOCK (CLK) 


CLK provides the fundamental timing and the inter- 
nal operating frequency for the 486 microprocessor. 
All external timing parameters are specified with re- 
spect to the rising edge of CLK. 


The 486 microprocessor can operate over a wide 
frequency range but CLK’s frequency cannot 
change rapidly while RESET is inactive. CLK’s fre- 
quency must be stable for proper chip operation 
since a single edge of CLK is used internally to gen- 
erate two phases. CLK only needs TTL levels for 
proper operation. Figure 6.2 illustrates the CLK 
waveform. 


6.2.2 Address Bus (A31-A2, BE0#-BE3#)


A31-A2 and BE0#-BE3# form the address bus
and provide physical memory and I/O port addresses.
The 486 microprocessor is capable of addressing
4 gigabytes of physical memory space
(00000000H through FFFFFFFFH), and 64 Kbytes
of I/O address space (00000000H through
0000FFFFH). A31-A2 identify addresses to a 4-byte
location. BE0#-BE3# identify which bytes within
the 4-byte location are involved in the current transfer.


Addresses are driven back into the 486 microprocessor
over A31-A4 during cache line invalidations.
The address lines are active HIGH. When used as
inputs into the processor, A31-A4 must meet the
setup and hold times, t22 and t23. A31-A2 are not
driven during bus or address hold.


The byte enable outputs, BE0#-BE3#, determine
which bytes must be driven valid for read and write
cycles to external memory.


BE3# applies to D24-D31
BE2# applies to D16-D23
BE1# applies to D8-D15
BE0# applies to D0-D7


BE0#-BE3# can be decoded to generate A0, A1
and BHE# signals used in 8- and 16-bit systems
(see Table 7.5). BE0#-BE3# are active LOW and
are not driven during bus hold.
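

The relationship between a byte address, the transfer size and the byte enables can be sketched as below. This only covers transfers that fit within one 4-byte location; the function names are hypothetical and the levels are shown as the active-LOW values that would appear on the pins.

```python
def byte_enables(addr, size):
    """Return (BE3#, BE2#, BE1#, BE0#) levels for `size` bytes at `addr`."""
    offset = addr & 3                  # position within the 4-byte location
    if size < 1 or offset + size > 4:
        raise ValueError("transfer must fit within one 4-byte location")
    # A byte enable is driven LOW (0) for each byte taking part in the transfer.
    return tuple(0 if offset <= lane < offset + size else 1
                 for lane in (3, 2, 1, 0))


def data_lines(be_n):
    """Name the data bus bytes selected by a set of byte enable levels."""
    lanes = {3: "D24-D31", 2: "D16-D23", 1: "D8-D15", 0: "D0-D7"}
    return [lanes[lane] for lane, level in zip((3, 2, 1, 0), be_n)
            if level == 0]
```

A 16-bit access at address 1002H would assert BE3# and BE2# only, placing the data on D16-D31.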


6.2.3 DATA LINES (D31-D0) 


The bidirectional lines, D31-D0, form the data bus
for the 486 microprocessor. D0-D7 define the least
significant byte and D24-D31 the most significant
byte. Data transfers to 8- or 16-bit devices are possible
using the data bus sizing feature controlled by
the BS8# or BS16# input pins.


Figure 6.2. CLK Waveform


D31-D0 are active HIGH. For reads, D31-D0 must
meet the setup and hold times, t22 and t23. D31-D0
are not driven during read cycles and bus hold.


6.2.4 PARITY 
Data Parity Input/Outputs (DP0-DP3)


DP0-DP3 are the data parity pins for the processor.
There is one pin for each byte of the data bus. Even 
parity is generated or checked by the parity genera- 
tors/checkers. Even parity means that there are an 
even number of HIGH inputs on the eight corre- 
sponding data bus pins and parity pin. 
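

The even parity rule can be illustrated with a few lines of code. The function names are hypothetical, and the pairing of DP0 with D0-D7 through DP3 with D24-D31 is an assumption made for the example; the text only states that there is one parity pin per data byte.

```python
def parity_bit(byte):
    """Parity pin level that makes the nine lines hold an even number of 1s."""
    return bin(byte & 0xFF).count("1") & 1


def parity_ok(byte, dp):
    """Check a received byte against its parity pin."""
    return (bin(byte & 0xFF).count("1") + dp) % 2 == 0


def data_parity(dword):
    """Return (DP3, DP2, DP1, DP0) for a 32-bit value, one bit per byte."""
    return tuple(parity_bit(dword >> (8 * lane)) for lane in (3, 2, 1, 0))
```

A byte of 01H has one HIGH data line, so its parity pin must be HIGH; a byte of 03H already has an even count, so its parity pin is LOW.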


Data parity is generated on all write data cycles with 
the same timing as the data driven by the 486 micro- 
processor. Even parity information must be driven 
back to the 486 microprocessor on these pins with 
the same timing as read information to ensure that
the correct parity check status is indicated by the 
486 microprocessor. 


The values read on these pins do not affect program 
execution. It is the responsibility of the system to
take appropriate actions if a parity error occurs. 


Input signals on DP0-DP3 must meet setup and
hold times t22 and t23 for proper operation.


Parity Status Output (PCHK#)


Parity status is driven on the PCHK# pin, and a parity
error is indicated by this pin being LOW. PCHK#
is driven the clock after ready for read operations to
indicate the parity status for the data sampled at the
end of the previous clock. Parity is checked during
code reads, memory reads and I/O reads. Parity is
not checked during interrupt acknowledge cycles.
PCHK# only checks the parity status for enabled
bytes as indicated by the byte enable and bus size
signals. It is valid only in the clock immediately after
read data is returned to the 486 microprocessor. At
all other times it is inactive (HIGH). PCHK# is never
floated.


Driving PCHK# is the only effect that bad input parity
has on the 486 microprocessor. The 486 microprocessor
will not vector to a bus error interrupt
when bad data parity is returned. In systems that will
not employ parity, PCHK# can be ignored. In systems
not using parity, DP0-DP3 should be connected
to Vcc through a pullup resistor.


6.2.5 BUS CYCLE DEFINITION
M/IO#, D/C#, W/R# Outputs


M/IO#, D/C# and W/R# are the primary bus cycle
definition signals. They are driven valid as the ADS#
signal is asserted. M/IO# distinguishes between
memory and I/O cycles, D/C# distinguishes between
data and control cycles and W/R# distinguishes
between write and read cycles.


Bus cycle definitions as a function of M/IO#, D/C#
and W/R# are given in Table 6.1. Note there is a
difference between the 486 microprocessor and 386
microprocessor bus cycle definitions. The halt bus
cycle type has been moved to location 001 in the
486 microprocessor from location 101 in the 386
microprocessor. Location 101 is now reserved and will
never be generated by the 486 microprocessor.


Table 6.1. ADS# Initiated Bus Cycle Definitions


M/IO#  D/C#  W/R#  Bus Cycle Initiated

  0      0     0   Interrupt Acknowledge
  0      0     1   Halt/Special Cycle
  0      1     0   I/O Read
  0      1     1   I/O Write
  1      0     0   Code Read
  1      0     1   Reserved
  1      1     0   Memory Read
  1      1     1   Memory Write


Special bus cycles are discussed in Section 7.2.11.


Bus Lock Output (LOCK#)


LOCK# indicates that the 486 microprocessor is
running a read-modify-write cycle where the external
bus must not be relinquished between the read and
write cycles. Read-modify-write cycles are used to
implement memory-based semaphores. Multiple
reads or writes can be locked.


When LOCK# is asserted, the current bus cycle is
locked and the 486 microprocessor should be allowed
exclusive access to the system bus. LOCK#
goes active in the first clock of the first locked bus
cycle and goes inactive after ready is returned indicating
the last locked bus cycle.


The 486 microprocessor will not acknowledge bus
hold when LOCK# is asserted (though it will allow
an address hold). LOCK# is active LOW and is floated
during bus hold. Locked read cycles will not be
transformed into cache fill cycles if KEN# is returned
active. Refer to Section 7.2.6 for a detailed
discussion of Locked bus cycles.
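

For external decode logic, the encodings of Table 6.1 amount to a three-bit lookup. The sketch below simply restates the table; the dictionary and function names are hypothetical.

```python
# Keys are (M/IO#, D/C#, W/R#) as sampled when ADS# is asserted.
BUS_CYCLES = {
    (0, 0, 0): "Interrupt Acknowledge",
    (0, 0, 1): "Halt/Special Cycle",
    (0, 1, 0): "I/O Read",
    (0, 1, 1): "I/O Write",
    (1, 0, 0): "Code Read",
    (1, 0, 1): "Reserved",
    (1, 1, 0): "Memory Read",
    (1, 1, 1): "Memory Write",
}


def bus_cycle(m_io, d_c, w_r):
    """Name the bus cycle type for a given pin combination."""
    return BUS_CYCLES[(m_io, d_c, w_r)]
```

Logic carried over from a 386 design would need its halt decode moved from 101 to 001.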


Pseudo-Lock Output (PLOCK#)


The pseudo-lock feature allows atomic reads and
writes of memory operands greater than 32 bits.
These operands require more than one cycle to
transfer. The 486 microprocessor asserts PLOCK#
during floating point long reads and writes (64 bits),
segment table descriptor reads (64 bits) and cache
line fills (128 bits).


When PLOCK# is asserted no other master will be
given control of the bus between cycles. A bus hold
request (HOLD) is not acknowledged during pseudo-locked
reads and writes, with one exception. During
non-cacheable non-bursted code prefetches, HOLD
is recognized on memory cycle boundaries even
though PLOCK# is asserted. The 486 microprocessor
will drive PLOCK# active until the addresses for
the last bus cycle of the transaction have been driven
regardless of whether BRDY# or RDY# is returned.


A pseudo-locked transfer is meaningful only if the
memory operand is aligned and if it is completely contained
within a single cache line. A 64-bit floating
point number must be aligned to an 8-byte boundary 
to guarantee an atomic access. 


Normally PLOCK# and BLAST# are the inverse of
each other. However, during the first cycle of a 64-bit
floating point write, both PLOCK# and BLAST# will
be asserted.


Since PLOCK# is a function of the bus size and
KEN# inputs, PLOCK# should be sampled only in
the clock in which ready is returned. This pin is active LOW
and is not driven during bus hold. Refer to Section
7.2.7 for a detailed discussion of pseudo-locked bus
cycles.


6.2.6 BUS CONTROL


The bus control signals allow the processor to indi- 
cate when a bus cycle has begun, and allow other 
system hardware to control burst cycles, data bus 
width and bus cycle termination. 


Address Status Output (ADS#)


The ADS# output indicates that the address and
bus cycle definition signals are valid. This signal will
go active in the first clock of a bus cycle and go
inactive in the second and subsequent clocks of the
cycle. ADS# is also inactive when the bus is idle.


ADS# is used by external bus circuitry as the indication
that the processor has started a bus cycle. The
external circuit must sample the bus cycle definition
pins on the next rising edge of the clock after ADS#
is driven active.


ADS# is active LOW and is not driven during bus
hold.


Non-burst Ready Input (RDY#)


RDY# indicates that the current bus cycle is complete.
In response to a read, RDY# indicates that
the external system has presented valid data on the
data pins. In response to a write request, RDY# indicates
that the external system has accepted the 486
microprocessor data. RDY# is ignored when the
bus is idle and at the end of the first clock of the bus
cycle. Since RDY# is sampled during address hold,
data can be returned to the processor when AHOLD
is active.


RDY# is active LOW, and is not provided with an
internal pullup resistor. This input must satisfy setup
and hold times t16 and t17 for proper chip operation.


6.2.7 BURST CONTROL 
Burst Ready Input (BRDY#)


BRDY# performs the same function during a burst
cycle that RDY# performs during a non-burst cycle.
BRDY# indicates that the external system has presented
valid data on the data pins in response to a
read or that the external system has accepted the
486 microprocessor data in response to a write.
BRDY# is ignored when the bus is idle and at the
end of the first clock in a bus cycle.


During a burst cycle, BRDY# will be sampled each
clock, and if active, the data presented on the data 
bus pins will be strobed into the 486 microprocessor. 
ADS# is negated during the second through last 
data cycles in the burst, but address lines A2-A3 
and byte enables will change to reflect the next data 
item expected by the 486 microprocessor. 


If RDY# is returned simultaneously with BRDY#,
BRDY# is ignored and the burst cycle is prematurely
aborted. An additional complete bus cycle will be
initiated after an aborted burst cycle if the cache line
fill was not complete. BRDY# is treated as a normal
ready for the last data cycle in a burst transfer or for
non-burstable cycles. Refer to Section 7.2.2 for
burst cycle timing.


BRDY# is active LOW and is provided with a small
internal pullup resistor. BRDY# must satisfy the setup
and hold times t16 and t17.


Burst Last Output (BLAST#)


BLAST# indicates that the next time BRDY# is returned
it will be treated as a normal RDY#, terminating
the line fill or other multiple-data-cycle transfer.
BLAST# is active for all bus cycles regardless of
whether they are cacheable or not. This pin is active
LOW and is not driven during bus hold.
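

The way RDY#, BRDY# and BLAST# interact at each sampling point can be summarized as a small decision function. This is a sketch of the rules stated above, with hypothetical names and return values; all three arguments are pin levels, so 0 means asserted. It does not model the clocks in which the ready inputs are ignored.

```python
def sample_ready(rdy_n, brdy_n, blast_n):
    """Classify one sampling of the ready inputs during a bus cycle."""
    if rdy_n == 0:
        # RDY# takes precedence: BRDY# is ignored and the bus cycle ends,
        # aborting a burst if more data cycles were still expected.
        return "end_cycle"
    if brdy_n == 0:
        # With BLAST# asserted, BRDY# acts as a normal ready.
        return "end_cycle" if blast_n == 0 else "next_data"
    return "wait"
```

Returning both readies together during a line fill gives `"end_cycle"`, after which the processor would start another bus cycle to finish the fill.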




6.2.8 INTERRUPT SIGNALS (RESET, INTR, NMI)


The interrupt signals can interrupt or suspend execution
of the processor’s current instruction stream.


Reset Input (RESET)


RESET forces the 486 microprocessor to begin execution
at a known state. For a power-up (cold start)
reset, Vcc and CLK must reach their proper DC and
AC specifications for at least 1 ms before the 486
microprocessor begins instruction execution. The
RESET pin should remain active during this time to
ensure proper 486 microprocessor operation. However,
for a warm boot-up case, RESET is required to
remain active for a minimum of 15 clocks. The testability
operating modes are programmed by the falling
(inactive going) edge of RESET. (Refer to Section
8.0 for a description of the test modes during reset.)


Maskable Interrupt Request Input (INTR) 


INTR indicates that an external interrupt has been 
generated. Interrupt processing is initiated if the IF 
flag is active in the EFLAGS register. 


The 486 microprocessor will generate two locked
interrupt acknowledge bus cycles in response to
asserting the INTR pin. An 8-bit interrupt number will
be latched from an external interrupt controller at
the end of the second interrupt acknowledge cycle.


INTR must remain active until the interrupt acknowledges
have been performed to assure program interruption.
Refer to Section 7.2.10 for a detailed discussion
of interrupt acknowledge cycles.


The INTR pin is active HIGH and is not provided with
an internal pulldown resistor. INTR is asynchronous,
but the INTR setup and hold times, t20 and t21, must
be met to assure recognition on any specific clock.


Non-maskable Interrupt Request Input (NMI)


NMI is the non-maskable interrupt request signal.
Asserting NMI causes an interrupt with an internally
supplied vector value of 2. External interrupt acknowledge
cycles are not generated since the NMI
interrupt vector is internally generated. When NMI
processing begins, the NMI signal will be masked
internally until the IRET instruction is executed.


NMI is rising edge sensitive after internal synchronization.
NMI must be held LOW for at least four CLK
periods before this rising edge for proper operation.
NMI is not provided with an internal pulldown resistor.
NMI is asynchronous but setup and hold times,
t20 and t21, must be met to assure recognition on any
specific clock.


6.2.9 BUS ARBITRATION SIGNALS


This section describes the mechanism by which the
processor relinquishes control of its local bus when
requested by another bus master.


Bus Request Output (BREQ)


The 486 asserts BREQ whenever a bus cycle is
pending internally. Thus, BREQ is always asserted in
the first clock of a bus cycle, along with ADS#. Furthermore,
if the 486 is currently not driving the bus
(due to HOLD, AHOLD, or BOFF#), BREQ is asserted
in the same clock that ADS# would have been
asserted if the processor were driving the bus. After
the first clock of the bus cycle, BREQ may change
state. It will be asserted if additional cycles are necessary
to complete a transfer (via BS8#, BS16#,
KEN#), or if more cycles are pending internally.
However, if no additional cycles are necessary to
complete the current transfer, BREQ can be negated
before ready comes back for the current cycle.


External logic can use the BREQ signal to arbitrate
among multiple processors. This pin is driven regardless
of the state of bus hold or address hold.
BREQ is active HIGH and is never floated. During a
hold state, internal events may cause BREQ to be
deasserted prior to any bus cycles.


Bus Hold Request Input (HOLD) 


HOLD allows another bus master complete control
of the 486 microprocessor bus. The 486 microprocessor
will respond to an active HOLD signal by asserting
HLDA and placing most of its output and
input/output pins in a high impedance state (floated)
after completing its current bus cycle, burst cycle, or
sequence of locked cycles. The BREQ, HLDA,
PCHK# and FERR# pins are not floated during bus
hold. The 486 microprocessor will maintain its bus in
this state until the HOLD is deasserted. Refer to
Section 7.2.9 for timing diagrams for a bus hold cycle.


Unlike the 386 microprocessor, the 486 microprocessor
will recognize HOLD during reset. Pullup resistors
are not provided for the outputs that are floated
in response to HOLD. HOLD is active HIGH and is
not provided with an internal pulldown resistor.
HOLD must satisfy setup and hold times t18 and t19
for proper chip operation.


Bus Hold Acknowledge Output (HLDA)


HLDA indicates that the 486 microprocessor has
given the bus to another local bus master. HLDA
goes active in response to a hold request presented
on the HOLD pin. HLDA is driven active in the same
clock that the 486 microprocessor floats its bus.
HLDA will be driven inactive when leaving bus hold
and the 486 microprocessor will resume driving the
bus. The 486 microprocessor will not cease internal
activity during bus hold since the internal cache will
satisfy the majority of bus requests. HLDA is active
HIGH and remains driven during bus hold.


Backoff Input (BOFF#)


Asserting the BOFF# input forces the 486 microprocessor
to release control of its bus in the next
clock. The pins floated are exactly the same as in
response to HOLD. The response to BOFF# differs
from the response to HOLD in two ways: First, the
bus is floated immediately in response to BOFF#
while the 486 completes the current bus cycle before
floating its bus in response to HOLD. Second,
the 486 does not assert HLDA in response to
BOFF#.


The processor remains in bus hold until BOFF# is
negated. Upon negation, the 486 microprocessor restarts
the bus cycle aborted when BOFF# was asserted.
To the internal execution engine the effect of
BOFF# is the same as inserting a few wait states to
the original cycle. Refer to Section 7.2.12 for a description
of bus cycle restart.


Any data returned to the processor while BOFF# is
asserted is ignored. BOFF# has higher priority than
RDY# or BRDY#. If both BOFF# and ready are
returned in the same clock, BOFF# takes effect. If
BOFF# is asserted while the bus is idle, the 486
microprocessor will float its bus in the next clock.
BOFF# is active LOW and must meet setup and
hold times t18 and t19 for proper chip operation.


6.2.10 CACHE INVALIDATION 


The AHOLD and EADS# inputs are used during
cache invalidation cycles. AHOLD conditions the
486 microprocessor’s address lines, A4-A31, to accept
an address input. EADS# indicates that an external
address is actually valid on the address
inputs. Activating EADS# will cause the 486 microprocessor
to read the external address bus
and perform an internal cache invalidation cycle to
the address indicated. Refer to Section 7.2.8 for
cache invalidation cycle timing.


Address Hold Request Input (AHOLD) 


AHOLD is the address hold request. It allows anoth- 
er bus master access to the 486 microprocessor 
address bus for performing an internal cache invali- 
dation cycle. Asserting AHOLD will force the 486 mi- 
croprocessor to stop driving its address bus in the 
next clock. While AHOLD is active only the address
bus will be floated; the remainder of the bus can
remain active. For example, data can be returned for
a previously specified bus cycle when AHOLD is ac- 
tive. The 486 microprocessor will not initiate another 
bus cycle during address hold. Since the 486 micro- 
processor floats its bus immediately in response to 
AHOLD, an address hold acknowledge is not re- 
quired. If AHOLD is asserted while a bus cycle is in 
progress, and no readies are returned during the 
time AHOLD is asserted, the 486 will redrive the 
same address (that it originally sent out) once 
AHOLD is negated. 


AHOLD is recognized during reset. Since the entire
cache is invalidated by reset, any invalidation cycles
run during reset will be unnecessary. AHOLD is active
HIGH and is provided with a small internal pulldown
resistor. It must satisfy the setup and hold
times t18 and t19 for proper chip operation. This pin
determines whether or not the built-in self-test features
of the 486 microprocessor will be exercised on
assertion of RESET.


External Address Valid Input (EADS#) 


EADS# indicates that a valid external address has 
been driven onto the 486 address pins. This address 
will be used to perform an internal cache invalidation 
cycle. The external address will be checked against the 
current cache contents. If the address specified 
matches any area in the cache, that area will 
immediately be invalidated. 


An invalidation cycle may be run by asserting 
EADS# regardless of the state of AHOLD, HOLD 
and BOFF#. EADS# is active LOW and is provided 
with an internal pullup resistor. EADS# must satisfy 
the setup and hold times t12 and t13 for proper chip 
operation. 


6.2.11 CACHE CONTROL 


Cache Enable Input (KEN#) 


KEN# is the cache enable pin. KEN# is used to 
determine whether the data being returned by the 
current cycle is cacheable. When KEN# is active 
and the 486 microprocessor generates a cycle that 
can be cached (most any memory read cycle), the 
cycle will be transformed into a cache line fill cycle. 


A cache line is 16 bytes long. During the first cycle of 
a cache line fill the byte-enable pins should be 
ignored and data should be returned as if all four byte 
enables were asserted. The 486 microprocessor will 
run between 4 and 16 contiguous bus cycles to fill 
the line depending on the bus data width selected by 
BS8# and BS16#. Refer to Section 7.2.3 for a 
description of cache line fill cycles. 


The KEN# input is active LOW and is provided with 
a small internal pullup resistor. It must satisfy the 
setup and hold times t14 and t15 for proper chip 
operation. 
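

The "between 4 and 16" bus cycles quoted above follow from dividing the 16-byte line by the number of bytes moved per cycle. The sketch below works that arithmetic through; the function name is illustrative, and it assumes the bus width stays the same for every cycle of the fill, which the text does not require.

```python
LINE_SIZE_BYTES = 16  # cache line length stated in the text


def line_fill_cycles(bus_width_bits: int) -> int:
    """Bus cycles needed to fill one cache line at a fixed bus width."""
    if bus_width_bits not in (8, 16, 32):
        raise ValueError("bus width must be 8, 16 or 32 bits")
    bytes_per_cycle = bus_width_bits // 8
    return LINE_SIZE_BYTES // bytes_per_cycle


if __name__ == "__main__":
    for width in (32, 16, 8):
        print(f"{width}-bit bus: {line_fill_cycles(width)} cycles")
```

A 32-bit bus gives the minimum of 4 cycles and an 8-bit bus (BS8# asserted) gives the maximum of 16.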


Cache Flush Input (FLUSH#) 


The FLUSH# input forces the 486 microprocessor 
to flush its entire internal cache. FLUSH# is active 
LOW and need only be asserted for one clock. 
FLUSH# is asynchronous but setup and hold times 
t20 and t21 must be met for recognition on any 
specific clock. 


FLUSH# also determines whether or not the tristate 
test mode of the 486 microprocessor will be invoked 
on assertion of RESET. 


6.2.12 PAGE CACHEABILITY (PWT, PCD) 


The PWT and PCD output signals correspond to two 
user attribute bits in the page table entry. When 
paging is enabled, PWT and PCD correspond to bits 3 
and 4 of the page table entry respectively. When 
paging is disabled, or for cycles that are not paged 
when paging is enabled (for example I/O cycles), 
PWT and PCD correspond to bits 3 and 4 in control 
register 3. 


PCD is masked by the CD (cache disable) bit in 
control register 0 (CR0). When CD=1 (cache line fills 
disabled) the 486 microprocessor forces PCD HIGH. 
When CD=0, PCD is driven with the value of the 
page table entry/directory. 


The purpose of PCD is to provide a cacheable/non- 
cacheable indication on a page by page basis. The 
486 will not perform a cache fill to any page in which 
bit 4 of the page table entry is set. PWT corresponds 
to the write-back bit and can be used by an external 
cache to provide this functionality. PCD and PWT 
bits are assigned to be zero during real mode or 
whenever paging is disabled. Refer to Sections 4.5.4 
and 5.6 for a discussion of non-cacheable pages. 


PCD and PWT have the same timing as the cycle 
definition pins (M/IO#, D/C#, W/R#). PCD and 
PWT are active HIGH and are not driven during bus 
hold. 
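

The selection rules for PWT and PCD can be summarized in a few lines of code. This is a minimal sketch of the behaviour described above, not a model of the silicon: the function and parameter names are illustrative, and the choice to read bits 3 and 4 of control register 3 for unpaged cycles follows the first paragraph of this section.

```python
def page_attribute_pins(cd: bool, paging_enabled: bool, cycle_is_paged: bool,
                        page_table_entry: int, cr3: int) -> tuple[int, int]:
    """Return the (PWT, PCD) pin levels for one bus cycle (1 = HIGH)."""
    # Paged cycles take the attribute bits from the page table entry;
    # all other cycles take them from control register 3.
    if paging_enabled and cycle_is_paged:
        source = page_table_entry
    else:
        source = cr3
    pwt = (source >> 3) & 1  # bit 3
    pcd = (source >> 4) & 1  # bit 4
    # CR0.CD = 1 (cache line fills disabled) forces PCD HIGH.
    if cd:
        pcd = 1
    return pwt, pcd


if __name__ == "__main__":
    # A paged cycle to a page marked non-cacheable (bit 4 set).
    print(page_attribute_pins(False, True, True, 0b10000, 0))
```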


6.2.13 NUMERIC ERROR REPORTING 
(FERR#, IGNNE#) 


To allow PC-type floating point error reporting, the 
486 microprocessor provides two pins, FERR# and 
IGNNE#. 


Floating Point Error Output (FERR#) 


The 486 microprocessor asserts FERR# whenever 
an unmasked floating point error is encountered. 
FERR# is similar to the ERROR# pin on the 387 
math coprocessor. FERR# can be used by external 
logic for PC-type floating point error reporting in 486 
microprocessor systems. FERR# is active LOW, 
and is not floated during bus hold. 


In some cases, FERR# is asserted when the next 
floating point instruction is encountered and in other 
cases it is asserted before the next floating point 
instruction is encountered depending upon the exe- 
cution state of the instruction causing the exception. 


The following class of floating point exceptions drive 
FERR# at the time the exception occurs (i.e., before 
encountering the next floating point instruction). 


1. The stack fault, invalid operation, and denormal 
exceptions on all transcendental instructions, 
integer arithmetic instructions, FSQRT, FSCALE, 
FPREM(1), FXTRACT, FBLD, and FBSTP. 


2. Any exceptions on store instructions (including 
integer store instructions). 


The following class of floating point exceptions drive 
FERR# only after encountering the next floating 
point instruction. 


1. Exceptions other than on all transcendental 
instructions, integer arithmetic instructions, 
FSQRT, FSCALE, FPREM(1), FXTRACT, FBLD, 
and FBSTP. 


2. Any exception on all basic arithmetic, load, 
compare, and control instructions (i.e., all other 
instructions). 


Ignore Numeric Error Input (IGNNE#) 


The 486 microprocessor will ignore a numeric error 
and continue executing non-control floating point 
instructions when IGNNE# is asserted, but FERR# 
will still be activated. When IGNNE# is deasserted, the 486 
microprocessor will freeze on a non-control floating 
point instruction if a previous instruction caused an 
error. IGNNE# has no effect when the NE bit in 
control register 0 is set. 


The IGNNE# input is active LOW and is provided 
with a small internal pullup resistor. This input is 
asynchronous, but must meet setup and hold times 
t20 and t21 to ensure recognition on any specific 
clock. 


6.2.14 BUS SIZE CONTROL (BS16#, BS8#) 


The BS16# and BS8# inputs allow external 16- and 
8-bit busses to be supported with a small number of 
external components. The 486 CPU samples these 
pins every clock. The value sampled in the clock 
before ready determines the bus size. When 
asserting BS16# or BS8#, only 16 or 8 bits of the data bus 
need be valid. If both BS16# and BS8# are 
asserted, an 8-bit bus width is selected. 


When BS16# or BS8# are asserted the 486 micro- 
processor will convert a larger data request to the 
appropriate number of smaller transfers. The byte 
enables will also be modified appropriately for the 
bus size selected. 


BS16# and BS8# are active LOW and are provided 
with small internal pullup resistors. BS16# and 
BS8# must satisfy the setup and hold times t14 and 
t15 for proper chip operation. 


6.2.15 ADDRESS BIT 20 MASK (A20M#) 


Asserting the A20M# input causes the 486 
microprocessor to mask physical address bit 20 before 
performing a lookup in the internal cache and before 
driving a memory cycle to the outside world. When 
A20M# is asserted, the 486 microprocessor 
emulates the 1 Mbyte address wraparound that occurs 
on the 8086. A20M# is active LOW and must be 
asserted only when the processor is in real mode. 
A20M# is not defined in protected mode. 
A20M# is asynchronous but should meet setup and 
hold times t20 and t21 for recognition in any specific 
clock. For correct operation of the chip, A20M# 
should be sampled high 2 clocks before and 2 
clocks after RESET goes low. 
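

The wraparound that A20M# emulates amounts to clearing bit 20 of the physical address. The sketch below shows the arithmetic only; the names are illustrative and it says nothing about pin timing.

```python
ADDRESS_MASK_32 = 0xFFFFFFFF
A20_CLEAR_MASK = ADDRESS_MASK_32 & ~(1 << 20)  # every bit except bit 20


def apply_a20m(physical_address: int, a20m_asserted: bool) -> int:
    """Physical address after the optional bit 20 mask."""
    physical_address &= ADDRESS_MASK_32
    if a20m_asserted:
        return physical_address & A20_CLEAR_MASK
    return physical_address


if __name__ == "__main__":
    # Real-mode FFFF:0010 forms the address FFFF0H + 10H = 100000H,
    # the first byte above 1 Mbyte.
    address = (0xFFFF << 4) + 0x0010
    print(hex(apply_a20m(address, a20m_asserted=False)))
    print(hex(apply_a20m(address, a20m_asserted=True)))
```

With the mask applied, the access lands at address 0, as it would on an 8086 with its 20 address lines.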


6.3 Write Buffers 


The 486 microprocessor contains four write buffers 
to enhance the performance of consecutive writes 
to memory. The buffers can be filled at a rate of one 
write per clock until all four buffers are filled. 


When all four buffers are empty and the bus is idle, a 
write request will propagate directly to the external 
bus bypassing the write buffers. If the bus is not 
available at the time the write is generated internally, 
the write will be placed in the write buffers and prop- 
agate to the bus as soon as the bus becomes avail- 
able. The write is stored in the on-chip cache imme- 
diately if the write is a cache hit. 


Writes will be driven onto the external bus in the 
same order in which they are received by the write 
buffers. Under certain conditions a memory read will 
go onto the external bus before the memory writes 
pending in the buffer even though the writes 
occurred earlier in the program execution. 


A memory read will only be reordered in front of all 
writes in the buffers under the following condition: 
all writes pending in the buffers are cache hits and 
the read is a cache miss. Under these conditions the 
486 microprocessor will not read from an external 
memory location that needs to be updated by one of 
the pending writes. 


Reordering of a read with the writes pending in the 
buffers can only occur once before all the buffers 
are emptied. Reordering a read only once maintains 
cache consistency. Consider the following example: 
The CPU writes to location X. Location X is in the 
internal cache, so it is updated there immediately. 
However, the bus is busy so the write out to main 
memory is buffered (see Figure 6.3(a)). At this point, 
any reads to location X would be cache hits and 
the most up-to-date data would be read. 


              i486 CPU Cache    Write Buffer    Main Memory 

Location X    new data x        new data x      data x 
Location Y                                      data y 


Figure 6.3(a) 


The next instruction causes a read to location Y. 
Location Y is not in the cache (a cache miss). Since 
the write in the write buffer is a cache hit, the read is 
reordered. When location Y is read, it is put into the 
cache. The possibility exists that location Y will 
replace location X in the cache. If this is true, location 
X would no longer be cached (see Figure 6.3(b)). 


              i486 CPU Cache    Write Buffer    Main Memory 

Location X                      new data x      data x 
Location Y    data y                            data y 


Figure 6.3(b) 


Cache consistency has been maintained up to this 
point. If a subsequent read is to location X (now a 
cache miss) and it was reordered in front of the 
buffered write to location X, stale data would be read. 
This is why only one read is allowed to be reordered. 
Once a read is reordered, all the writes in the write 
buffer are flagged as cache misses to ensure that no 
more reads are reordered. Since one of the 
conditions to reorder a read is that all writes in the write 
buffer must be cache hits, no more reordering is 
allowed until all of those flagged writes propagate to 
the bus. Similarly, if an invalidation cycle is run, all 
entries in the write buffer are flagged as cache 
misses. 


For multiple processor systems and/or systems 
using DMA techniques, such as bus snooping, locked 
semaphores should be used to maintain cache 
consistency. 
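

The reordering rule is easier to follow as a small behavioural model. The sketch below is an illustration under simplifying assumptions, not a description of the actual hardware: the class and method names are invented, each buffered write is tagged only with whether it was a cache hit, and a read is treated as eligible for reordering only when at least one write is pending.

```python
from collections import deque


class WriteBufferModel:
    """Toy model of the four write buffers and the read reordering rule."""

    DEPTH = 4  # the text states there are four write buffers

    def __init__(self):
        # Each entry is [address, flagged_as_cache_hit], oldest first.
        self.pending = deque()

    def buffer_write(self, address, cache_hit):
        if len(self.pending) >= self.DEPTH:
            raise RuntimeError("all four write buffers are full")
        self.pending.append([address, cache_hit])

    def try_reorder_read(self, read_is_cache_miss):
        """Return True if the read may go to the bus ahead of the writes."""
        if not self.pending or not read_is_cache_miss:
            return False
        if not all(hit for _, hit in self.pending):
            return False
        # One read has been reordered: flag every pending write as a miss
        # so that no further read can pass them.
        self.flag_all_as_misses()
        return True

    def flag_all_as_misses(self):
        """Also models the effect of an invalidation cycle."""
        for entry in self.pending:
            entry[1] = False

    def drain_one(self):
        """Drive the oldest pending write onto the bus; return its address."""
        return self.pending.popleft()[0]


if __name__ == "__main__":
    buffers = WriteBufferModel()
    buffers.buffer_write(0x1000, cache_hit=True)   # write to X, a cache hit
    print(buffers.try_reorder_read(True))          # read of Y may pass
    print(buffers.try_reorder_read(True))          # a second read may not
```

In the example from the text, the read of location Y is the first call and the later read of location X is the second, which must wait behind the buffered write.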


6.3.1 WRITE BUFFERS AND I/O CYCLES 


Input/Output (I/O) cycles must be handled in a 
different manner by the write buffers. 


I/O reads are never reordered in front of buffered 
memory writes. This ensures that the 486 
microprocessor will update all memory locations before 
reading status from an I/O device. 


The 486 microprocessor never buffers single I/O 
writes. When processing an OUT instruction, internal 
execution stops until the I/O write actually 
completes on the external bus. This allows time for the 
external system to drive an invalidate into the 486 
microprocessor or to mask interrupts before the 
processor progresses to the instruction following 
OUT. REP OUTS instructions will be buffered. 


I/O device recovery time must be handled slightly 
differently by the 486 microprocessor than with the 
386 microprocessor. I/O device back-to-back write 
recovery times could be guaranteed by the 386 
microprocessor by inserting a jump to the next 
instruction in the code that writes to the device. The jump 
forces the 386 microprocessor to generate a 
prefetch bus cycle which can't begin until the I/O write 
completes. 


Inserting a jump to the next write will not work with 
the 486 microprocessor because the prefetch could 
be satisfied by the on-chip cache. A read cycle must 
be explicitly generated to a non-cacheable location 
in memory to guarantee that a read bus cycle is 
performed. This read will not be allowed to proceed to 
the bus until after the I/O write has completed 
because I/O writes are not buffered. The I/O device 
will have time to recover to accept another write 
during the read cycle. 


6.3.2 WRITE BUFFERS IMPLICATIONS ON 
LOCKED BUS CYCLES 


Locked bus cycles are used for read-modify-write 
accesses to memory. During a read-modify-write 
access, a memory based variable is read, modified and 
then written back to the same memory location. It is 
important that no other bus cycles, generated by 
other bus masters or by the 486 microprocessor 
itself, be allowed on the external bus between the 
read and write portion of the locked sequence. 


During a locked read cycle the 486 microprocessor 
will always access external memory; it will never 
look for the location in the on-chip cache. For 
write cycles, data is written in the internal cache (if 
cache hit) and in the external memory. All data 
pending in the 486 microprocessor's write buffers 
will be written to memory before a locked cycle is 
allowed to proceed to the external bus. 


The 486 microprocessor will assert the LOCK# pin 
after the write buffers are emptied during a locked 
bus cycle. With the LOCK# pin asserted, the 
microprocessor will read the data, operate on the data 
and place the results in a write buffer. The contents 
of the write buffer will then be written to external 
memory. LOCK# will become inactive after the write 
part of the locked cycle. 


6.4 Interrupt and Non-Maskable 
Interrupt Interface 


The 486 microprocessor provides two asynchronous 
interrupt inputs, INTR (interrupt request) and NMI 
(non-maskable interrupt input). This section de- 
scribes the hardware interface between the instruc- 
tion execution unit and the pins. For a description of 
the algorithmic response to interrupts refer to 
Section 2.7. For interrupt timings refer to Section 7.2.10. 


6.4.1 INTERRUPT LOGIC 


The 486 microprocessor contains a two-clock 
synchronizer on the interrupt line. An interrupt request 
will reach the internal instruction execution unit two 
clocks after the INTR pin is asserted, if proper setup 
is provided to the first stage of the synchronizer. 


There is no special logic in the interrupt path other 
than the synchronizer. The INTR signal is level 
sensitive and must remain active for the instruction 
execution unit to recognize it. The interrupt will not be 
serviced by the 486 microprocessor if the INTR 
signal does not remain active. 


The instruction execution unit will look at the state of 
the synchronized interrupt signal at specific clocks 
during the execution of instructions (if interrupts are 
enabled). These specific clocks are at instruction 
boundaries, or iteration boundaries in the case of 
string move instructions. Interrupts will only be ac- 
cepted at these boundaries. 


An interrupt must be presented to the 486 micro- 
processor INTR pin three clocks before the end of 
an instruction for the interrupt to be acknowledged. 
Presenting the interrupt 3 clocks before the end of 
an instruction allows the interrupt to pass through 
the two clock synchronizer leaving one clock to pre- 
vent the initiation of the next sequential instruction 
and to begin interrupt service. If the interrupt is not 
received in time to prevent the next instruction, it will 
be accepted at the end of next instruction, assuming 
INTR is still held active. The interrupt service micro- 
code will start after two dead clocks. 


The longest latency between when an interrupt re- 
quest is presented on the INTR pin and when the 
interrupt service begins is: longest instruction used 
+ the two clocks for synchronization + one clock 
required to vector into the interrupt service micro- 
code. 
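

The worst-case latency expression above can be written out directly. In this sketch the function name is illustrative and the 40-clock instruction length is a made-up input; the real figure depends on the longest instruction the system actually executes.

```python
SYNCHRONIZER_CLOCKS = 2  # two-clock synchronizer on the INTR line
VECTOR_CLOCKS = 1        # one clock to vector into the service microcode


def worst_case_intr_latency(longest_instruction_clocks: int) -> int:
    """Longest delay, in clocks, from INTR asserted to start of service."""
    return longest_instruction_clocks + SYNCHRONIZER_CLOCKS + VECTOR_CLOCKS


if __name__ == "__main__":
    # Hypothetical longest instruction of 40 clocks.
    print(worst_case_intr_latency(40))
```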


6.4.2 NMI LOGIC 


The NMI pin has a synchronizer like that used on the 
INTR line. Other than the synchronizer, the NMI log- 
ic is different from that of the maskable interrupt. 


NMI is edge triggered as opposed to the level 
triggered INTR signal. The rising edge of the NMI signal 
is used to generate the interrupt request. The NMI 
input need not remain active until the interrupt is 
actually serviced. The NMI pin only needs to remain 
active for a single clock if the required setup and 
hold times are met. NMI will operate properly if it is 
held active for an arbitrary number of clocks. 


The NMI input must be held inactive for at least four 
clocks after it is asserted to reset the edge triggered 
logic. A subsequent NMI may not be generated if the 
NMI is not held inactive for at least two clocks after 
being asserted. 


The NMI input is internally masked whenever the 
NMI routine is entered. The NMI input will remain 
masked until an IRET (return from interrupt) instruc- 
tion is executed. Masking the NMI signal prevents 
recursive NMI calls. If another NMI occurs while the 
NMI is masked off, the pending NMI will be executed 
after the current NMI is done. Only one NMI can be 
pending while NMI is masked. 


6.5 Reset and Initialization 


The 486 microprocessor has a built-in self test 
(BIST) that can be run during reset. The BIST is 
invoked if the AHOLD pin is asserted for 2 clocks 
before and 2 clocks after RESET is deasserted. 
RESET must be active for 15 clocks with or without 
BIST being enabled. Refer to Section 8.0 for 
information on 486 microprocessor testability. 


The 486 microprocessor registers have the values 
shown in Table 6.2 after RESET is performed. The 
EAX register contains information on the success or 
failure of the BIST if the self test is executed. The 
DX register always contains a component identifier 
at the conclusion of RESET. The upper byte of DX 
(DH) will contain 04 and the lower byte (DL) will 
contain a stepping identifier (see Table 6-3). The floating 
point registers are initialized as if the FINIT/FNINIT 
(initialize processor) instruction was executed if the 
BIST was performed. If the BIST is not executed, the 
floating point registers are unchanged. 


Table 6.2. Register Values after Reset 

Register    Initial Value             Initial Value 
            (BIST)                    (No BIST) 

EAX         Zero (Pass)               Undefined 
ECX         Undefined                 Undefined 
EDX         0400 + Revision ID        0400 + Revision ID 
EBX         Undefined                 Undefined 
ESP         Undefined                 Undefined 
EBP         Undefined                 Undefined 
ESI         Undefined                 Undefined 
EDI         Undefined                 Undefined 
EFLAGS      00000002h                 00000002h 
EIP         0FFF0h                    0FFF0h 
ES          0000h                     0000h 
CS          F000h*                    F000h* 
SS          0000h                     0000h 
DS          0000h                     0000h 
FS          0000h                     0000h 
GS          0000h                     0000h 
IDTR        Base = 0, Limit = 3FFh    Base = 0, Limit = 3FFh 
CR0         60000010h                 60000010h 
DR7         00000000h                 00000000h 

CW          037Fh                     Unchanged 
SW          0000h                     Unchanged 
TW          FFFFh                     Unchanged 
FIP         00000000h                 Unchanged 
FEA         00000000h                 Unchanged 
FCS         0000h                     Unchanged 
FDS         0000h                     Unchanged 
FOP         0000h                     Unchanged 
FSTACK      Undefined                 Unchanged 


Table 6-3. i486™ CPU Revision ID 

i486™ CPU Stepping Name    Revision ID 


The 486 microprocessor will start executing 
instructions at location FFFFFFF0H after RESET. When 
the first intersegment jump or call is executed, 
address lines A20-A31 will drop LOW for CS-relative 
memory cycles, and the 486 microprocessor will 
only execute instructions in the lower one Mbyte of 
physical memory. This allows the system designer to 
use a ROM at the top of physical memory to initialize 
the system and take care of RESETs. 


RESET forces the 486 microprocessor to terminate 
all execution and local bus activity. No instruction or 
bus activity will occur as long as RESET is active. 

All entries in the cache are invalidated by RESET. 
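

Start-up code commonly saves DX and EAX before using them, then decodes the saved values. The sketch below shows that decoding as described above; the function names are illustrative, and the sample DX value uses a made-up stepping identifier because the Table 6-3 entries are not reproduced here.

```python
def decode_reset_dx(dx: int) -> tuple[int, int]:
    """Split the DX reset value into (component id in DH, stepping id in DL)."""
    dx &= 0xFFFF
    return dx >> 8, dx & 0xFF


def bist_passed(eax: int) -> bool:
    """Table 6.2 lists EAX as zero when the built-in self test passes."""
    return (eax & 0xFFFFFFFF) == 0


if __name__ == "__main__":
    component, stepping = decode_reset_dx(0x0401)  # hypothetical stepping 01
    print(f"component {component:02X}h, stepping {stepping:02X}h")
    print(bist_passed(0))
```

The EAX check is only meaningful when the BIST was actually run; without it, EAX is undefined after reset.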


6.5.1 PIN STATE DURING RESET 


The 486 microprocessor recognizes and can 
respond to HOLD, AHOLD, and BOFF# requests 
regardless of the state of RESET. Thus, even though 
the processor is in reset, it can still float its bus in 
response to any of these requests. 


While in reset, the 486 microprocessor bus is in the 
state shown in Figure 6.4 if the HOLD, AHOLD and 
BOFF# requests are inactive. Note that the address 
(A31-A2, BE3#-BE0#) and cycle definition 
(M/IO#, D/C#, W/R#) pins are undefined from the 
time reset is asserted up to the start of the first bus 
cycle. All undefined pins (except FERR#) assume 
known values at the beginning of the first bus cycle. 
The first bus cycle is always a code fetch to address 
FFFFFFF0H. FERR# reflects the state of the ES 
(error summary status) bit in the floating point unit 
status word. The ES bit is initialized whenever the 
floating point unit state is initialized. 


[Figure 6.4 (timing diagram 240440-32): RESET is held active for 
at least 15 CLK periods. The diagram shows the FLUSH# and AHOLD 
inputs together with the address, byte enable, M/IO#, D/C#, W/R#, 
BLAST#, PLOCK#, PWT, PCD, PCHK#, LOCK#, D31-D0, DP0-3 and HLDA 
pins; several outputs are marked UNDEFINED until the first bus 
cycle.] 


Figure 6.4. Pin States During RESET 


NOTES: 
1. RESET is an asynchronous input. t20 must be met only to guarantee recognition on a specific clock edge. 
2. High for 2 CLKs before and 2 CLKs after RESET goes inactive, for correct operation of the part. 
3. Low for 2 CLKs before and 2 CLKs after RESET goes inactive, if tri-state output test mode is to be entered. All outputs are tri-stated within 10 CLKs of RESET being deasserted. 
4. High for 2 CLKs before and 2 CLKs after RESET goes inactive, to initiate self-test. 
5. Hold is recognized normally during RESET. 
6. 15 CLKs RESET pulse width for warm resets. Power-up resets require RESET to be asserted for at least 1 ms after VCC and CLK are stable. 


7.0 BUS OPERATION 


7.1 Data Transfer Mechanism 


All data transfers occur as a result of one or more 
bus cycles. Logical data operands of byte, word and 
dword lengths may be transferred without 
restrictions on physical address alignment. Data may be 
accessed at any byte boundary but two or three 
cycles may be required for unaligned data transfers. 
See Section 7.1.3 Dynamic Bus Sizing and 7.1.6 
Operand Alignment. 


The 486 microprocessor address signals are split 
into two components. High-order address bits are 
provided by the address lines, A2-A31. The byte 
enables, BE0#-BE3#, form the low-order address 
and provide linear selects for the four bytes of the 
32-bit address bus. 


The byte enable outputs are asserted when their 
associated data bus bytes are involved with the 
present bus cycle, as listed in Table 7.1. Byte enable 
patterns which have a negated byte enable 
separating two or three asserted byte enables will never 
occur (see Table 7.5). All other byte enable patterns 
are possible. 


Table 7.1. Byte Enables and Associated 
Data and Operand Bytes 

Byte Enable Signal    Associated Data Bus Signals 

BE0#                  D0-D7    (byte 0, least significant) 
BE1#                  D8-D15   (byte 1) 
BE2#                  D16-D23  (byte 2) 
BE3#                  D24-D31  (byte 3, most significant) 


Address bits A0 and A1 of the physical operand's 
base address can be created when necessary. Use 
of the byte enables to create A0 and A1 is shown in 
Table 7.2. The byte enables can also be decoded to 
generate BLE# (byte low enable) and BHE# (byte 
high enable). These signals are needed to address 
16-bit memory systems (see Section 7.1.4 
Interfacing with 8- and 16-bit memories). 


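Table 7.2. Generating A0-A31 from 
BE0#-BE3# and A2-A31 

Physical Base Address      486™ CPU Address Signals 
A31 ... A2    A1    A0     A31 ... A2    BE3#   BE2#   BE1#   BE0# 

A31 ... A2    0     0      A31 ... A2    X      X      X      Low 
A31 ... A2    0     1      A31 ... A2    X      X      Low    High 
A31 ... A2    1     0      A31 ... A2    X      Low    High   High 
A31 ... A2    1     1      A31 ... A2    Low    High   High   High 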

7.1.1 MEMORY AND I/O SPACES 


Bus cycles may access physical memory space or 
I/O space. Peripheral devices in the system may 
either be memory-mapped, or I/O-mapped, or both. 
Physical memory addresses range from 00000000H 
to FFFFFFFFH (4 gigabytes). I/O addresses range 
from 00000000H to 0000FFFFH (64 Kbytes) for 
programmed I/O. See Figure 7.1. 


[Figure 7.1 (240440-33): the physical memory space spans 
00000000H to FFFFFFFFH (4 Gbyte). The I/O space spans 00000000H 
to 0000FFFFH, the 64 Kbyte programmed I/O space; I/O addresses 
above 0000FFFFH are not accessible.] 


Figure 7.1. Physical Memory and I/O Spaces 


7.1.2 MEMORY AND I/O SPACE 
ORGANIZATION 


The 486 microprocessor datapath to memory and 
input/output (I/O) spaces can be 32, 16 or 8 bits 
wide. The byte enable signals, BE0#-BE3#, allow 
byte granularity when addressing any memory or I/O 
structure whether 8, 16 or 32 bits wide. 


The 486 microprocessor includes bus control pins, 
BS16# and BS8#, which allow direct connection to 
16- and 8-bit memories and I/O devices. Cycles to 
32-, 16- and 8-bit devices may occur in any sequence, since 
the BS8# and BS16# signals are sampled during 
each bus cycle. 


32-bit wide memory and I/O spaces are organized 
as arrays of physical 4-byte words. Each memory or 
I/O 4-byte word has four individually addressable 
bytes at consecutive byte addresses (see Figure 
7.2). The lowest addressed byte is associated with 
data signals D0-D7; the highest-addressed byte 
with D24-D31. Physical 4-byte words begin at 
addresses divisible by four. 


[Figure 7.2 (240440-34): 32-Bit Wide Organization, with 4-byte 
words running from bytes 00000003H-00000000H up to 
FFFFFFFFH-FFFFFFFCH and byte columns selected by BE3#, BE2#, 
BE1# and BE0#. 16-Bit Wide Organization, with 2-byte words 
running from bytes 00000001H-00000000H up to 
FFFFFFFFH-FFFFFFFEH and byte columns selected by BHE# and 
BLE#.] 


Figure 7.2. Physical Memory 
and I/O Space Organization 


16-bit memories are organized as arrays of physical 
2-byte words. Physical 2-byte words begin at 
addresses divisible by two. The byte enables, BE0#- 
BE3#, must be decoded to A1, BLE# and BHE# to 
address 16-bit memories (see Section 7.1.4). 


To address 8-bit memories, the two low order 
address bits, A0 and A1, must be decoded from BE0#- 
BE3#. The same logic can be used for 8- and 16-bit 
memories since the decoding logic for BLE# and A0 
is the same (see Section 7.1.4). 


7.1.3 DYNAMIC DATA BUS SIZING 


Dynamic data bus sizing is a feature allowing 
processor connection to 32-, 16- or 8-bit buses for 
memory or I/O. A processor may connect to all three bus 
sizes. Transfers to or from 32-, 16- or 8-bit devices 
are supported by dynamically determining the bus 
width during each bus cycle. Address decoding 
circuitry may assert BS16# for 16-bit devices, or 
BS8# for 8-bit devices during each bus cycle. BS8# 
and BS16# must be negated when addressing 
32-bit devices. An 8-bit bus width is selected if both 
BS16# and BS8# are asserted. 


BS16# and BS8# force the 486 microprocessor to 
run additional bus cycles to complete requests 
larger than 16 or 8 bits. A 32-bit transfer will be 
converted into two 16-bit transfers (or 3 transfers if the data 
is misaligned) when BS16# is asserted. Asserting 
BS8# will convert a 32-bit transfer into four 8-bit 
transfers. 


Extra cycles forced by BS16# or BS8# should be 
viewed as independent bus cycles. BS16# or BS8# 
must be driven active during each of the extra cycles 
unless the addressed device has the ability to 
change the number of bytes it can return between 
cycles. 


The 486 microprocessor will drive the byte enables 
appropriately during extra cycles forced by BS8# 
and BS16#. A2-A31 will not change if accesses are 
to a 32-bit aligned area. Table 7.3 shows the set of 
byte enables that will be generated on the next cycle 
for each of the valid possibilities of the byte enables 
on the current cycle. 
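

The entries of Table 7.3 follow a simple pattern: each BS8# cycle retires the lowest asserted byte, and each BS16# cycle retires the 16-bit half of the dword that contains it. The sketch below encodes that reading of the table; the rule is inferred from the table rather than quoted from the text, the function names are illustrative, and byte enables are pin-level strings in BE3# BE2# BE1# BE0# order with "0" meaning asserted.

```python
def next_byte_enables(byte_enables: str, bus_width: int):
    """Byte enables for the next cycle, or None if no further cycle is needed."""
    levels = list(reversed(byte_enables))  # index 0 is BE0#
    lowest = levels.index("0")             # lowest asserted byte enable
    if bus_width == 8:
        retired = [lowest]
    elif bus_width == 16:
        retired = [lowest & ~1, lowest | 1]  # both bytes of that 16-bit half
    else:
        raise ValueError("bus_width must be 8 (BS8#) or 16 (BS16#)")
    for index in retired:
        levels[index] = "1"                # negate the bytes just transferred
    result = "".join(reversed(levels))
    return None if result == "1111" else result


def bus_cycle_sequence(byte_enables: str, bus_width: int) -> list:
    """All byte enable patterns driven to complete one request."""
    sequence = []
    while byte_enables is not None:
        sequence.append(byte_enables)
        byte_enables = next_byte_enables(byte_enables, bus_width)
    return sequence


if __name__ == "__main__":
    print(bus_cycle_sequence("0000", 8))   # aligned dword, 8-bit device
    print(bus_cycle_sequence("0000", 16))  # aligned dword, 16-bit device
```

An aligned 32-bit request therefore takes four cycles with BS8# and two with BS16#, matching the transfer counts given above.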


The dynamic bus sizing feature of the 486 
microprocessor is significantly different from that of the 
386 microprocessor. Unlike the 386 microprocessor, 
the 486 microprocessor requires that data bytes be 
driven on the addressed data pins. The simplest 
example of this function is a 32-bit aligned, BS16# 
read. When the 486 microprocessor reads the two 
high order bytes, they must be driven on the data 
bus pins D16-D31. The 486 microprocessor 
expects the two low order bytes on D0-D15. The 386 
microprocessor expects both the high and low order 
bytes on D0-D15. The 386 microprocessor always 
reads or writes data on the lower 16 bits of the data 
bus when BS16# is asserted. 


The external system must contain buffers to enable 
the 486 microprocessor to read and write data on 
the appropriate data bus pins. Table 7.4 shows the 
data bus lines where the 486 microprocessor ex- 
pects data to be returned for each valid combination 
of byte enables and bus sizing options. 
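

The data lanes listed in Table 7.4 can be computed from the byte enables and the selected bus width. The sketch below is an illustration with invented names, assuming the asserted byte enables are contiguous (the only patterns that occur); byte enables are pin-level strings in BE3# BE2# BE1# BE0# order with "0" meaning asserted.

```python
def data_pins_read(byte_enables: str, bus_width: int = 32) -> str:
    """Data pins on which the 486 expects read data for one bus cycle."""
    asserted = [index for index, level in enumerate(reversed(byte_enables))
                if level == "0"]           # byte lanes 0..3, lowest first
    if not asserted:
        raise ValueError("no byte enable asserted")
    low = asserted[0]
    if bus_width == 32:
        high = asserted[-1]
    elif bus_width == 16:
        half_top = low | 1                 # top lane of the same 16-bit half
        high = max(index for index in asserted if index <= half_top)
    elif bus_width == 8:
        high = low                         # only the lowest asserted byte
    else:
        raise ValueError("bus_width must be 8, 16 or 32")
    return f"D{8 * high + 7}-D{8 * low}"


if __name__ == "__main__":
    # 32-bit aligned read: first BS16# cycle, then the cycle for the
    # two high order bytes.
    print(data_pins_read("0000", 16))
    print(data_pins_read("0011", 16))
```

The second call shows the difference from the 386 described above: the high order bytes are expected on D31-D16, not on the lower 16 bits of the bus.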


Valid data will only be driven onto data bus pins cor- 
responding to active byte enables during write cy- 
cles. Other pins in the data bus will be driven but 
they will not contain valid data. Unlike the 386 micro- 
processor, the 486 microprocessor will not duplicate 
write data onto parts of the data bus for which the 
corresponding byte enable is negated. 


Table 7.3. Next Byte Enable Values for BSn# Cycles 

Current                     Next with BS8#              Next with BS16# 
BE3#  BE2#  BE1#  BE0#      BE3#  BE2#  BE1#  BE0#      BE3#  BE2#  BE1#  BE0# 

1     1     1     0         n     n     n     n         n     n     n     n 
1     1     0     0         1     1     0     1         n     n     n     n 
1     0     0     0         1     0     0     1         1     0     1     1 
0     0     0     0         0     0     0     1         0     0     1     1 
1     1     0     1         n     n     n     n         n     n     n     n 
1     0     0     1         1     0     1     1         1     0     1     1 
0     0     0     1         0     0     1     1         0     0     1     1 
1     0     1     1         n     n     n     n         n     n     n     n 
0     0     1     1         0     1     1     1         n     n     n     n 
0     1     1     1         n     n     n     n         n     n     n     n 

"n" means that another bus cycle will not be required to satisfy the request. 


Table 7.4. Data Pins Read with Different Bus Sizes 

BE3#  BE2#  BE1#  BE0#      w/o BS8#/BS16#    w BS8#      w BS16# 

1     1     1     0         D7-D0             D7-D0       D7-D0 
1     1     0     0         D15-D0            D7-D0       D15-D0 
1     0     0     0         D23-D0            D7-D0       D15-D0 
0     0     0     0         D31-D0            D7-D0       D15-D0 
1     1     0     1         D15-D8            D15-D8      D15-D8 
1     0     0     1         D23-D8            D15-D8      D15-D8 
0     0     0     1         D31-D8            D15-D8      D15-D8 
1     0     1     1         D23-D16           D23-D16     D23-D16 
0     0     1     1         D31-D16           D23-D16     D31-D16 
0     1     1     1         D31-D24           D31-D24     D31-D24 


7.1.4 INTERFACING WITH 8-, 16- AND 32-BIT 
MEMORIES 


In 32-bit physical memories such as Figure 7.3, each 
4-byte word begins at a byte address that is a 
multiple of four. A2-A31 are used as a 4-byte word 
select. BE0#-BE3# select individual bytes within the 
4-byte word. BS8# and BS16# are negated for all 
bus cycles involving the 32-bit array. 


[Figure 7.3 (240440-36): the 486™ CPU connects to a 32-bit memory 
through the 32-bit data bus (D0-D31) and the address bus 
(BE0#-BE3#, A2-A31); BS8# and BS16# are tied HIGH.] 


Figure 7.3. i486™ Microprocessor 
with 32-Bit Memory 


16- and 8-bit memories require external byte swapping logic
for routing data to the appropriate data lines and logic for
generating BHE#, BLE# and A1. In systems where mixed memory
widths are used, extra address decoding logic is necessary
to assert BS16# or BS8#.


Figure 7.4 shows the 486 microprocessor address bus
interface to 32-, 16- and 8-bit memories. To address 16-bit
memories the byte enables must be decoded to produce A1,
BHE# and BLE# (A0). For 8-bit wide memories the byte enables
must be decoded to produce A0 and A1. The same byte select
logic can be used in 16- and 8-bit systems since BLE# is
exactly the same as A0 (see Table 7.5).


BE0#-BE3# can be decoded as shown in Table 7.5 to generate
A1, BHE# and BLE#. The byte select logic necessary to
generate BHE# and BLE# is shown in Figure 7.5.


[Figure 7.4 block diagram: the address bus (A31-A2,
BE0#-BE3#) supplies A31-A2 to the memories; byte select
logic derives A0 (BLE#) and A1 for the 16-bit and 8-bit
memories.]

240440-37


Figure 7.4. Addressing 16- and 8-Bit Memories
Table 7.5. Generating A1, BHE# and BLE# for Addressing 16-Bit Devices


   i486™ CPU Signals        8, 16-Bit Bus Signals
BE3#  BE2#  BE1#  BE0#      A1   BHE#   BLE# (A0)    Comments

 H*    H*    H*    H*       x     x      x           x-no active bytes
 H     H     H     L        L     H      L
 H     H     L     H        L     L      H
 H     H     L     L        L     L      L
 H     L     H     H        H     H      L
 H*    L*    H*    L*       x     x      x           x-not contiguous bytes
 H     L     L     H        L     L      H
 H     L     L     L        L     L      L
 L     H     H     H        H     L      H
 L*    H*    H*    L*       x     x      x           x-not contiguous bytes
 L*    H*    L*    H*       x     x      x           x-not contiguous bytes
 L*    H*    L*    L*       x     x      x           x-not contiguous bytes
 L     L     H     H        H     L      L
 L*    L*    H*    L*       x     x      x           x-not contiguous bytes
 L     L     L     H        L     L      H
 L     L     L     L        L     L      L


BLE# asserted when D0-D7 of 16-bit bus is active.
BHE# asserted when D8-D15 of 16-bit bus is active.
A1 low for all even words; A1 high for all odd words.


Key:
x = don't care
H = high voltage level
L = low voltage level
* = a non-occurring pattern of Byte Enables; either none are asserted,
    or the pattern has Byte Enables asserted for non-contiguous bytes


240440-38


240440-40 


Figure 7.5. Logic to Generate A1, BHE# and BLE# for 16-Bit Busses


Combinations of BE0#-BE3# which never occur are those in
which two or three asserted byte enables are separated by
one or more negated byte enables. These combinations are
"don't care" conditions in the decoder. A decoder can use
the non-occurring BE0#-BE3# combinations to its best
advantage.
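
The decode rules stated with Table 7.5 (BLE# follows the low byte lane of the selected 16-bit word, BHE# the high lane, and A1 selects the odd word when neither low byte is enabled) can be sketched in software as follows. This is only an illustrative model, not the gate-level logic of Figure 7.5; the function name `decode_16bit` and the choice to return `None` for non-occurring patterns are assumptions made for the example (real hardware may treat those patterns as don't-care).

```python
def decode_16bit(be3, be2, be1, be0):
    """Model of the byte-enable decode for a 16-bit bus.

    Inputs are the active-low levels of BE3#..BE0# (0 = asserted).
    Returns (A1, BHE#, BLE#) as levels, or None for a pattern the
    processor never generates (hypothetical convention).
    """
    levels_by_byte = (be0, be1, be2, be3)
    asserted = [n for n, level in enumerate(levels_by_byte) if level == 0]

    # Non-occurring patterns: no byte asserted, or a gap between bytes.
    if not asserted or asserted != list(range(asserted[0], asserted[-1] + 1)):
        return None

    # A1 is high only when the even (low) word has no active byte.
    a1 = 1 if (be0 == 1 and be1 == 1) else 0

    # Within the selected word, the low lane drives BLE# (A0) and the
    # high lane drives BHE#.
    if a1:
        ble, bhe = be2, be3
    else:
        ble, bhe = be0, be1
    return a1, bhe, ble


if __name__ == "__main__":
    for pattern in [(1, 1, 1, 0), (1, 0, 0, 1), (0, 1, 1, 1), (1, 0, 1, 0)]:
        print(pattern, "->", decode_16bit(*pattern))
```

For a request that spans both words (for example BE3#-BE0# = H L L H), the result describes only the first 16-bit cycle; the processor issues a further cycle with new byte enables for the remaining byte.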


Figure 7.6 shows a 486 microprocessor data bus interface to
16- and 8-bit wide memories. External byte swapping logic is
needed on the data lines so that data is supplied to, and
received from, the 486 microprocessor on the correct data
pins (see Table 7.4).


240440-74 


Figure 7.6. Data Bus Interface to 16- and 8-bit Memories 


7.1.5 DYNAMIC BUS SIZING DURING CACHE LINE FILLS


BS8# and BS16# can be driven during cache line fills. The
486 microprocessor will generate enough 8- or 16-bit cycles
to fill the cache line. This can be up to sixteen 8-bit
cycles.


The external system should assume that all byte enables are
active for the first cycle of a cache line fill. The 486
microprocessor will generate proper byte enables for
subsequent cycles in the line fill. Table 7.6 shows the
appropriate A0 (BLE#), A1 and BHE# for the various
combinations of the 486 microprocessor byte enables on both
the first and subsequent cycles of the cache line fill. The
"*" marks all combinations of byte enables that will be
generated by the 486 microprocessor during a cache line
fill.


7.1.6 OPERAND ALIGNMENT 


Physical 4-byte words begin at addresses that are 
multiples of four. It is possible to transfer a logical 
operand that spans more than one physical 4-byte 
word of memory or I/O at the expense of extra cy- 
cles. Examples are 4-byte operands beginning at ad- 
dresses that are not evenly divisible by 4, or 2-byte 
words split between two physical 4-byte words. 
These are referred to as unaligned transfers. 


Operand alignment and data bus size dictate when 
multiple bus cycles are required. Table 7.7 describes 
the transfer cycles generated for all combinations of 
logical operand lengths, alignment, and data bus siz- 
ing. When multiple cycles are required to transfer a 
multi-byte logical operand, the highest-order bytes 
are transferred first. For example, when the proces- 
sor does a 4-byte unaligned read beginning at loca- 
tion x11 in the 4-byte aligned space, the three high 
order bytes are read in the first bus cycle. The low 
byte is read in a subsequent bus cycle. 
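
The splitting of an unaligned operand over a 32-bit bus, as described above, can be illustrated with a short sketch. It assumes operands of at most four bytes (so at most two bus cycles) and a full 32-bit bus with no BS8# or BS16#; the function name `split_operand` and the tuple layout are hypothetical.

```python
def split_operand(address, length):
    """List the 32-bit bus cycles for an operand, in the order run.

    Each cycle is (dword_address, byte_enables), where byte_enables is
    a (BE3#, BE2#, BE1#, BE0#) tuple of active-low levels (0 = asserted).
    Assumes 1 <= length <= 4 and a 32-bit bus.
    """
    first_dword = address & ~0x3
    last_dword = (address + length - 1) & ~0x3

    cycles = []
    for dword in range(first_dword, last_dword + 4, 4):
        # Bytes of this physical 4-byte word that belong to the operand.
        used = [b for b in range(4) if address <= dword + b < address + length]
        enables = tuple(0 if b in used else 1 for b in (3, 2, 1, 0))
        cycles.append((dword, enables))

    # The highest-order bytes are transferred first.
    cycles.reverse()
    return cycles


if __name__ == "__main__":
    # 4-byte read starting at byte offset 3 of the word at 0x100.
    for dword, enables in split_operand(0x103, 4):
        print(hex(dword), enables)
```

For the 4-byte read at offset 3, the sketch lists the cycle carrying the three high-order bytes (in the next 4-byte word) before the cycle carrying the low byte, which matches the example in the text.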


Table 7.6. Generating A0, A1 and BHE# from the i486™ Microprocessor Byte Enables


                             First Cache Fill Cycle    Any Other Cycle
BE3#  BE2#  BE1#  BE0#       A0    A1    BHE#          A0    A1    BHE#

 1     1     1     0         0     0     0             0     0     1
 1     1     0     0         0     0     0             0     0     0
 1     0     0     0         0     0     0             0     0     0
*0     0     0     0         0     0     0             0     0     0
 1     1     0     1         0     0     0             1     0     0
 1     0     0     1         0     0     0             1     0     0
*0     0     0     1         0     0     0             1     0     0
 1     0     1     1         0     0     0             0     1     1
*0     0     1     1         0     0     0             0     1     0
*0     1     1     1         0     0     0             1     1     0
Table 7.7. Transfer Bus Cycles for Bytes, Words and Dwords


Byte-Length of Logical Operand

Transfer Cycles over 16-Bit Data Bus (BS16# Asserted)

Transfer Cycles over 8-Bit Data Bus (BS8# Asserted)


KEY:

b = byte transfer
w = 2-byte transfer
3 = 3-byte transfer
d = 4-byte transfer
h = high-order portion
l = low-order portion
m = mid-order portion


The function of unaligned transfers with dynamic bus sizing
is not obvious. When the external system asserts BS16# or
BS8#, forcing extra cycles, low-order bytes or words are
transferred first (opposite to the example above). When the
486 microprocessor requests a 4-byte read and the external
system asserts BS16#, the lower 2 bytes are read first,
followed by the upper 2 bytes.


In the unaligned transfer described above, the processor
requested three bytes on the first cycle. If the external
system asserted BS16# during this 3-byte transfer, the lower
word is transferred first, followed by the upper byte. In
the final cycle the lower byte of the 4-byte operand is
transferred as in the 32-bit example above.


7.2 Bus Functional Description 


The 486 microprocessor supports a wide variety of bus
transfers to meet the needs of high performance systems. Bus
transfers can be single cycle or multiple cycle, burst or
non-burst, cacheable or non-cacheable, 8-, 16- or 32-bit,
and pseudo-locked. To support multiprocessing systems there
are cache invalidation cycles and locked cycles.


This section begins with basic non-cacheable non-burst
single cycle transfers. It moves on to multiple cycle
transfers and introduces the burst mode. Cacheability is
introduced in Section 7.2.3. The remaining sections describe
locked, pseudo-locked, invalidate, bus hold and interrupt
cycles.


Bus cycles and data cycles are discussed in this section. A
bus cycle is at least two clocks long; it begins with ADS#
active in the first clock and ends with ready active in the
last clock. Data is transferred to or from the 486
microprocessor during a data cycle. A bus cycle contains one
or more data cycles.


Refer to Section 7.2.13 for a description of the bus 
states shown in the timing diagrams. 


7.2.1 NON-CACHEABLE NON-BURST SINGLE CYCLE


7.2.1.1 No Wait States


The fastest non-burst bus cycle that the 486 microprocessor
supports is two clocks long. These cycles are called 2-2
cycles because reads and writes take two clocks each. The
first 2 refers to reads and the second to writes. For
example, if a wait state needs to be added to a write, the
cycle would be called 2-3.


Basic two clock read and write cycles are shown in Figure
7.7. The 486 microprocessor initiates a cycle by asserting
the address status signal (ADS#) at the rising edge of the
first clock. The ADS# output indicates that a valid bus
cycle definition and address are available on the cycle
definition lines and address bus.


The non-burst ready input (RDY#) is returned by the external
system in the second clock. RDY# indicates that the external
system has presented valid data on the data pins in response
to a read or the external system has accepted data in
response to a write.


The 486 microprocessor samples RDY# at the end of the second
clock. The cycle is complete if RDY# is active (LOW) when
sampled. Note that RDY# is ignored at the end of the first
clock of the bus cycle.


The burst last signal (BLAST#) is asserted (LOW) by the 486
microprocessor during the second clock of the first cycle in
all bus transfers illustrated in Figure 7.7. This indicates
that each transfer is complete after a single cycle. The 486
microprocessor asserts BLAST# in the last cycle of a bus
transfer.


The timing of the parity check output (PCHK#) is shown in
Figure 7.7. The 486 microprocessor drives the PCHK# output
one clock after ready terminates a read cycle. PCHK#
indicates the parity status for the data sampled at the end
of the previous clock. The PCHK# signal can be used by the
external system. The 486 microprocessor does nothing in
response to the PCHK# output.


7.2.1.2 Inserting Wait States 


The external system can insert wait states into the basic
2-2 cycle by driving RDY# inactive at the end of the second
clock. RDY# must be driven inactive to insert a wait state.
Figure 7.8 illustrates a simple non-burst, non-cacheable
cycle with one wait state added. Any number of wait states
can be added to a 486 microprocessor bus cycle by
maintaining RDY# inactive.


The burst ready input (BRDY#) must be driven inactive on all
clock edges where RDY# is driven inactive for proper
operation of these simple non-burst cycles.
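
The RDY# sampling rule and the "2-2" naming convention can be modeled in a few lines. This is a behavioral sketch only; the function names and the list-of-levels input format are assumptions made for the illustration, and BRDY# is assumed to stay inactive throughout.

```python
def clocks_in_cycle(rdy_levels):
    """Return the length in clocks of a non-burst bus cycle.

    rdy_levels[i] is the RDY# level (0 = active) at the end of clock
    i + 1. RDY# is ignored at the end of the first clock; the cycle
    completes at the first later clock where RDY# is sampled active.
    """
    for clock, level in enumerate(rdy_levels, start=1):
        if clock == 1:
            continue  # ADS# clock: RDY# is not sampled here
        if level == 0:
            return clock
    raise ValueError("RDY# was never sampled active")


def cycle_name(read_wait_states, write_wait_states):
    """Name a cycle type as '<read clocks>-<write clocks>'."""
    return f"{2 + read_wait_states}-{2 + write_wait_states}"


if __name__ == "__main__":
    print(clocks_in_cycle([1, 0]))      # no wait states
    print(clocks_in_cycle([1, 1, 0]))   # one wait state
    print(cycle_name(0, 1))             # wait state on writes only
```

Each clock in which RDY# is held inactive after the second clock adds one wait state, so the cycle length is two clocks plus the number of wait states.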
7.2.2 MULTIPLE AND BURST CYCLE BUS TRANSFERS


Multiple cycle bus transfers can be caused by inter- 
nal requests from the 486 microprocessor or by the 
external memory system. An internal request for a 
64-bit floating point load or a 128-bit pre-fetch must 
take more than one cycle. Internal requests for un- 
aligned data may also require multiple bus cycles. A 
cache line fill requires multiple cycles to complete. 
The external system can cause a multiple cycle 
transfer when it can only supply 8 or 16 bits per 
cycle. 


Only multiple cycle transfers caused by internal re- 
quests are considered in this section. Cacheable cy- 
cles and 8- and 16-bit transfers are covered in Sec- 
tions 7.2.3 and 7.2.5. 


7.2.2.1 Burst Cycles 


The 486 microprocessor can accept burst cycles for any bus
requests that require more than a single data cycle. During
burst cycles, a new data item is strobed into the 486
microprocessor every clock rather than every other clock as
in non-burst cycles. The fastest burst cycle requires 2
clocks for the first data item with subsequent data items
returned every clock.


The 486 microprocessor is capable of bursting a maximum of
32 bits during a write. Burst writes can only occur if BS8#
or BS16# is asserted. For example, the 486 microprocessor
can burst write four 8-bit operands or two 16-bit operands
in a single burst cycle. But the 486 microprocessor cannot
burst multiple 32-bit writes in a single burst cycle.


Burst cycles begin with the 486 microprocessor driving out
an address and asserting ADS# in the same manner as
non-burst cycles. The 486 microprocessor indicates that it
is willing to perform a burst cycle by holding the burst
last signal (BLAST#) inactive in the second clock of the
cycle. The external system indicates its willingness to do a
burst cycle by returning the burst ready signal (BRDY#)
active.


The addresses of the data items in a burst cycle will all
fall within the same 16-byte aligned area (corresponding to
an internal 486 microprocessor cache line). A 16-byte
aligned area begins at location XXXXXXX0 and ends at
location XXXXXXXF. During a burst cycle, only BE0#-BE3#, A2,
and A3 may change. A4-A31, M/IO#, D/C#, and W/R# will remain
stable throughout a burst. Given the first address in a
burst, external hardware can easily calculate the address of
subsequent transfers in advance. An external memory system
can be designed to quickly fill the 486 microprocessor
internal cache lines.
Figure 7.7. Basic 2-2 Bus Cycle


Figure 7.8. Basic 3-3 Bus Cycle


Burst cycles are not limited to cache line fills. Any
multiple cycle read request by the 486 microprocessor can be
converted into a burst cycle. The 486 microprocessor will
only burst the number of bytes needed to complete a
transfer. For example, eight bytes will be bursted in for a
64-bit floating point non-cacheable read.


The external system converts a multiple cycle request into a
burst cycle by returning BRDY# active rather than RDY#
(non-burst ready) in the first cycle of a transfer. For
cycles that cannot be bursted, such as interrupt acknowledge
and halt, BRDY# has the same effect as RDY#. BRDY# is
ignored if both BRDY# and RDY# are returned in the same
clock. Memory areas and peripheral devices that cannot
perform bursting must terminate cycles with RDY#.


7.2.2.2 Terminating Multiple and Burst Cycle Transfers


The 486 microprocessor drives BLAST# inactive for all but
the last cycle in a multiple cycle transfer. BLAST# is
driven inactive in the first cycle to inform the external
system that the transfer could take additional cycles.
BLAST# is driven active in the last cycle of the transfer,
indicating that the next time BRDY# or RDY# is returned the
transfer is complete.


BLAST# is not valid in the first clock of a bus cycle. It
should be sampled only in the second and subsequent clocks
when RDY# or BRDY# is returned.


The number of cycles in a transfer is a function of several
factors including the number of bytes the microprocessor
needs to complete an internal request (1, 2, 4, 8, or 16),
the state of the bus size inputs (BS8# and BS16#), the state
of the cache enable input (KEN#) and alignment of the data
to be transferred.


When the 486 microprocessor initiates a request it knows how
many bytes will be transferred and if the data is aligned.
The external system must tell the microprocessor whether the
data is cacheable (if the transfer is a read) and the width
of the bus by returning the state of the KEN#, BS8# and
BS16# inputs one clock before RDY# or BRDY# is returned. The
486 microprocessor determines how many cycles a transfer
will take based on its internal information and inputs from
the external system.
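
As a rough illustration of how these factors combine, the sketch below counts data cycles for a request. It is a simplification under stated assumptions: the request is aligned to the bus width, the other cacheability conditions of Section 7.2.3 are met whenever KEN# is asserted, and a cache line is 16 bytes. The function names and parameters are hypothetical.

```python
def bus_width_bytes(bs8_asserted, bs16_asserted):
    """Bus width implied by the bus size inputs (BS8# wins if both)."""
    if bs8_asserted:
        return 1
    if bs16_asserted:
        return 2
    return 4


def data_cycles(request_bytes, is_read, ken_asserted,
                bs8_asserted=False, bs16_asserted=False):
    """Estimate the data cycles needed for an aligned request.

    request_bytes is the size of the internal request (1, 2, 4, 8 or
    16). A read with KEN# asserted is assumed to become a 16-byte
    cache line fill; KEN# is ignored for writes.
    """
    total = 16 if (is_read and ken_asserted) else request_bytes
    width = bus_width_bytes(bs8_asserted, bs16_asserted)
    return -(-total // width)  # ceiling division


if __name__ == "__main__":
    print(data_cycles(8, True, False))                     # 64-bit read
    print(data_cycles(4, True, True))                      # line fill
    print(data_cycles(4, True, True, bs8_asserted=True))   # 8-bit fill
```

Unaligned requests can need more cycles than this estimate, since bytes that straddle a bus-width boundary are transferred separately.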


BLAST# is not valid in the first clock of a bus cycle
because the 486 microprocessor cannot determine the number
of cycles a transfer will take until the external system
returns KEN#, BS8# and BS16#. BLAST# should only be sampled
in the second and subsequent clocks of a cycle when the
external system returns RDY# or BRDY#.


The system may terminate a burst cycle by returning RDY#
instead of BRDY#. BLAST# will remain deasserted until the
last transfer. However, any transfers required to complete a
cache line fill will follow the burst order; e.g., if the
burst order was 4, 0, C, 8 and RDY# was returned after 0,
the next transfers will be from C and 8.


7.2.2.3 Non-Cacheable, Non-Burst, Multiple Cycle Transfers


Figure 7.9 illustrates a 2 cycle non-burst, non-cacheable
multiple cycle read. This transfer is simply a sequence of
two single cycle transfers. The 486 microprocessor indicates
to the external system that this is a multiple cycle
transfer by driving BLAST# inactive during the second clock
of the first cycle. The external system returns RDY# active,
indicating that it will not burst the data. The external
system also indicates that the data is not cacheable by
returning KEN# inactive one clock before it returns RDY#
active. When the 486 microprocessor samples RDY# active it
ignores BRDY#.


Each cycle in the transfer begins when ADS# is driven active
and the cycle is complete when the external system returns
RDY# active.


The 486 microprocessor indicates the last cycle of the
transfer by driving BLAST# active. The next RDY# returned by
the external system terminates the transfer.


7.2.2.4 Non-Cacheable Burst Cycles 


The external system converts a multiple cycle request into a
burst cycle by returning BRDY# active rather than RDY# in
the first cycle of the transfer. This is illustrated in
Figure 7.10.


There are several features to note in the burst read. ADS#
is only driven active during the first cycle of the
transfer. RDY# must be driven inactive when BRDY# is
returned active.


BLAST# behaves exactly as it does in the non-burst read.
BLAST# is driven inactive in the second clock of the first
cycle of the transfer, indicating more cycles to follow. In
the last cycle, BLAST# is driven active, telling the
external memory system to end the burst after returning the
next BRDY#.


Figure 7.9. Non-Cacheable, Non-Burst, Multiple Cycle Transfers


Figure 7.10. Non-Cacheable Burst Cycle


7.2.3 CACHEABLE CYCLES


Any memory read can become a cache fill operation. The
external memory system can allow a read request to fill a
cache line by returning KEN# active one clock before RDY# or
BRDY# during the first cycle of the transfer on the external
bus. Once KEN# is asserted and the remaining three
requirements described below are met, the 486 microprocessor
will fetch an entire cache line regardless of the state of
KEN#. KEN# must be returned active in the last cycle of the
transfer for the data to be written into the internal cache.
The 486 microprocessor will only convert memory reads or
prefetches into a cache fill.


KEN# is ignored during write or I/O cycles. Memory writes
will only be stored in the on-chip cache if there is a cache
hit. I/O space is never cached in the internal cache.


To transform a read or a prefetch into a cache line 
fill the following conditions must be met: 


1. The KEN# pin must be asserted one clock pri- 
or to RDY# or BRDY# being returned for the 
first data cycle. 


2. The cycle must be of the type that can be inter- 
nally cached. (Locked reads, I/O reads, and in- 
terrupt acknowledge cycles are never cached). 


3. The page table entry must have the page cache 
disable bit (PCD) set to 0. To cache a page 
table entry, the page directory must have 
PCD=0. To cache reads or prefetches when 
paging is disabled, or to cache the page direc- 
tory entry, control register 3 (CR3) must have 
PCD=0. 


4. The cache disable (CD) bit in control register 0 
(CRO) must be clear. 


External hardware can determine when the 486 microprocessor
has transformed a read or prefetch into a cache fill by
examining the KEN#, M/IO#, D/C#, W/R#, LOCK#, and PCD pins.
These pins convey to the system the outcome of conditions
1-3 in the above list. In addition, the 486 drives PCD high
whenever the CD bit in CR0 is set, so that external hardware
can evaluate condition 4.
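
The four conditions can be collected into a single predicate, as in the sketch below. The function and parameter names are hypothetical; `pcd` stands for whichever PCD bit applies to the access (page table entry, page directory entry or CR3) and `cd` for the CD bit in CR0.

```python
def becomes_cache_fill(ken_asserted, is_memory, is_read, is_locked,
                       pcd, cd):
    """Return True if a bus cycle is turned into a cache line fill.

    ken_asserted: KEN# sampled active one clock before the first
                  RDY# or BRDY# (condition 1).
    is_memory, is_read, is_locked: describe the cycle type; only
                  unlocked memory reads and prefetches are cacheable
                  (condition 2).
    pcd:          applicable page cache disable bit (condition 3).
    cd:           cache disable bit in CR0 (condition 4).
    """
    cacheable_type = is_memory and is_read and not is_locked
    return bool(ken_asserted and cacheable_type and pcd == 0 and cd == 0)


if __name__ == "__main__":
    print(becomes_cache_fill(True, True, True, False, pcd=0, cd=0))
    print(becomes_cache_fill(True, False, True, False, pcd=0, cd=0))
```

Because the processor drives the PCD pin high whenever CD is set, external logic that already checks the PCD pin covers conditions 3 and 4 together.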


Cacheable cycles can be burst or non-burst. 
7.2.3.1 Byte Enables during a Cache Line Fill


For the first cycle in the line fill, the state of the byte
enables should be ignored. In a non-cacheable memory read,
the byte enables indicate the bytes actually required by the
memory or code fetch.


The 486 microprocessor expects to receive valid data on its
entire bus (32 bits) in the first cycle of a cache line
fill. Data should be returned with the assumption that all
the byte enable pins are driven active. However, if BS8# is
asserted only one byte need be returned on data lines D0-D7.
Similarly, if BS16# is asserted two bytes should be returned
on D0-D15.


The 486 microprocessor will generate the addresses and byte
enables for all subsequent cycles in the line fill. The
order in which data is read during a line fill depends on
the address of the first item read. Byte ordering is
discussed in Section 7.2.4.


7.2.3.2 Non-Burst Cacheable Cycles 


Figure 7.11 shows a non-burst cacheable cycle. The cycle
becomes a cache fill when the 486 microprocessor samples
KEN# active at the end of the first clock. The 486
microprocessor drives BLAST# inactive in the second clock in
response to KEN#. BLAST# is driven inactive because a cache
fill requires 3 additional cycles to complete. BLAST#
remains inactive until the last transfer in the cache line
fill. KEN# must be returned active in the last cycle of the
transfer for the data to be written into the internal cache.


Note that this cycle would be a single bus cycle if KEN# was
not sampled active at the end of the first clock. The
subsequent three reads would not have happened since a cache
fill was not requested.


The BLAST# output is invalid in the first clock of a cycle.
BLAST# may be active during the first clock due to earlier
inputs. Ignore BLAST# until the second clock.


During the first cycle of the cache line fill the external
system should treat the byte enables as if they are all
active. In subsequent cycles in the burst, the 486
microprocessor drives the address lines and byte enables
(see Section 7.2.4.2 for Burst and Cache Line Fill Order).


Figure 7.11. Non-Burst, Cacheable Cycles


7.2.3.3 Burst Cacheable Cycles


Figure 7.12 illustrates a burst mode cache fill. As in
Figure 7.11, the transfer becomes a cache line fill when the
external system returns KEN# active at the end of the first
clock in the cycle.


The external system informs the 486 microprocessor that it
will burst the line in by driving BRDY# active at the end of
the first cycle in the transfer.


Note that during a burst cycle ADS# is only driven with the
first address.


Figure 7.12. Burst Cacheable Cycle


7.2.3.4 Effect of Changing KEN# during a 
Cache Line Fill 


KEN# can change multiple times as long as it arrives at its
final value in the clock before RDY# or BRDY# is returned.
This is illustrated in Figure 7.13. Note that the timing of
BLAST# follows that of KEN# by one clock. The i486 samples
KEN# every clock and uses the value returned in the clock
before ready to determine if a bus cycle would be a cache
line fill. Similarly, it uses the value of KEN# sampled in
the last cycle, before ready is returned, to decide whether
to load the line just retrieved from memory into the cache.
Because KEN# is sampled every clock, it must satisfy setup
and hold times in every clock.
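
The "clock before ready" rule can be shown with a small sketch. The list format and the function name `ken_at_decision` are assumptions made for the example; clocks are numbered from 1 within the bus cycle.

```python
def ken_at_decision(ken_by_clock, ready_clock):
    """Return the KEN# level that the processor acts on.

    ken_by_clock[i] is the KEN# level (0 = active) sampled in clock
    i + 1 of the bus cycle. ready_clock is the clock in which RDY# or
    BRDY# is returned (2 or later). The value that matters is the one
    sampled in the clock before ready; earlier values are ignored.
    """
    if ready_clock < 2:
        raise ValueError("ready cannot be returned before clock 2")
    return ken_by_clock[ready_clock - 2]


if __name__ == "__main__":
    # KEN# toggles, but is active in clock 3; ready arrives in clock 4.
    print(ken_at_decision([1, 1, 0, 1], ready_clock=4))
```

In the example KEN# changes several times, yet only the level in clock 3 (one clock before ready) determines whether the cycle is treated as cacheable.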


KEN # can also change multiple times before a burst 
cycle as long as it arrives at its final value one clock 
before ready is returned active. 
Figure 7.13. Effect of Changing KEN#


7.2.4 BURST MODE DETAILS


7.2.4.1 Adding Wait States to Burst Cycles


Burst cycles need not return data on every clock. The 486
microprocessor will only strobe data into the chip when
either RDY# or BRDY# is active. Driving BRDY# and RDY#
inactive adds a wait state to the transfer. A burst cycle
where two clocks are required for every burst item is shown
in Figure 7.14.


Figure 7.14. Slow Burst Cycle


7.2.4.2 Burst and Cache Line Fill Order


The burst order used by the 486 microprocessor is shown in
Table 7.7. This burst order is followed by any burst cycle
(cache or not), cache line fill (burst or not) or code
prefetch.


The microprocessor presents each request for data in an
order determined by the first address in the transfer. For
example, if the first address was 104 the next three
addresses in the burst will be 100, 10C and 108. An example
of burst address sequencing is shown in Figure 7.15.


Table 7.7. Burst Order


First    Second   Third    Fourth
Addr.    Addr.    Addr.    Addr.

  0        4        8        C
  4        0        C        8
  8        C        0        4
  C        8        4        0


The sequences shown in Table 7.7 accommodate systems with
64-bit busses as well as systems with 32-bit data busses.
The sequence applies to all bursts, regardless of whether
the purpose of the burst is to fill a cache line, do a
64-bit read, or do a pre-fetch. If either BS8# or BS16# is
returned active, the 486 microprocessor completes the
transfer of the current 32-bit word before progressing to
the next 32-bit word. For example, a BS16# burst to
address 4 has the following order: 4-6-0-2-C-E-8-A.
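
The four rows of the burst order table are consistent with XORing the first address with 0, 4, 8 and C in turn, which gives a compact way to generate the sequence. The sketch below relies on that observation (it is not stated in the text) and assumes the first address is aligned to a 32-bit word; the function name `burst_addresses` and the `bus_bytes` parameter are hypothetical.

```python
def burst_addresses(first_address, bus_bytes=4):
    """List the addresses of a 16-byte line fill in burst order.

    bus_bytes is 4 for a 32-bit bus, 2 when BS16# is returned active
    and 1 when BS8# is returned active. Each 32-bit word is completed,
    low address first, before the burst moves to the next word.
    """
    line_base = first_address & ~0xF
    first_word = first_address & 0xC

    order = []
    for step in (0x0, 0x4, 0x8, 0xC):
        word = line_base | (first_word ^ step)
        for offset in range(0, 4, bus_bytes):
            order.append(word + offset)
    return order


if __name__ == "__main__":
    print([hex(a) for a in burst_addresses(0x104)])
    print([hex(a) for a in burst_addresses(0x4, bus_bytes=2)])
```

The first call reproduces the 104, 100, 10C, 108 example, and the second reproduces the BS16# order 4-6-0-2-C-E-8-A given in the text.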


Figure 7.15. Burst Cycle Showing Order of Addresses


7.2.4.3 Interrupted Burst Cycles


Some memory systems may not be able to respond with burst
cycles in the order defined in Table 7.7. To support these
systems the 486 microprocessor allows a burst cycle to be
interrupted at any time. The 486 microprocessor will
automatically generate another normal bus cycle after being
interrupted to complete the data transfer. This is called an
interrupted burst cycle. The external system can respond to
an interrupted burst cycle with another burst cycle.


The external system can interrupt a burst cycle by returning
RDY# instead of BRDY#. RDY# can be returned after any number
of data cycles terminated with BRDY#.


An example of an interrupted burst cycle is shown in Figure
7.16. The 486 microprocessor immediately drives ADS# active
to initiate a new bus cycle after RDY# is returned active.
BLAST# is driven inactive one clock after ADS# begins the
second bus cycle, indicating that the transfer is not
complete.
Figure 7.16. Interrupted Burst Cycle


KEN# need not be returned active in the first data cycle of
the second part of the transfer in Figure 7.16. The cycle
had been converted to a cache fill in the first part of the
transfer and the 486 microprocessor expects the cache fill
to be completed. Note that the first half and second half of
the transfer in Figure 7.16 are each two cycle burst
transfers.


The order in which the 486 microprocessor requests operands
during an interrupted burst transfer is determined by Table
7.7. Mixing RDY# and BRDY# does not change the order in
which operand addresses are requested by the 486
microprocessor.


An example of the order in which the 486 microprocessor
requests operands during a cycle in which the external
system mixes RDY# and BRDY# is shown in Figure 7.17. The 486
microprocessor initially requests a transfer beginning at
location 104. The transfer becomes a cache line fill when
the external system returns KEN# active. The first cycle of
the cache fill transfers the contents of location 104 and is
terminated with RDY#. The 486 microprocessor drives out a
new request (by asserting ADS#) to address 100. If the
external system terminates the second cycle with BRDY#, the
486 microprocessor will next request/expect address 10C. The
correct order is determined by the first cycle in the
transfer, which may not be the first cycle in the burst if
the system mixes RDY# with BRDY#.
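
The interaction between the fixed address order and the mix of RDY# and BRDY# can be traced with a short sketch. It assumes a 32-bit bus and a full four-cycle cache line fill; the function name `trace_line_fill` and the string labels "RDY" and "BRDY" are hypothetical conventions for the example.

```python
def trace_line_fill(first_address, ready_signals):
    """Trace a line fill when the system mixes RDY# and BRDY#.

    ready_signals holds "RDY" or "BRDY" for each of the four data
    cycles. Returns one (address, ads_asserted, ready) tuple per data
    cycle. The address order depends only on the first address of the
    transfer; a data cycle starts a new bus cycle (ADS# asserted) when
    it is the first one or when the previous one ended with RDY#.
    """
    line_base = first_address & ~0xF
    first_word = first_address & 0xC

    trace = []
    ads_asserted = True
    for step, ready in zip((0x0, 0x4, 0x8, 0xC), ready_signals):
        address = line_base | (first_word ^ step)
        trace.append((address, ads_asserted, ready))
        ads_asserted = (ready == "RDY")
    return trace


if __name__ == "__main__":
    for address, ads, ready in trace_line_fill(
            0x104, ["RDY", "BRDY", "BRDY", "BRDY"]):
        print(hex(address), "ADS#" if ads else "    ", ready)
```

With the first cycle terminated by RDY#, the trace shows a new ADS# for address 100 and then 10C and 108 following within that burst, as in the example above.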
Figure 7.17. Interrupted Burst Cycle with Unobvious Order of Addresses


7.2.5 8- AND 16-BIT CYCLES


The 486 microprocessor supports both 16- and 8-bit external
busses through the BS16# and BS8# inputs. BS16# and BS8#
allow the external system to specify, on a cycle by cycle
basis, whether the addressed component can supply 8, 16 or
32 bits. BS16# and BS8# can be used in burst cycles as well
as non-burst cycles. If both BS16# and BS8# are returned
active for any bus cycle, the 486 microprocessor will
respond as if only BS8# were active.


The timing of BS16# and BS8# is the same as that of KEN#.
BS16# and BS8# must be driven active before the first RDY#
or BRDY# is driven active.


Driving BS16# and BS8# active can force the 486
microprocessor to run additional cycles to complete what
would have been only a single 32-bit cycle. BS8# and BS16#
may change the state of BLAST# when they force subsequent
cycles from the transfer.


Figure 7.18 shows an example in which BS8# forces the 486
microprocessor to run two extra cycles to complete a
transfer. The 486 microprocessor issues a request for 24
bits of information. The external system drives BS8# active,
indicating that only eight bits of data can be supplied per
cycle. The 486 microprocessor issues two extra cycles to
complete the transfer.
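
The way a bus size input breaks one request into several cycles can be sketched as follows. The model works on byte numbers 0-3 within a single 32-bit word and transfers the lowest bytes first, one bus-width window per cycle; the function name `sized_cycles` is hypothetical, and the example assumes the 24-bit request covers bytes 0-2.

```python
def sized_cycles(active_bytes, bus_bytes):
    """Split one 32-bit request into cycles for a narrower bus.

    active_bytes: byte numbers (0-3) requested within the 32-bit word.
    bus_bytes:    1 if BS8# is asserted, 2 if BS16# is asserted, 4 for
                  a full 32-bit bus.
    Returns a list of cycles, each a list of the bytes it transfers.
    Low-order bytes go first, and each cycle stays inside one
    bus-width-aligned window.
    """
    remaining = sorted(active_bytes)
    cycles = []
    while remaining:
        window_start = remaining[0] - (remaining[0] % bus_bytes)
        window_end = window_start + bus_bytes
        taken = [b for b in remaining if window_start <= b < window_end]
        cycles.append(taken)
        remaining = [b for b in remaining if b not in taken]
    return cycles


if __name__ == "__main__":
    print(sized_cycles({0, 1, 2}, 1))     # 24 bits with BS8#
    print(sized_cycles({0, 1, 2, 3}, 2))  # 32 bits with BS16#
    print(sized_cycles({0, 1, 2}, 2))     # 24 bits with BS16#
```

The 24-bit request with BS8# takes three cycles, that is, two extra cycles; with BS16# the lower word is transferred first and the upper byte follows.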
Figure 7.18. 8-Bit Bus Size Cycle


Extra cycles forced by BS16# and BS8# should be viewed as
independent bus cycles. BS16# and BS8# should be driven
active for each additional cycle unless the addressed device
has the ability to change the number of bytes it can return
between cycles. The 486 microprocessor will drive BLAST#
inactive until the last cycle before the transfer is
complete.


Refer to Section 7.1.3 for the sequencing of addresses while
BS8# or BS16# are active.


BS8# and BS16# operate during burst cycles in exactly the
same manner as non-burst cycles. For example, a single
non-cacheable read could be transferred by the 486
microprocessor as four 8-bit burst data cycles. Similarly, a
single 32-bit write could be written as four 8-bit burst
data cycles. An example of a burst write is shown in Figure
7.19. Burst writes can only occur if BS8# or BS16# is
asserted.


Figure 7.19. Burst Write as a Result of BS8# or BS16#


7.2.6 LOCKED CYCLES


Locked cycles are generated in software for any instruction
that performs a read-modify-write operation. During a
read-modify-write operation the processor can read and
modify a variable in external memory and be assured that the
variable is not accessed between the read and write.


Locked cycles are automatically generated during certain bus
transfers. The xchg (exchange) instruction generates a
locked cycle when one of its operands is memory based.
Locked cycles are generated when a segment or page table
entry is updated and during interrupt acknowledge cycles.
Locked cycles are also generated when the LOCK instruction
prefix is used with selected instructions.
Locked cycles are implemented in hardware with the LOCK#
pin. When LOCK# is active, the processor is performing a
read-modify-write operation and the external bus should not
be relinquished until the cycle is complete. Multiple reads
or writes can be locked. A locked cycle is shown in Figure
7.20. LOCK# goes active with the address and bus definition
pins at the beginning of the first read cycle and remains
active until RDY# is returned for the last write cycle. For
an unaligned 32-bit read-modify-write operation, LOCK#
remains active for the entire duration of the multiple
cycle. It will go inactive when RDY# is returned for the
last write cycle.


Figure 7.20. Locked Bus Cycle


When LOCK# is active, the 486 microprocessor will recognize
address hold and backoff but will not recognize bus hold. It
is left to the external system to properly arbitrate a
central bus when the 486 microprocessor generates LOCK#.


7.2.7 PSEUDO-LOCKED CYCLES 


Pseudo-locked cycles assure that no other master 
will be given control of the bus during operand trans- 
fers which take more than one bus cycle. Examples 
include 64-bit floating point read and writes, 64-bit 
descriptor loads and cache line fills. 


Pseudo-locked transfers are indicated by the 
PLOCK# pin. The memory operands must be 
aligned for correct operation of a pseudo-locked cy- 
cle. 


PLOCK# need not be examined during burst reads. 
A 64-bit aligned operand can be retrieved in one 
burst (note: this is only valid in systems that do not 
interrupt bursts). 


The system must examine PLOCK# during 64-bit
writes since the 486 microprocessor cannot burst
write more than 32 bits. However, burst can be used
within each 32-bit write cycle if BS8# or BS16# is
asserted. BLAST# will be deasserted in response to
BS8# or BS16#. A 64-bit write will be driven out as
two non-burst bus cycles. BLAST# is asserted
during both writes since a burst is not possible.


PLOCK# is asserted during the first write to indicate
that another write follows. This behavior is shown in
Figure 7.21.


The first cycle of a 64-bit floating point write is the
only case in which both PLOCK# and BLAST# are
asserted. Normally PLOCK# and BLAST# are the
inverse of each other.


During all of the cycles where PLOCK# is asserted,
HOLD is not acknowledged until the cycle
completes. This results in a large HOLD latency,
especially when BS8# or BS16# is asserted. To reduce
the HOLD latency during these cycles, windows are
available between transfers to allow HOLD to be
acknowledged during non-cacheable code prefetches.
PLOCK# will be asserted since BLAST# is negated,
but it is ignored and HOLD is recognized during
the prefetch.


PLOCK# can change several times during a cycle,
settling to its final value in the clock in which ready
is returned.


Figure 7.21. Pseudo Lock Timing


7.2.8 INVALIDATE CYCLES


Invalidate cycles are needed to keep the 486
microprocessor’s internal cache contents consistent
with external memory. The 486 microprocessor
contains a mechanism for listening to writes by other
devices to external memory. When the processor
finds a write to a section of external memory
contained in its internal cache, the processor’s
internal copy is invalidated.


Invalidations use two pins, address hold request
(AHOLD) and valid external address (EADS#).
There are two steps in an invalidation cycle. First,
the external system asserts the AHOLD input forcing
the 486 microprocessor to immediately relinquish its
address bus. Next, the external system asserts
EADS# indicating that a valid address is on the 486
microprocessor’s address bus. Figure 7.22 shows the
fastest possible invalidation cycle. The i486 CPU
recognizes AHOLD on one CLK edge and floats the
address bus in response. To allow the address bus
to float and avoid contention, EADS# and the
invalidation address should not be driven until the
following CLK edge. The microprocessor reads the
address over its address lines. If the microprocessor
finds this address in its internal cache, the cache
entry is invalidated. Note that the 486
microprocessor’s address bus is input/output unlike
the 386 microprocessor’s bus, which is output only.


The 486 microprocessor immediately relinquishes its
address bus in the next clock upon assertion of
AHOLD. For example, the bus could be 3 wait states
into a read cycle. If AHOLD is activated, the 486
microprocessor will immediately float its address bus
before ready is returned terminating the bus cycle.


When AHOLD is asserted only the address bus is
floated; the data bus can remain active. Data can be
returned for a previously specified bus cycle during
address hold (see Figures 7.22, 7.23).


EADS# is normally asserted when an external
master drives an address onto the bus. AHOLD need not
be driven for EADS# to generate an internal
invalidate. If EADS# alone is asserted while the 486
microprocessor is driving the address bus, it is possible
that the invalidation address will come from the 486
microprocessor itself.


Note that it is also possible to run an invalidation
cycle by asserting EADS# when HOLD or BOFF# is
asserted.


Running an invalidate cycle prevents the 486
microprocessor cache from satisfying other internal
requests, so invalidations should be run only when
necessary. The fastest possible invalidate cycle is
shown in Figure 7.22, while a more realistic
invalidation cycle is shown in Figure 7.23. Both of the
examples take one clock of cache access from the rest
of the 486 microprocessor.


Figure 7.23. Typical Internal Cache Invalidation Cycle


7.2.8.1 Rate of Invalidate Cycles


The 486 microprocessor can accept one invalidate
per clock except in the last clock of a line fill. One
invalidate per clock is possible as long as EADS# is
negated in ONE or BOTH of the following cases:


1. In the clock RDY# or BRDY# is returned for
   the last time.


2. In the clock following RDY# or BRDY# being
   returned for the last time.


This definition allows two system designs. Simple 
designs can restrict invalidates to one every other 
clock. The simple design need not track bus activity. 
Alternatively, systems can request one invalidate 
per clock provided that the bus is monitored. 


7.2.8.2 Running Invalidate Cycles Concurrently 
with Line Fills 


Precautions are necessary to avoid caching stale
data in the 486 microprocessor’s cache in a system
with a second level cache. An example of a system
with a second level cache is shown in Figure 7.24.
An external device can be writing to main memory
over the system bus while the 486 microprocessor is
retrieving data from the second level cache. The 486
microprocessor will need to invalidate a line in its
internal cache if the external device is writing to a
main memory address also contained in the 486
microprocessor’s cache.


A potential problem exists if the external device is
writing to an address in external memory, and at the
same time the 486 microprocessor is reading data
from the same address in the second level cache.
The system must force an invalidation cycle to
invalidate the data that the 486 microprocessor has
requested during the line fill.


If the system asserts EADS# before the first data in
the line fill is returned to the 486 microprocessor, the
system must return data consistent with the new
data in the external memory upon resumption of the
line fill after the invalidation cycle. This is illustrated
by the asserted EADS# signal labeled 1 in Figure
7.25.


1. Data returned must be consistent if its address equals the invalidation address in this clock
2. Data returned will not be cached if its address equals the invalidation address in this clock


Figure 7.25. Cache Invalidation Cycle Concurrent with Line Fill


If the system asserts EADS# at the same time or
after the first data in the line fill is returned (in the
same clock that the first RDY# or BRDY# is
returned or any subsequent clock in the line fill) the
data will be read into the 486 microprocessor’s input
buffers but it will not be stored in the on-chip cache.
This is illustrated by the asserted EADS# signal labeled
2 in Figure 7.25. The stale data will be used to
satisfy the request that initiated the cache fill cycle.


7.2.9 BUS HOLD 


The 486 microprocessor provides a bus hold, hold
acknowledge protocol using the bus hold request
(HOLD) and bus hold acknowledge (HLDA) pins.
Asserting the HOLD input indicates that another bus
master desires control of the 486 microprocessor’s
bus. The processor will respond by floating its bus
and driving HLDA active when the current bus cycle,
or sequence of locked cycles, is complete. An
example of a HOLD/HLDA transaction is shown in Figure
7.26. Unlike the 386 microprocessor, the 486
microprocessor can respond to HOLD by floating its bus
and asserting HLDA while RESET is asserted.


Note that HOLD will be recognized during unaligned
writes (less than or equal to 32 bits) with BLAST#
being active for each write. For a greater than 32-bit or
unaligned write, HOLD recognition is prevented
by PLOCK# being asserted.


The pins floated during bus hold are: BE0#-BE3#,
PCD, PWT, W/R#, D/C#, M/IO#, LOCK#,
PLOCK#, ADS#, BLAST#, D0-D31, A2-A31,
DP0-DP3.


7.2.10 INTERRUPT ACKNOWLEDGE 


The 486 microprocessor generates interrupt
acknowledge cycles in response to maskable interrupt
requests generated on the interrupt request input
(INTR) pin. Interrupt acknowledge cycles have a
unique cycle type generated on the cycle type pins.


Figure 7.26. HOLD/HLDA Cycles


An example interrupt acknowledge transaction is
shown in Figure 7.27. Interrupt acknowledge cycles
are generated in locked pairs. Data returned during
the first cycle is ignored. The interrupt vector is
returned during the second cycle on the lower 8 bits of
the data bus. The 486 microprocessor has 256
possible interrupt vectors.


The state of A2 distinguishes the first and second
interrupt acknowledge cycles. The byte address
driven during the first interrupt acknowledge cycle is
4 (A31-A3 low, A2 high, BE3#-BE1# high, and
BE0# low). The address driven during the second
interrupt acknowledge cycle is 0 (A31-A2 low,
BE3#-BE1# high, BE0# low).
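The pin states quoted above can be turned into byte addresses mechanically. The following is only an illustrative Python model, not Intel code: the function names `byte_address` and `inta_cycle`, and the convention of packing pin levels into integers, are assumptions made for this sketch.

```python
def byte_address(a31_a2: int, be_n: int) -> int:
    """Return the byte address implied by A31-A2 and the active-low byte enables.

    a31_a2 : address bits A31..A2 packed as an integer, with A2 in bit 0
    be_n   : BE3#..BE0# packed as a 4-bit value, BE0# in bit 0 (0 = asserted)
    """
    if not 0 <= be_n <= 0b1111:
        raise ValueError("be_n must be a 4-bit value")
    if be_n == 0b1111:
        raise ValueError("no byte enable is asserted")
    # The lowest asserted (low) byte enable supplies address bits A1..A0.
    lowest = next(i for i in range(4) if not (be_n >> i) & 1)
    return (a31_a2 << 2) | lowest


def inta_cycle(a2: int) -> str:
    """Classify an interrupt acknowledge cycle from the level on A2."""
    return "first" if a2 else "second"


# First cycle: A31-A3 low, A2 high, BE3#-BE1# high, BE0# low.
FIRST_INTA = byte_address(a31_a2=0b1, be_n=0b1110)
# Second cycle: A31-A2 low, BE3#-BE1# high, BE0# low.
SECOND_INTA = byte_address(a31_a2=0b0, be_n=0b1110)
```

External logic that latches the vector would, under this model, only act on the cycle for which `inta_cycle` reports "second".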


Figure 7.27. Interrupt Acknowledge Cycles


Each of the interrupt acknowledge cycles is
terminated when the external system returns RDY# or
BRDY#. Wait states can be added by withholding
RDY# or BRDY#. The 486 microprocessor
automatically generates four idle clocks between the first
and second cycles to allow for 8259A recovery time.


7.2.11 SPECIAL BUS CYCLES 


The 486 microprocessor provides four special bus 
cycles to indicate that certain instructions have been 
executed, or certain conditions have occurred inter- 
nally. The special bus cycles in Table 7.8 are defined 
when the bus cycle definition pins are in the following
state: M/IO#=0, D/C#=0 and W/R#=1.
During these cycles the address bus is driven low 
while the data bus is undefined. 


Two of the special cycles indicate halt or shutdown.
Another special cycle is generated when the 486
microprocessor executes an INVD (invalidate data
cache) instruction and could be used to flush an
external cache. The Write Back cycle is generated
when the 486 microprocessor executes the
WBINVD (write-back invalidate data cache)
instruction and could be used to synchronize an external
write-back cache.


The external hardware must acknowledge these
special bus cycles by returning RDY# or BRDY#.


Table 7.8. Special Bus Cycle Encoding


BE3#  BE2#  BE1#  BE0#   Special Bus Cycle
 1     1     1     0     Shutdown
 1     1     0     1     Flush
 1     0     1     1     Halt
 0     1     1     1     Write Back


7.2.11.1 Halt Indication Cycle 


The i486 Microprocessor halts as a result of
executing a HALT instruction. Signaling its entrance into
the halt state, a halt indication cycle is performed.
The halt indication cycle is identified by the bus
definition signals in special bus cycle state and a byte
address of 2. BE0# and BE2# are the only signals
distinguishing halt indication from shutdown
indication, which drives an address of 0. During the halt
cycle undefined data is driven on D0-D31. The halt
indication cycle must be acknowledged by returning
RDY# or BRDY#.


A halted i486 Microprocessor resumes execution
when INTR (if interrupts are enabled) or NMI or
RESET is asserted.


Figure 7.28. Restarted Read Cycle


7.2.11.2 Shutdown Indication Cycle


The i486 Microprocessor shuts down as a result of a 
protection fault while attempting to process a double 
fault. Signaling its entrance into the shutdown state, 
a shutdown indication cycle is performed. The shut- 
down indication cycle is identified by the bus defini- 
tion signals in special bus cycle state and a byte 
address of 0. 
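As a rough illustration of how external logic might recognize these cycles, the sketch below decodes the bus definition pins and byte enables in Python. It is a model under stated assumptions, not a definitive decoder: the halt (byte address 2) and shutdown (byte address 0) entries follow from the text above, while the flush and write-back entries are assumed to be byte addresses 1 and 3 based on a reading of Table 7.8. The names `SPECIAL_CYCLES` and `decode_special_cycle` are hypothetical.

```python
SPECIAL_CYCLES = {
    0b1110: "Shutdown",    # BE0# low -> byte address 0 (stated in the text)
    0b1011: "Halt",        # BE2# low -> byte address 2 (stated in the text)
    0b1101: "Flush",       # assumption: BE1# low -> byte address 1
    0b0111: "Write Back",  # assumption: BE3# low -> byte address 3
}


def decode_special_cycle(m_io: int, d_c: int, w_r: int, be_n: int):
    """Return the special cycle name, or None if this is not a special cycle.

    m_io, d_c, w_r : levels on M/IO#, D/C# and W/R# (0 or 1)
    be_n           : BE3#..BE0# as a 4-bit value, BE0# in bit 0 (0 = asserted)
    """
    # Special bus cycles are defined for M/IO#=0, D/C#=0 and W/R#=1.
    if (m_io, d_c, w_r) != (0, 0, 1):
        return None
    return SPECIAL_CYCLES.get(be_n)
```

Whatever the decoded cycle, the external hardware still has to return RDY# or BRDY# to acknowledge it.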


7.2.12 BUS CYCLE RESTART 


In a multi-master system another bus master may 
require the use of the bus to enable the 486 micro- 
processor to complete its current bus request. In this 
situation the 486 microprocessor will need to restart 
its bus cycle after the other bus master has complet- 
ed its bus transaction. 


A bus cycle may be restarted if the external system
asserts the backoff (BOFF#) input. The 486
microprocessor samples the BOFF# pin every clock. The
486 microprocessor will immediately (in the next
clock) float its address, data and status pins when
BOFF# is asserted (see Figure 7.28). Any bus cycle
in progress when BOFF# is asserted is aborted and
any data returned to the processor is ignored. The
same pins are floated in response to BOFF# as are
floated in response to HOLD. HLDA is not generated
in response to BOFF#. BOFF# has higher priority than
RDY# or BRDY#. If either RDY# or BRDY# is returned
in the same clock as BOFF#, BOFF# takes effect.


Figure 7.29. Restarted Write Cycle


The device asserting BOFF# is free to run any
cycles it wants while the 486 microprocessor bus is in
its high impedance state. If backoff is requested
after the 486 microprocessor has started a cycle, the
new master should wait for memory to return RDY#
or BRDY# before assuming control of the bus.
Waiting for ready provides a handshake to ensure that the
memory system is ready to accept a new cycle. If
the bus is idle when BOFF# is asserted, the new
master can start its cycle two clocks after issuing
BOFF#.


The external memory can view BOFF# in the same
manner as BLAST#. Asserting BOFF# tells the
external memory system that the current cycle is the
last cycle in a transfer.


The bus remains in the high impedance state until
BOFF# is negated. Upon negation, the 486
microprocessor restarts its bus cycle by driving out the
address and status and asserting ADS#. The bus
cycle then continues as usual.


Asserting BOFF# during a burst, BS8# or BS16#
cycle will force the 486 microprocessor to ignore
data returned for that cycle only. Data from previous
cycles will still be valid. For example, if BOFF# is
asserted on the third BRDY# of a burst, the 486
microprocessor assumes the data returned with the
first and second BRDY#s is correct and restarts
the burst beginning with the third item. The same
rule applies to transfers broken into multiple cycles by
BS8# or BS16#.


Asserting BOFF# in the same clock as ADS# will
cause the 486 microprocessor to float its bus in the
next clock and leave ADS# floating low. Since
ADS# is floating low, a peripheral may think that a
new bus cycle has begun even though the cycle was
aborted. There are two possible solutions to this
problem. The first is to have all devices recognize
this condition and ignore ADS# until ready comes
back. The second approach is to use a “two clock”
backoff: in the first clock AHOLD is asserted, and in
the second clock BOFF# is asserted. This
guarantees that ADS# will not be floating low. This is only
necessary in systems where BOFF# may be asserted
in the same clock as ADS#.

7.2.13 BUS STATES 


A bus state diagram is shown in Figure 7.30. A de- 
scription of the signals used in the diagram is given 
in Table 7.9. 


Figure 7.30. Bus State Diagram


Table 7.9. Bus State Description


State   Means

Ti      Bus is idle. Address and status signals may be driven to undefined
        values, or the bus may be floated to a high impedance state.

T1      First clock cycle of a bus cycle. Valid address and status are driven
        and ADS# is asserted.

T2      Second and subsequent clock cycles of a bus cycle. Data is driven if
        the cycle is a write, or data is expected if the cycle is a read.
        RDY# and BRDY# are sampled.

T1b     First clock cycle of a restarted bus cycle. Valid address and status
        are driven and ADS# is asserted.

Tb      Second and subsequent clock cycles of an aborted bus cycle.


7.2.14 FLOATING POINT ERROR HANDLING


The 486 microprocessor provides two options for
reporting floating point errors. The simplest method is
to raise interrupt 16 whenever an unmasked floating
point error occurs. This option may be enabled by
setting the NE bit in control register 0 (CR0).


The 486 microprocessor also provides the option of
allowing external hardware to determine how
floating point errors are reported. This option is
necessary for compatibility with the error reporting scheme
used in DOS based systems. The NE bit must be
cleared in CR0 to enable user-defined error
reporting. User-defined error reporting is the default
condition because the NE bit is cleared on reset.


Two pins, floating point error (FERR#) and ignore
numeric error (IGNNE#), are provided to direct the
actions of hardware if user-defined error reporting is
used. The 486 microprocessor asserts the FERR#
output to indicate that a floating point error has
occurred. FERR# corresponds to the ERROR# pin on
the 387 math coprocessor. However, there is a
difference in the behavior of the two.


In some cases FERR# is asserted when the next
floating point instruction is encountered and in other
cases it is asserted before the next floating point
instruction is encountered, depending upon the
execution state of the instruction causing the exception.


The following class of floating point exceptions drives
FERR# at the time the exception occurs (i.e., before
encountering the next floating point instruction).


1. The stack fault, invalid operation, and denormal
   exceptions on all transcendental instructions,
   integer arithmetic instructions, FSQRT, FSCALE,
   FPREM(1), FXTRACT, FBLD, and FBSTP.


2. Any exceptions on store instructions (including
   integer store instructions).


The following class of floating point exceptions drives
FERR# only after encountering the next floating
point instruction.


1. Exceptions other than on all transcendental
   instructions, integer arithmetic instructions,
   FSQRT, FSCALE, FPREM(1), FXTRACT, FBLD,
   and FBSTP.


2. Any exception on all basic arithmetic, load,
   compare, and control instructions (i.e., all other
   instructions).


For both sets of exceptions above, the 387 Math
Coprocessor asserts ERROR# when the error
occurs and does not wait for the next floating point
instruction to be encountered.


IGNNE# is an input to the 486 microprocessor.
When the NE bit in CR0 is cleared, and IGNNE# is
asserted, the 486 microprocessor will ignore a user
floating point error and continue executing floating
point instructions. When IGNNE# is negated, the
486 microprocessor will freeze on floating point
instructions which get errors (except for the control
instructions FNCLEX, FNINIT, FNSAVE, FNSTENV,
FNSTCW, FNSTSW, FNSTSW AX, FNENI, FNDISI
and FNSETPM). IGNNE# may be asynchronous to
the 486 clock.
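The interaction between the NE bit and IGNNE# described above amounts to a small decision table. The Python sketch below is only a summary aid under simplifying assumptions: it ignores the exemption for control instructions and the timing of FERR#, and the function name `fp_error_response` and its return strings are invented for the example.

```python
def fp_error_response(ne: int, ignne_asserted: bool) -> str:
    """Summarize how an unmasked floating point error is handled.

    ne             : NE bit in CR0 (1 = report through interrupt 16)
    ignne_asserted : True when the active-low IGNNE# input is driven low
    """
    if ne:
        # NE set: the processor raises interrupt 16 itself.
        return "interrupt 16"
    if ignne_asserted:
        # NE clear, IGNNE# asserted: the error is ignored and floating
        # point instructions keep executing.
        return "ignore"
    # NE clear, IGNNE# negated: the processor freezes on floating point
    # instructions that get errors (control instructions excepted).
    return "freeze"
```

Because NE is cleared on reset, the second and third outcomes are the ones a system sees by default.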


In systems with user-defined error reporting, the
FERR# pin is connected to the interrupt controller.
When an unmasked floating point error occurs, an
interrupt is raised. If IGNNE# is high at the time of
this interrupt, the 486 microprocessor will freeze
(disallowing execution of a subsequent floating point
instruction) until the interrupt handler is invoked. By
driving the IGNNE# pin low (when clearing the
interrupt request), the interrupt handler can allow
execution of a floating point instruction, within the interrupt
handler, before the error condition is cleared (by
FNCLEX, FNINIT, FNSAVE or FNSTENV). If
execution of a non-control floating point instruction, within
the floating point interrupt handler, is not needed,
the IGNNE# pin can be tied HIGH.


8.0 TESTABILITY 


Testing in the 486 microprocessor can be divided
into two categories: Built-in Self Test (BIST) and
external testing. The BIST tests the non-random logic,
control ROM (CROM), translation lookaside buffer
(TLB) and on-chip cache memory. External tests can
be run on the TLB and the on-chip cache. The 486
microprocessor also has a test mode in which all
outputs are tristated.


8.1 Built-In Self Test (BIST) 


The BIST is initiated by holding the AHOLD (address
hold) pin HIGH for 2 CLKs before and 2 CLKs after
RESET going from HIGH to LOW as shown in Figure
6.3. The BIST takes approximately 2**20 clocks, or
approximately 42 milliseconds with a 25 MHz 486
microprocessor. No bus cycles will be run by the 486
microprocessor until the BIST is concluded. Note
that for i486 the RESET must be active for 15 clocks
with or without BIST being enabled for warm resets.
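The 42 millisecond figure follows directly from the clock count and the CLK frequency. A minimal Python check of that arithmetic is shown below; the helper name `bist_duration_ms` is an assumption for this sketch, and the 2**20 clock count is itself only approximate according to the text.

```python
def bist_duration_ms(clk_hz: float, clocks: int = 2**20) -> float:
    """Approximate BIST run time in milliseconds for a given CLK frequency."""
    if clk_hz <= 0:
        raise ValueError("clk_hz must be positive")
    return clocks / clk_hz * 1000.0


# 1,048,576 clocks at 25 MHz is about 41.9 ms, i.e. roughly 42 ms.
print(round(bist_duration_ms(25e6), 1))
```

A power-on test fixture can use the same calculation to size the timeout it allows before expecting the first bus cycle.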


The result of the BIST is stored in the EAX register.
The 486 microprocessor has successfully passed
the BIST if the contents of the EAX register are zero.
If the result in EAX is not zero then the BIST has
detected a flaw in the microprocessor. The
microprocessor performs reset and begins normal
operation at the completion of the BIST.


The non-random logic, control ROM, on-chip cache
and translation lookaside buffer (TLB) are tested
during the BIST.


The cache portion of the BIST verifies that the
cache is functional and that it is possible to read and
write to the cache. The BIST manipulates test
registers TR3, TR4 and TR5 while testing the cache.
These test registers are described in Section 8.2.


The cache testing algorithm writes a value to each 
cache entry, reads the value back, and checks that 
the correct value was read back. The algorithm may 
be repeated more than once for each of the 512 
cache entries using different constants. 


The TLB portion of the BIST verifies that the TLB is 
functional and that it is possible to read and write to 
the TLB. The BIST manipulates test registers TR6 
and TR7 while testing the TLB. TR6 and TR7 are 
described in Section 8.3. 


8.2 On-Chip Cache Testing 


The on-chip cache testability hooks are designed to 
be accessible during the BIST and for assembly lan- 
guage testing of the cache. 


The 486 microprocessor contains a cache fill buffer
and a cache read buffer. For testability writes, data
must be written to the cache fill buffer before it can
be written to a location in the cache. Data must be
read from a cache location into the cache read
buffer before the microprocessor can access the data.
The cache fill and cache read buffer are both 128
bits wide.


8.2.1 CACHE TESTING REGISTERS TR3, TR4
AND TR5


Figure 8.1 shows the three cache testing registers:
the Cache Data Test Register (TR3), the Cache
Status Test Register (TR4) and the Cache Control
Test Register (TR5). External access to these
registers is provided through MOV reg, TREG and MOV
TREG, reg instructions.


TR3  Cache Data Test Register
     Bits 31-0    Data

TR4  Cache Status Test Register
     Bits 31-11   Tag
     Bit  10      Valid
     Bits 9-7     LRU Bits (used only during reads)
     Bits 6-3     Valid Bits (used only during reads)

TR5  Cache Control Test Register
     Bits 10-4    Set Select
     Bits 3-2     Entry Select
     Bits 1-0     Control


Figure 8.1. Cache Test Registers


Cache Data Test Register: TR3


The cache fill buffer and the cache read buffer can
only be accessed through TR3. Data to be written to
the cache fill buffer must first be written to TR3. Data
read from the cache read buffer must be loaded into
TR3.


TR3 is 32 bits wide while the cache fill and read
buffers are 128 bits wide. 32 bits of data must be
written to TR3 four times to fill the cache fill buffer.
32 bits of data must be read from TR3 four times to
empty the cache read buffer. The entry select bits in
TR5 determine which 32 bits of data TR3 will access
in the buffers.


Cache Status Test Register: TR4 


TR4 handles tag, LRU and valid bit information
during cache tests. TR4 must be loaded with a tag and
a valid bit before a write to the cache. After a read
from a cache entry, TR4 contains the tag and valid
bit from that entry, and the LRU bits and four valid
bits from the accessed set.


Cache Control Test Register: TR5


TR5 specifies which testability operation will be
performed and the set and entry within the set which
will be accessed.


The seven bit set select field determines which of
the 128 sets will be accessed.


The functionality of the two entry select bits depends
on the state of the control bits. When the fill or read
buffers are being accessed, the entry select bits
point to the 32-bit location in the buffer being
accessed. When a cache location is specified, the
entry select bits point to one of the four entries in a set.
Refer to Table 8.1.


Five testability functions can be performed on the 
cache. The two control bits in TR5 specify the oper- 
ation to be executed. The five operations are: 


1. Write cache fill buffer 

2. Perform a cache testability write 
3. Perform a cache testability read 
4. Read the cache read buffer 

5. Perform a cache flush 


Table 8.1 shows the encoding of the two control bits
in TR5 for the cache testability functions. Table 8.1
also shows the functionality of the entry and set
select bits for each control operation.
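To make the TR5 fields concrete, the following Python sketch packs a control code, an entry select and a set select into a single value. It is a model for illustration only. The entry select (bits 3-2) and control (bits 1-0) positions are inferred from the constants 0, 4, 8, 0Ch, 1 and 2 used in the sample code of Figure 8.2; placing the seven-bit set select in bits 10-4 is an assumption based on the layout in Figure 8.1. The names `make_tr5` and `CTL_*` are hypothetical.

```python
CTL_BUFFER_ACCESS = 0b00  # write the fill buffer / read the read buffer
CTL_CACHE_WRITE = 0b01    # perform a cache testability write
CTL_CACHE_READ = 0b10     # perform a cache testability read
CTL_FLUSH = 0b11          # perform a cache flush


def make_tr5(control: int, entry: int = 0, set_select: int = 0) -> int:
    """Pack a TR5 value: set select [10:4], entry select [3:2], control [1:0]."""
    if not 0 <= control <= 3:
        raise ValueError("control is a 2-bit field")
    if not 0 <= entry <= 3:
        raise ValueError("entry select is a 2-bit field")
    if not 0 <= set_select <= 127:
        raise ValueError("set select is a 7-bit field (128 sets)")
    return (set_select << 4) | (entry << 2) | control


# The four buffer-access commands, one per 32-bit slot of the 128-bit buffer.
buffer_cmds = [make_tr5(CTL_BUFFER_ACCESS, entry=i) for i in range(4)]
```

Under these assumptions `buffer_cmds` reproduces the values 0, 4, 8 and 0Ch that the sample code loads into TR5 before each TR3 access.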


The cache tests attempt to use as much of the
normal operating circuitry as possible. Therefore when
cache tests are being performed, the cache must be
disabled (the CD and NW bits in control register 0
must be set to 1 to disable the cache; see Section
5).


8.2.2 CACHE TESTABILITY WRITE 


A testability write to the cache is a two step process. 
First the cache fill buffer must be loaded with 128 
bits of data and TR4 loaded with the tag and valid 
bit. Next the contents of the fill buffer are written to a 
cache location. Sample assembly code to do a write 
is given in Figure 8.2. 


Table 8.1. Cache Control Bit Encoding and Effect of
Control Bits on Entry Select and Set Select Functionality


Control Bits
Bit 1  Bit 0   Operation                Entry Select Bits             Set Select Bits

  0      0     Enable: Fill Buffer      Select 32-bit location in     -
               Write, Read Buffer Read  fill/read buffer
  0      1     Perform Cache Write      Select an entry in set        Select a set to write to
  1      0     Perform Cache Read       Select an entry in set        Select a set to read from
  1      1     Perform Flush Cache      -                             -


Sample Assembly Code


An example assembly language sequence to perform a cache write is:

; eax, ebx, ecx, edx contain the cache line to write
; edi contains the tag information to load
; CR0 already says to enable reads/writes to TR5

; fill the cache buffer
        mov esi,0       ; set up command (fill buffer, entry 0)
        mov tr5,esi     ; load to TR5
        mov tr3,eax     ; load data into cache fill buffer
        mov esi,4       ; fill buffer, entry 1
        mov tr5,esi
        mov tr3,ebx
        mov esi,8       ; fill buffer, entry 2
        mov tr5,esi
        mov tr3,ecx
        mov esi,0ch     ; fill buffer, entry 3
        mov tr5,esi
        mov tr3,edx

; load the Cache Status Register
        mov tr4,edi     ; load 21-bit tag and valid bit

; perform the cache write
        mov esi,1
        mov tr5,esi     ; write the cache (set 0, entry 0)


An example assembly language sequence to perform a cache read is:

; data into eax, ebx, ecx, edx; status into edi

; read the cache line back
        mov esi,2
        mov tr5,esi     ; do cache testability read (set 0, entry 0)

; read the data from the read buffer
        mov esi,0       ; read buffer, entry 0
        mov tr5,esi
        mov eax,tr3
        mov esi,4       ; read buffer, entry 1
        mov tr5,esi
        mov ebx,tr3
        mov esi,8       ; read buffer, entry 2
        mov tr5,esi
        mov ecx,tr3
        mov esi,0ch     ; read buffer, entry 3
        mov tr5,esi
        mov edx,tr3

; read the status from TR4
        mov edi,tr4


Figure 8.2. Sample Assembly Code for Cache Testing


Loading the fill buffer is accomplished by first writing
to the entry select bits in TR5 and setting the control
bits in TR5 to 00. The entry select bits identify one of
four 32-bit locations in the cache fill buffer to put 32
bits of data. Following the write to TR5, TR3 is
written with 32 bits of data which are immediately
placed in the cache fill buffer. Writing to TR3 initiates
the write to the cache fill buffer. The cache fill buffer
is loaded with 128 bits of data by writing to TR5 and
TR3 four times using a different entry select location
each time.


TR4 must be loaded with the 21-bit tag and valid bit 
(bit 10 in TR4) before the contents of the fill buffer 
are written to a cache location. 


The contents of the cache fill buffer are written to a 
cache location by writing TR5 with a control field of 
01 along with the set select and entry select fields. 
The set select and entry select field indicate the lo- 
cation in the cache to be written. The normal cache 
LRU update circuitry updates the internal LRU bits 
for the selected set.


Note that a cache testability write can only be done
when the cache is disabled for replaces (the CD bit
in control register 0 is set to 1). Also note that care
must be taken when directly writing to entries in the
cache. If the entry is set to overlap an area of
memory that is being used in external memory, that
cache entry could inadvertently be used instead of
the external memory. Of course, this is exactly the
type of operation that one would desire if the cache
were to be used as a high speed RAM.


8.2.3 CACHE TESTABILITY READ 


A cache testability read is a two step process. First
the contents of the cache location are read into the
cache read buffer. Next the data is examined by
reading it out of the read buffer. Sample assembly
code to do a testability read is given in Figure 8.2.


Reading the contents of a cache location into the 
cache read buffer is initiated by writing TR5 with the 
control bits set to 10 and the desired seven-bit set 
select and two-bit entry select. In response to the 
write to TR5, TR4 is loaded with the 21-bit tag field 
and the single valid bit from the cache entry read. 
TR4 is also loaded with the three LRU bits and four 
valid bits corresponding to the cache set that was 
accessed. The cache read buffer is filled with the 
128-bit value which was found in the data array at 
the specified location. 
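A test program typically has to pull these fields back out of the value read from TR4. The Python sketch below shows one way to do that. It is hedged on the field positions: the 21-bit tag in bits 31-11 and the valid bit in bit 10 follow from the text, while placing the three LRU bits in bits 9-7 and the four set valid bits in bits 6-3 is an assumption based on the layout in Figure 8.1. `TR4Fields`, `decode_tr4` and `make_tr4` are names invented for the example.

```python
from typing import NamedTuple


class TR4Fields(NamedTuple):
    tag: int        # 21-bit tag of the entry read
    valid: int      # valid bit of the entry read (bit 10)
    lru: int        # three LRU bits of the accessed set
    set_valid: int  # four valid bits of the accessed set


def decode_tr4(value: int) -> TR4Fields:
    """Split a 32-bit TR4 value into its fields after a testability read."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("TR4 is a 32-bit register")
    return TR4Fields(
        tag=(value >> 11) & 0x1FFFFF,
        valid=(value >> 10) & 1,
        lru=(value >> 7) & 0b111,          # assumed position: bits 9-7
        set_valid=(value >> 3) & 0b1111,   # assumed position: bits 6-3
    )


def make_tr4(tag: int, valid: int) -> int:
    """Build the TR4 value loaded before a testability write (tag + valid)."""
    if not 0 <= tag <= 0x1FFFFF:
        raise ValueError("tag is a 21-bit field")
    return (tag << 11) | ((valid & 1) << 10)
```

A write followed by a read of the same set and entry should return the same tag and valid bit, which gives a simple round-trip check for a cache test routine.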


The contents of the read buffer are examined by
performing four reads of TR3. Before reading TR3
the entry select bits in TR5 must be loaded to indicate
which of the four 32-bit words in the read buffer to
transfer into TR3 and the control bits in TR5 must be
loaded with 00. The register read of TR3 will initiate
the transfer of the 32-bit value from the read buffer
to the specified general purpose register.


Note that it is very important that the entire 128-bit
quantity from the read buffer and also the
information from TR4 be read before any memory
references are allowed to occur. If memory operations
are allowed to happen, the contents of the read
buffer will be corrupted. This is because the testability
operations use hardware that is used in normal
memory accesses for the 486 microprocessor
whether the cache is enabled or not.


8.2.4 FLUSH CACHE 


The control bits in TR5 must be written with 11 to 
flush the cache. None of the other bits in TR5 have 
any meaning when 11 is written to the control bits. 
Flushing the cache will reset the LRU bits and the 
valid bits to 0, but will not change the cache tag or 
data arrays.


When the cache is flushed by writing to TR5 the
special bus cycle indicating a cache flush to the
external system is not run (see Section 7.2.11, Special
Bus Cycles). The cache should be flushed with the
INVD (Invalidate Data Cache) instruction
or the WBINVD (Write-Back and Invalidate Data
Cache) instruction.


8.3 Translation Lookaside Buffer 
(TLB) Testing 


The 486 microprocessor TLB testability hooks are 
similar to those in the 386 microprocessor. The test- 
ability hooks have been enhanced to provide added 
test features and to include new features in the 486 
microprocessor. The TLB testability hooks are de- 
signed to be accessible during the BIST and for as- 
sembly language testing of the TLB. 


8.3.1 TRANSLATION LOOKASIDE BUFFER 
ORGANIZATION


The 486 microprocessor's TLB is 4-way set associative
and has space for 32 entries. The TLB is logically
split into three blocks shown in Figure 8.3.


The data block is physically split into four arrays, 
each with space for eight entries. An entry in the 
data block is 22 bits wide containing a 20-bit physi- 
cal address and two bits for the page attributes. The 
page attributes are the PCD (page cache disable) bit 
and the PWT (page write-through) bit. Refer to Section
4.5.4 for a discussion of the PCD and PWT bits.


[Figure: the tag block (17-bit tag and 4 page protection
bits per entry) and the data block (20-bit physical
address and 2 page attribute bits per entry), each
split into arrays of 8 entries.]

Figure 8.3. TLB Organization


The tag block is also split into four arrays, one for
each of the data arrays. A tag entry is 21 bits wide
containing a 17-bit linear address and four protection
bits. The protection bits are valid (V), user/supervisor
(U/S), read/write (R/W) and dirty (D).


The third block contains eight three bit quantities
used in the pseudo least recently used (LRU) replacement
algorithm. These bits are called the LRU bits. The LRU
replacement algorithm used in the TLB is the same as
used by the on-chip cache. For a description of this
algorithm refer to Section 5.5.


8.3.2 TLB TEST REGISTERS TR6 AND TR7


The two TLB test registers are shown in Figure 8.4.
TR6 is the command test register and TR7 is the
data test register. External access to these registers
is provided through MOV reg,TREG and MOV
TREG,reg instructions.


[Figure: TR6, the TLB Command Test Register, holds the
linear address, the protection bit options and the
operation bit. TR7, the data test register, holds the
physical address, the replacement pointer select
(writes) or hit indication (lookup), and the
replacement pointer (writes) or hit location (lookup).]

Figure 8.4. TLB Test Registers


Command Test Register: TR6


TR6 contains the tag information and control information
used in a TLB test. Loading TR6 with tag and
control information initiates a TLB write or lookup
test.


TR6 contains three bit fields, a 20-bit linear address 
(bits 12-31), seven bits for the TLB tag protection 
bits (bits 5-11) and one bit (bit 0) to define the type 
of operation to be performed on the TLB. 


The 20-bit linear address forms the tag information 
used in the TLB access. The lower three bits of the 
linear address select which of the eight sets are ac- 
cessed. The upper 17 bits of the linear address form 
the tag stored in the tag array.
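The split of the linear address into a set index and a tag can be illustrated with a short sketch. The function name is hypothetical; the field boundaries follow the description above (bits 12-31 hold the 20-bit linear address, its lower three bits select the set, its upper 17 bits form the tag).

```python
def split_linear_address(linear_address):
    """Return (set_index, tag) for the TLB access of a 32-bit linear address."""
    page = (linear_address >> 12) & 0xFFFFF  # 20-bit field from bits 12-31
    set_index = page & 0x7                   # lower three bits: one of eight sets
    tag = page >> 3                          # upper 17 bits: stored in the tag array
    return set_index, tag
```

For example, linear address 12345678H falls in set 5 with tag 2468H.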


The seven TLB tag protection bits are described be- 
low. 


V: The valid bit for this TLB entry 

D,D#: The dirty bit for/from the TLB entry 

U,U#: The user/supervisor bit for/from the TLB 
entry 

W,W#: The read/write bit for/from the TLB entry 


Two bits are used to represent the D, U/S and R/W 
bits in the TLB tag to permit the option of a forced 
miss or hit during a TLB lookup operation. The 
forced miss or hit will occur regardless of the state 
of the actual bit in the TLB. The meaning of these
pairs of bits is given in Table 8.2.


The operation bit in TR6 determines if the TLB test 
operation will be a write or a lookup. The function of 
the operation bit is given in Table 8.3. 


Table 8.3. TR6 Operation Bit Encoding

TR6      TLB Operation
Bit 0    to Be Performed

0        TLB Write
1        TLB Lookup
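As an illustration, a TR6 value can be assembled from the three fields described above. This is a sketch under stated assumptions: the text places the seven protection bits in bits 5-11 but does not give their order, so the order used here (V, D, D#, U, U#, W, W# from bit 11 down to bit 5) is assumed, and the names are hypothetical. Each (B, B#) pair follows Table 8.2.

```python
WRITE_0 = (0, 1)    # write 0 to the tag bit / match the tag bit if 0
WRITE_1 = (1, 0)    # write 1 to the tag bit / match the tag bit if 1
MATCH_ANY = (1, 1)  # lookup only: match regardless of the tag bit


def tr6_value(linear_address, valid, d, u, w, lookup):
    """Assemble TR6. d, u and w are (B, B#) pairs as in Table 8.2."""
    value = linear_address & 0xFFFFF000        # bits 12-31: linear address
    value |= (valid & 1) << 11                 # assumed position of V
    for (b, b_not), position in zip((d, u, w), (10, 8, 6)):
        value |= (b & 1) << position           # assumed position of B
        value |= (b_not & 1) << (position - 1)  # assumed position of B#
    if lookup:
        value |= 1                             # bit 0: 1 = lookup, 0 = write
    return value
```

Writing the returned value to TR6 is what triggers the TLB operation, so for a write test TR7 would be loaded first.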


Data Test Register: TR7 


TR7 contains the information stored or read from the
data block during a TLB test operation. Before a TLB
test write, TR7 contains the physical address and
the page attribute bits to be stored in the entry. After
a TLB test lookup hit, TR7 contains the physical
address, page attributes, LRU bits and entry location
from the access.


TR7 contains a 20-bit physical address (bits 12-31),
two bits for PCD (bit 11) and PWT (bit 10) and three 
bits for the LRU bits (bits 7-9). The LRU bits in TR7 
are only used during a TLB lookup test. The func- 
tionality of TR7 bit 4 differs for TLB writes and look- 
ups. The encoding of bit 4 is defined in Tables 8.4 
and 8.5. Finally TR7 contains two bits (bits 2-3) to 
specify a TLB replacement pointer or the location of 
a TLB hit. 
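The TR7 layout just described can be unpacked with a few shifts and masks. The following is a minimal sketch; the function and key names are hypothetical, and the bit positions are those given in the paragraph above.

```python
def decode_tr7(value):
    """Split a 32-bit TR7 value into the fields described in the text."""
    return {
        "physical_address": value & 0xFFFFF000,  # bits 12-31
        "pcd": (value >> 11) & 1,                # bit 11
        "pwt": (value >> 10) & 1,                # bit 10
        "lru": (value >> 7) & 0x7,               # bits 7-9, lookups only
        "bit4": (value >> 4) & 1,                # pointer select or hit flag
        "pointer": (value >> 2) & 0x3,           # bits 2-3
    }
```

After a lookup, the fields other than bit 4 are only meaningful when bit 4 reports a hit.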


Table 8.4. Encoding of Bit 4 of TR7 on Writes

TR7      Replacement Pointer
Bit 4    Used on TLB Write

0        Pseudo-LRU Replacement Pointer
1        Data Test Register Bits 3:2


Table 8.5. Encoding of Bit 4 of TR7 on Lookups

TR7      Meaning after TLB
Bit 4    Lookup Operation

0        TLB Lookup Resulted in a Miss
1        TLB Lookup Resulted in a Hit


A replacement pointer is used during a TLB write.
The pointer indicates which of the four entries in an
accessed set is to be written. The replacement
pointer can be specified to be the internal LRU bits
or bits 2-3 in TR7. The source of the replacement
pointer is specified by TR7 bit 4. The encoding of bit
4 during a write is given by Table 8.4.


Note that both testability writes and lookups affect
the state of the internal LRU bits regardless of the
replacement pointer used. All TLB write operations
(testability or normal operation) cause the written
entry to become the most recently used. For example,
during a testability write with the replacement
pointer specified by TR7 bits 2-3, the indicated entry
is written and that entry becomes the most recently
used as specified by the internal LRU bits.


Table 8.2. Meaning of a Pair of TR6 Protection Bits

TR6 Protection  TR6 Protection  Meaning on                Meaning on
Bit (B)         Bit# (B#)       TLB Write Operation       TLB Lookup Operation

0               0               Undefined                 Miss any TLB TAG Bit B
0               1               Write 0 to TLB TAG Bit B  Match TLB TAG Bit B if 0
1               0               Write 1 to TLB TAG Bit B  Match TLB TAG Bit B if 1
1               1               Undefined                 Match any TLB TAG Bit B


There are two TLB testing operations: writing entries
into the TLB, and performing TLB lookups. One major
enhancement over TLB testing in the 386 microprocessor
is that paging need not be disabled while
executing testability writes or lookups.


Note that any time one TLB set contains the same 
linear address in more than one of its entries, look- 
ing up that linear address will not result in a hit. 
Therefore a single linear address should not be writ- 
ten to one TLB set more than once. 


8.3.3 TLB WRITE TEST 


To perform a TLB write TR7 must be loaded fol- 
lowed by a TR6 load. The register operations must 
be performed in this order since the TLB operation is 
triggered by the write to TR6. 


TR7 is loaded with a 20-bit physical address and 
values for PCD and PWT to be written to the data 
portion of the TLB. In addition, bit 4 of TR7 must be 
loaded to indicate whether to use TR7 bits 3-2 or the 
internal LRU bits as the replacement pointer on the 
TLB write operation. Note that the LRU bits in TR7 
are not used in a write test. 


TR6 must be written to initiate the TLB write opera- 
tion. Bit 0 in TR6 must be reset to zero to indicate a 
TLB write. The 20-bit linear address and the seven 
page protection bits must also be written in TR6 to 
specify the tag portion of the TLB entry. Note that 
the three least significant bits of the linear address 
specify which of the eight sets in the data block will 
be loaded with the physical address data. Thus only 
17 of the linear address bits are stored in the tag 
array. 
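The write procedure can be summarized as a pair of register values produced in the required order. This is a sketch only: the function name is hypothetical, the seven protection bits are passed already packed so that no ordering within bits 5-11 is assumed, and bit 4 = 1 is taken to select TR7 bits 3:2 as the replacement pointer (Table 8.4).

```python
def tlb_write_values(linear_address, physical_address, pcd, pwt,
                     protection_bits, entry=None):
    """Return [("TR7", value), ("TR6", value)] in the order they must be loaded."""
    tr7 = physical_address & 0xFFFFF000        # bits 12-31: physical address
    tr7 |= (pcd & 1) << 11 | (pwt & 1) << 10   # page attributes
    if entry is not None:
        # Bit 4 set: use bits 3:2 as the replacement pointer.
        tr7 |= 1 << 4 | (entry & 0x3) << 2
    # Otherwise bit 4 stays 0 and the internal LRU bits pick the entry.

    tr6 = linear_address & 0xFFFFF000          # bits 12-31: linear address
    tr6 |= (protection_bits & 0x7F) << 5       # bits 5-11: protection bits
    # Bit 0 is left at 0 to indicate a TLB write.
    return [("TR7", tr7), ("TR6", tr6)]
```

TR7 comes first in the returned list because the load of TR6 is what starts the TLB operation.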


8.3.4 TLB LOOKUP TEST 


To perform a TLB lookup it is only necessary to write
the proper tags and control information into TR6. Bit
0 in TR6 must be set to 1 to indicate a TLB lookup.
TR6 must be loaded with a 20-bit linear address and
the seven protection bits. To force misses and
matches of the individual protection bits on TLB
lookups, set the seven protection bits as specified in
Table 8.2.


A TLB lookup operation is initiated by the write to 
TR6. TR7 will indicate the result of the lookup opera- 
tion following the write to TR6. The hit/miss indica- 
tion can be found in TR7 bit 4 (see Table 8.5). 


TR7 will contain the following information if bit 4
indicated that the lookup test resulted in a hit. Bits 2-3
will indicate which of the four entries in the accessed
set matched. The 22 most significant bits in TR7 will
contain the physical address and page attributes
contained in the entry. Bits 9-7 will contain the LRU
bits associated with the accessed set. The LRU bits are
reported as they were before being updated for the
current lookup.


If bit 4 in TR7 indicated that the lookup test resulted 
in a miss the remaining bits in TR7 are undefined. 


Again it should be noted that a TLB testability lookup 
operation affects the state of the LRU bits. The LRU 
bits will be updated if a hit occurred. The entry which 
was hit will become the most recently used. 


8.4 Tristate Output Test Mode 


The 486 microprocessor provides the ability to float 
all its outputs and bidirectional pins. This includes all 
pins floated during bus hold as well as pins which 
are never floated in normal operation of the chip 
(HLDA, BREQ, FERR# and PCHK#). When the 486 
microprocessor is in the tri-state output test mode 
external testing can be used to test board connec- 
tions. 


The tri-state test mode is invoked by driving
FLUSH# low for 2 clocks before and 2 clocks after
RESET going low. The outputs are guaranteed to tri-state
no later than 10 clocks after RESET goes low
(see Figure 6.4). The 486 microprocessor remains in
the tri-state test mode until the next RESET.


9.0 DEBUGGING SUPPORT


The 486 Microprocessor provides several features 
which simplify the debugging process. The three cat- 
egories of on-chip debugging aids are: 


1) the code execution breakpoint opcode (0CCH),


2) the single-step capability provided by the TF bit
in the flag register, and


3) the code and data breakpoint capability provided
by the Debug Registers DR0-3, DR6, and DR7.


9.1 Breakpoint Instruction 


A single-byte-opcode breakpoint instruction is available
for use by software debuggers. The breakpoint
opcode is 0CCH, and generates an exception 3 trap
when executed. In typical use, a debugger program
can “plant” the breakpoint instruction at all desired
code execution breakpoints. The single-byte breakpoint
opcode is an alias for the two-byte general
software interrupt instruction, INT n, where n=3.
The only difference between INT 3 (0CCH) and INT n
is that INT 3 is never IOPL-sensitive but INT n is
IOPL-sensitive in Protected Mode and Virtual 8086
Mode.


9.2 Single-Step Trap


If the single-step flag (TF, bit 8) in the EFLAGS register
is found to be set at the end of an instruction, a
single-step exception occurs. The single-step exception
is auto vectored to exception number 1. Precisely,
exception 1 occurs as a trap after the instruction
following the instruction which set TF. In typical
practice, a debugger sets the TF bit of a flag register
image on the debugger’s stack. It then typically
transfers control to the user program and loads the
flag image with a single instruction, the IRET instruction.
The single-step trap occurs after executing one
instruction of the user program.


Since the exception 1 occurs as a trap (that is, it 
occurs after the instruction has already executed), 
the CS:EIP pushed onto the debugger’s stack points 
to the next unexecuted instruction of the program 
being debugged. An exception 1 handler, merely by 
ending with an IRET instruction, can therefore effi- 
ciently support single-stepping through a user pro- 
gram. 


9.3 Debug Registers 


The Debug Registers are an advanced debugging
feature of the 486 Microprocessor. They allow data
access breakpoints as well as code execution
breakpoints. Since the breakpoints are indicated by
on-chip registers, an instruction execution breakpoint
can be placed in ROM code or in code shared
by several tasks, neither of which can be supported
by the INT3 breakpoint opcode.


The 486 Microprocessor contains six Debug Regis- 
ters, providing the ability to specify up to four distinct 
breakpoint addresses, breakpoint control options,
and read breakpoint status. Initially after reset, 
breakpoints are in the disabled state. Therefore, no 
breakpoints will occur unless the debug registers are 
programmed. Breakpoints set up in the Debug Reg- 
isters are autovectored to exception number 1. 


9.3.1 LINEAR ADDRESS BREAKPOINT 
REGISTERS (DR0-DR3)


Up to four breakpoint addresses can be specified by
writing into Debug Registers DR0-DR3, shown in
Figure 9.1. The breakpoint addresses specified are
32-bit linear addresses. 486 Microprocessor hardware
continuously compares the linear breakpoint
addresses in DR0-DR3 with the linear addresses
generated by executing software (a linear address is 
the result of computing the effective address and 
adding the 32-bit segment base address). Note that 
if paging is not enabled the linear address equals the 
physical address. If paging is enabled, the linear ad- 
dress is translated to a physical 32-bit address by 
the on-chip paging unit. Regardless of whether pag- 
ing is enabled or not, however, the breakpoint regis- 
ters hold linear addresses. 


9.3.2 DEBUG CONTROL REGISTER (DR7) 


A Debug Control Register, DR7 shown in Figure 9.1, 
allows several debug control functions such as en- 
abling the breakpoints and setting up other control 
options for the breakpoints. The fields within the De- 
bug Control Register, DR7, are as follows: 


LENi (breakpoint length specification bits) 


A 2-bit LEN field exists for each of the four break- 
points. LEN specifies the length of the associated 
breakpoint field. The choices for data breakpoints 
are: 1 byte, 2 bytes, and 4 bytes. Instruction execu- 
tion breakpoints must have a length of 1 (LENi = 
00). Encoding of the LENi field is as follows:

            Breakpoint    Usage of Least Significant Bits in
Encoding    Field Width   Breakpoint Address Register i (i = 0-3)

00          1 byte        All 32 bits used to specify a single-byte
                          breakpoint field.
01          2 bytes       A1-A31 used to specify a two-byte,
                          word-aligned breakpoint field. A0 in
                          Breakpoint Address Register is not used.
10          Undefined; do not use this encoding.
11          4 bytes       A2-A31 used to specify a four-byte,
                          dword-aligned breakpoint field. A0 and A1
                          in Breakpoint Address Register are not used.


The LENi field controls the size of breakpoint field i
by controlling whether all low-order linear address
bits in the breakpoint address register are used to
detect the breakpoint event. Therefore, all breakpoint
fields are aligned; 2-byte breakpoint fields begin
on Word boundaries, and 4-byte breakpoint
fields begin on Dword boundaries.


[Figure: the debug register set. Four registers hold the
linear addresses of breakpoints 0 through 3; two
registers are marked "Intel reserved. Do not define."]

Figure 9.1. Debug Registers


The following is an example of various size breakpoint
fields. Assume the breakpoint linear address in
DR2 is 00000005H. In that situation, the following
list indicates the region of the breakpoint
field for lengths of 1, 2, or 4 bytes.


DR2 = 00000005H; LEN2 = 00B
   breakpoint field is the byte at 00000005H

DR2 = 00000005H; LEN2 = 01B
   breakpoint field is 00000004H through 00000005H

DR2 = 00000005H; LEN2 = 11B
   breakpoint field is 00000004H through 00000007H
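The alignment rule can be checked with a short sketch that computes the first and last byte of a breakpoint field from the address in DRi and the LENi encoding. The function name is hypothetical; the encodings 00, 01 and 11 select 1, 2 and 4 bytes, and 10 is rejected because it is undefined.

```python
LEN_BYTES = {0b00: 1, 0b01: 2, 0b11: 4}


def breakpoint_field(address, len_bits):
    """Return (first_byte, last_byte) of the aligned breakpoint field."""
    if len_bits not in LEN_BYTES:
        raise ValueError("LEN encoding 10 is undefined")
    size = LEN_BYTES[len_bits]
    # Ignoring the low-order address bits aligns the field to its size.
    first = (address & 0xFFFFFFFF) & ~(size - 1)
    return first, first + size - 1
```

With the address 00000005H this gives byte 5 alone for LEN = 00, bytes 4 through 5 for LEN = 01, and bytes 4 through 7 for LEN = 11.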


RWi (memory access qualifier bits)


A 2-bit RW field exists for each of the four breakpoints.
The 2-bit RW field specifies the type of usage
which must occur in order to activate the associated
breakpoint.


RW          Usage
Encoding    Causing Breakpoint

00          Instruction execution only
01          Data writes only
10          Undefined; do not use this encoding
11          Data reads and writes only


RW encoding 00 is used to set up an instruction 
execution breakpoint. RW encodings 01 or 11 are 
used to set up write-only or read/write data break- 
points. 


Note that instruction execution breakpoints are 
taken as faults (i.e., before the instruction exe- 
cutes), but data breakpoints are taken as traps 
(i.e., after the data transfer takes place). 


Using LENi and RWi to Set Data Breakpoint i 


A data breakpoint can be set up by writing the linear
address into DRi (i = 0-3). For data breakpoints,
RWi can be 01 (write-only) or 11 (write/read). LENi
can be 00, 01, or 11.


If a data access entirely or partly falls within the data
breakpoint field, the data breakpoint condition has
occurred, and if the breakpoint is enabled, an exception
1 trap will occur.
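The "entirely or partly falls within" condition is an interval overlap test. A minimal sketch follows, with a hypothetical function name; it models only the address comparison, not the RWi access type or the enable bits.

```python
def data_breakpoint_hit(bp_address, len_bits, access_address, access_size):
    """True if the access overlaps the aligned breakpoint field at all."""
    size = {0b00: 1, 0b01: 2, 0b11: 4}[len_bits]  # LEN 10 is undefined
    field_start = bp_address & ~(size - 1)
    field_end = field_start + size                # one past the last byte
    access_end = access_address + access_size     # one past the last byte
    return access_address < field_end and access_end > field_start
```

For instance, a 4-byte access at address 2 touches bytes 2 through 5 and therefore hits a 4-byte field covering bytes 4 through 7.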


Using LENi and RWi to Set Instruction Execution
Breakpoint i


An instruction execution breakpoint can be set up by
writing the address of the beginning of the instruction
(including prefixes if any) into DRi (i = 0-3). RWi
must be 00 and LENi must be 00 for instruction
execution breakpoints.


If the instruction beginning at the breakpoint address 
is about to be executed, the instruction execution 
breakpoint condition has occurred, and if the break- 
point is enabled, an exception 1 fault will occur be- 
fore the instruction is executed. 


Note that an instruction execution breakpoint ad- 
dress must be equal to the beginning byte address 
of an instruction (including prefixes) in order for the 
instruction execution breakpoint to occur. 


GD (Global Debug Register access detect)


The Debug Registers can only be accessed in Real
Mode or at privilege level 0 in Protected Mode. The
GD bit, when set, provides extra protection against
any Debug Register access even in Real Mode or at
privilege level 0 in Protected Mode. This additional
protection feature is provided to guarantee that a
software debugger can have full control over the
Debug Register resources when required. The GD bit,
when set, causes an exception 1 fault if an instruction
attempts to read or write any Debug Register.
The GD bit is then automatically cleared when the
exception 1 handler is invoked, allowing the exception
1 handler free access to the debug registers.


GE and LE (Exact data breakpoint match, global and 
local) 


The breakpoint mechanism of the 486 Microproces- 
sor differs from that of the 386. The 486 Microproc- 
essor always does exact data breakpoint matching, 
regardless of GE/LE bit settings. Any data break- 
point trap will be reported exactly after completion of 
the instruction that caused the operand transfer. Ex- 
act reporting is provided by forcing the 486 Micro- 
processor execution unit to wait for completion of 
data operand transfers before beginning execution 
of the next instruction. 


When the 486 Microprocessor performs a task
switch, the LE bit is cleared. Thus, the LE bit supports
fast task switching out of tasks that have
enabled the exact data breakpoint match for their
task-local breakpoints. The LE bit is cleared by the
processor during a task switch, to avoid having exact
data breakpoint match enabled in the new task.
Note that exact data breakpoint match must be
re-enabled under software control.


The 486 Microprocessor GE bit is unaffected during 
a task switch. The GE bit supports exact data break- 
point match that is to remain enabled during all tasks 
executing in the system. 


Note that instruction execution breakpoints are
always reported exactly.


Gi and Li (breakpoint enable, global and local) 


If either Gi or Li is set then the associated breakpoint 
(as defined by the linear address in DRi, the length 
in LENi and the usage criteria in RWi) is enabled. If 
either Gi or Li is set, and the 486 Microprocessor 
detects the ith breakpoint condition, then the
exception 1 handler is invoked.


When the 486 Microprocessor performs a task
switch to a new Task State Segment (TSS), all Li
bits are cleared. Thus, the Li bits support fast task
switching out of tasks that use some task-local
breakpoint registers. The Li bits are cleared by the
processor during a task switch, to avoid spurious
exceptions in the new task. Note that the breakpoints
must be re-enabled under software control.


All 486 Microprocessor Gi bits are unaffected during 
a task switch. The Gi bits support breakpoints that 
are active in all tasks executing in the system. 
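The DR7 fields for one breakpoint can be assembled as sketched below. The text describes the fields but not their positions, so the layout used here is an assumption based on the conventional DR7 format: Li in bit 2i, Gi in bit 2i+1, RWi in bits 16+4i and 17+4i, LENi in bits 18+4i and 19+4i. The function name is hypothetical.

```python
def dr7_breakpoint_bits(i, rw, length, local=False, global_=False):
    """DR7 bits for breakpoint i (0-3); OR the results together for several."""
    if not 0 <= i <= 3:
        raise ValueError("breakpoint number must be 0-3")
    if rw == 0b10 or length == 0b10:
        raise ValueError("encoding 10 is undefined")
    if rw == 0b00 and length != 0b00:
        raise ValueError("instruction breakpoints require LEN = 00")
    value = (rw & 0x3) << (16 + 4 * i)        # RWi (assumed position)
    value |= (length & 0x3) << (18 + 4 * i)   # LENi (assumed position)
    value |= int(local) << (2 * i)            # Li (assumed position)
    value |= int(global_) << (2 * i + 1)      # Gi (assumed position)
    return value
```

A task-local, 4-byte write breakpoint on DR0 would then be requested with `dr7_breakpoint_bits(0, 0b01, 0b11, local=True)`.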


9.3.3 DEBUG STATUS REGISTER (DR6) 


A Debug Status Register, DR6 shown in Figure 9.1, 
allows the exception 1 handler to easily determine 
why it was invoked. Note the exception 1 handler 
can be invoked as a result of one of several events: 


1) DR0 Breakpoint fault/trap.
2) DR1 Breakpoint fault/trap.
3) DR2 Breakpoint fault/trap.
4) DR3 Breakpoint fault/trap.
5) Single-step (TF) trap.
6) Task switch trap.
7) Fault due to attempted debug register access
when GD = 1.


The Debug Status Register contains single-bit flags 
for each of the possible events invoking exception 1. 
Note below that some of these events are faults (ex- 
ception taken before the instruction is executed), 
while other events are traps (exception taken after 
the debug events occurred). 


The flags in DR6 are set by the hardware but never 
cleared by hardware. Exception 1 handler software 
should clear DR6 before returning to the user pro- 
gram to avoid future confusion in identifying the 
source of exception 1. 


The fields within the Debug Status Register, DR6, 
are as follows: 


Bi (debug fault/trap due to breakpoint 0-3) 


Four breakpoint indicator flags, B0-B3, correspond
one-to-one with the breakpoint registers DR0-DR3.
A flag Bi is set when the condition described
by DRi, LENi, and RWi occurs.


If Gi or Li is set, and if the ith breakpoint is detected, 
the processor will invoke the exception 1 handler. 
The exception is handled as a fault if an instruction 
execution breakpoint occurred, or as a trap if a data 
breakpoint occurred. 


IMPORTANT NOTE: A flag Bi is set whenever the
hardware detects a match condition on enabled
breakpoint i. Whenever a match is detected on at
least one enabled breakpoint i, the hardware
immediately sets all Bi bits corresponding to breakpoint
conditions matching at that instant, whether enabled
or not. Therefore, the exception 1 handler may see
that multiple Bi bits are set, but only set Bi bits
corresponding to enabled breakpoints (Li or Gi set) are
true indications of why the exception 1 handler was
invoked.
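The filtering described in this note can be sketched as follows. The bit positions are assumptions based on the conventional register layout (B0-B3 in DR6 bits 0-3; Li and Gi in DR7 bits 2i and 2i+1), and the function name is hypothetical.

```python
def triggering_breakpoints(dr6, dr7):
    """Breakpoint numbers whose Bi flag is set and whose Li or Gi is set."""
    hits = []
    for i in range(4):
        matched = (dr6 >> i) & 1            # Bi (assumed position)
        enabled = (dr7 >> (2 * i)) & 0x3    # Li and Gi (assumed positions)
        if matched and enabled:
            hits.append(i)
    return hits
```

A set Bi flag for a breakpoint with neither Li nor Gi set is simply dropped, since it does not explain why the handler was invoked.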


BD (debug fault due to attempted register access 
when GD bit set) 


This bit is set if the exception 1 handler was invoked 
due to an instruction attempting to read or write to 
the debug registers when GD bit was set. If such an 
event occurs, then the GD bit is automatically 
cleared when the exception 1 handler is invoked, 
allowing handler access to the debug registers. 


BS (debug trap due to single-step) 


This bit is set if the exception 1 handler was invoked 
due to the TF bit in the flag register being set (for 
single-stepping). 


BT (debug trap due to task switch) 


This bit is set if the exception 1 handler was invoked 
due to a task switch occurring to a task having a 486 
Microprocessor TSS with the T bit set. Note the task 
switch into the new task occurs normally, but before 
the first instruction of the task is executed, the
exception 1 handler is invoked. With respect to the
task switch operation, the operation is considered to 
be a trap. 
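An exception 1 handler typically begins by decoding DR6. The sketch below assumes the conventional bit positions (B0-B3 in bits 0-3, BD in bit 13, BS in bit 14, BT in bit 15), which the text does not state, and uses a hypothetical function name.

```python
def decode_dr6(dr6):
    """Report which debug events are flagged in a DR6 value."""
    return {
        "breakpoints": [i for i in range(4) if (dr6 >> i) & 1],  # Bi flags
        "BD": bool(dr6 & (1 << 13)),  # debug register access with GD set
        "BS": bool(dr6 & (1 << 14)),  # single-step trap
        "BT": bool(dr6 & (1 << 15)),  # task switch trap
    }
```

Because the hardware never clears these flags, a handler would clear DR6 after decoding it and before returning to the user program.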


9.3.4 USE OF RESUME FLAG (RF) IN FLAG
REGISTER


The Resume Flag (RF) in the flag word can suppress
an instruction execution breakpoint when the
exception 1 handler returns to a user program at a
user address which is also an instruction execution
breakpoint.


10.0 INSTRUCTION SET SUMMARY 


This section describes the 486 microprocessor in- 
struction set. Tables 10.1 through 10.3 list all in- 
structions along with instruction encoding diagrams 
and clock counts. Further details of the instruction 
encoding are then provided in Section 10.2, which 
completely describes the encoding structure and the 
definition of all fields occurring within the 486
microprocessor instructions.


10.1 i486 Microprocessor Instruction Encoding and
Clock Count Summary


To calculate elapsed time for an instruction, multiply
the instruction clock count, as listed in Tables 10.1
through 10.3, by the processor clock period (e.g.,
40 ns for a 25 MHz 486 microprocessor).
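The calculation is a single multiplication; the following sketch (hypothetical function name) makes the units explicit.

```python
def elapsed_ns(clock_count, frequency_mhz):
    """Elapsed time in nanoseconds for an instruction of clock_count clocks."""
    clock_period_ns = 1000.0 / frequency_mhz  # 25 MHz gives a 40 ns period
    return clock_count * clock_period_ns
```

A 3-clock instruction on a 25 MHz part therefore takes 3 x 40 ns = 120 ns, provided the clock count assumptions listed in this section hold.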


For more detailed information on the encodings of 
instructions, refer to Section 10.2 Instruction Encod- 
ings. Section 10.2 explains the general structure of 
instruction encodings, and defines exactly the en- 
codings of all fields contained within the instruction. 


INSTRUCTION CLOCK COUNT ASSUMPTIONS 


The 486 microprocessor instruction clock count ta- 
bles give clock counts assuming data and instruction 
accesses hit in the cache. A separate penalty col- 
umn defines clocks to add if a data access misses in 
the cache. The combined instruction and data cache 
hit rate is over 90%.


A cache miss will force the 486 microprocessor to 
run an external bus cycle. The 486 microprocessor 
32-bit burst bus is defined as r-b-w.


Where: 


r = The number of clocks in the first cycle of a
burst read or the number of clocks per data
cycle in a non-burst read.


b = The number of clocks for the second and
subsequent cycles in a burst read.


w = The number of clocks for a write. 


The fastest bus the 486 microprocessor can support
is 2-1-2 assuming 0 wait states. The clock counts
in the cache miss penalty column assume a 2-1-2
bus. For slower busses add r-2 clocks to the cache
miss penalty for the first dword accessed. Other
factors also affect instruction clock counts.
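The adjustment for slower busses can be written out as a one-line correction. This is a sketch with a hypothetical function name; `table_penalty` stands for the value read from the cache miss penalty column, which assumes r = 2.

```python
def cache_miss_penalty(table_penalty, r=2):
    """Cache miss penalty for the first dword on a bus with first-cycle count r."""
    if r < 2:
        raise ValueError("the fastest supported bus has r = 2")
    return table_penalty + (r - 2)  # add r-2 clocks for slower busses
```

On a 3-1-2 bus, for example, a tabulated penalty of 2 clocks becomes 3 clocks.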


Instruction Clock Count Assumptions 


1. The external bus is available for reads or writes at
   all times. Else add clocks to reads until the bus is
   available.

2. Accesses are aligned. Add three clocks to each
   misaligned access.

3. Cache fills complete before subsequent accesses
   to the same line. If a read misses the cache during
   a cache fill due to a previous read or pre-fetch,
   the read must wait for the cache fill to complete. If
   a read or write accesses a cache line still being
   filled, it must wait for the fill to complete.

4. If an effective address is calculated, the base
   register is not the destination register of the
   preceding instruction. If the base register is the
   destination register of the preceding instruction add
   1 to the clock counts shown. Back-to-back
   PUSH and POP instructions are not affected by
   this rule.

5. An effective address calculation uses one base
   register and does not use an index register.
   However, if the effective address calculation
   uses an index register, 1 clock may be added to
   the clock count shown.

6. The target of a jump is in the cache. If not, add r
   clocks for accessing the destination instruction
   of a jump. If the destination instruction is not
   completely contained in the first dword read, add
   a maximum of 3b clocks. If the destination
   instruction is not completely contained in the first
   16 byte burst, add a maximum of another r+3b
   clocks.

7. If no write buffer delay, w clocks are added only
   in the case in which all write buffers are full.
   Typically, this case rarely occurs.

8. Displacement and immediate not used together.
   If displacement and immediate used together, 1
   clock may be added to the clock count shown.

9. No invalidate cycles. Add a delay of 1 clock for
   each invalidate cycle if the invalidate cycle
   contends for the internal cache/external bus when
   the 486 CPU needs to use it.

10. Page translation hits in TLB. A TLB miss will add
    13, 21 or 28 clocks to the instruction depending
    on whether the Accessed and/or Dirty bit in
    neither, one or both of the page entries needs to be
    set in memory. This assumes that neither page
    entry is in the data cache and a page fault does
    not occur on the address translation.

11. No exceptions are detected during instruction
    execution. Refer to Interrupt Clock Counts Table
    for extra clocks if an interrupt is detected.

12. Instructions that read multiple consecutive data
    items (i.e. task switch, POPA, etc.) and miss the
    cache are assumed to start the first access on a
    16-byte boundary. If not, an extra cache line fill
    may be necessary which may add up to (r+3b) clocks.
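Several of these adjustments are fixed additions and can be combined in a small estimator. The sketch below covers only the base register dependency, the index register, and the TLB miss cases; the function and parameter names are hypothetical, and the "may be added" clocks are treated as always added, giving an upper estimate.

```python
# Clocks added by a TLB miss, keyed by how many of the two page entries
# need their Accessed and/or Dirty bit set in memory.
TLB_MISS_CLOCKS = {0: 13, 1: 21, 2: 28}


def adjusted_clocks(table_clocks, base_is_previous_destination=False,
                    uses_index_register=False, tlb_miss_entries_updated=None):
    """Upper estimate of the clock count after applying selected adjustments."""
    clocks = table_clocks
    if base_is_previous_destination:
        clocks += 1  # base register written by the preceding instruction
    if uses_index_register:
        clocks += 1  # index register used in the effective address
    if tlb_miss_entries_updated is not None:
        clocks += TLB_MISS_CLOCKS[tlb_miss_entries_updated]  # TLB miss
    return clocks
```

Passing `tlb_miss_entries_updated=None` models a TLB hit, which is the assumption built into the tables.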
Table 10.1. i486™ Microprocessor Integer Clock Count Summary


Penalty if 


INTEGER OPERATIONS 
MOV = Move: 


regi to reg2 1000100W |11_ regi reg2 


reg2 to reg1 1000101w ]}|11 regi reg2 


memory to reg 1000101w [mod reg r/m 


reg to memory 1000100w | mod reg r/m 


immediate to reg 1100011w |11000 reg | immediate data 


or 1011w reg immediate data 


displacement 


mod 000 r/mj. ; 
immediate 


immediate to Memory 1100011iw 


Memory to Accumulator 1010000w | full displacement 


Accumulator to Memory 1010001w | full displacement 

MOVSX/MOVZX = Move with Sign/Zero Extension 
reg2 to reg 00001111 1011z11w {11 regi reg2 
memory to reg 00001111 1011z11Ww 


z___ instruction 


0 MOVZX 
1 MOVSX 


PUSH = Push 
reg 11111111 {11 110 reg 


or 01010 reg 


memory 11111111 |mod 110 r/m 


immediate 011010s0 | immediate data 


PUSHA = Push All 01100000 


POP = Pop 
reg 10001111 ]11 000 reg 


or 01011 reg 


memory 10001111 |mod 000 r/m 


POPA = Pop All | 01100001 


XCHG = Exchange 
regi with reg2 1000011w 


11  regl reg2 


Accumulator with reg 10010 reg 


Memory with reg 1000011w |mod reg r/m 


NOP = No Operation 10010000 


LEA = Load EA to Register 10001101 | mod reg r/m 
no index register 
with index register 
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Table 10.1. i486™ Microprocessor Integer Clock Count Summary (Continued) 


Penalty if |. 


INSTRUCTION. . FORMAT 
INTEGER OPERATIONS (Continued) 


Instruction 


ADD = Add . 

ADC = Add with Carry 

AND = Logical AND 

OR = Logical OR 

SUB = Subtract 

SBB = Subtract with Borrow on 
XOR = Logical Exclusive OR 110 


regi to reg2 OOTTTOOW |11 —° regi reg2 


reg2 to reg 11  regl reg2 


memory to register OOTTTO1w |mod reg - r/m 


register to memory OOTTTOOW |mod reg r/m 


immediate to register 100000sw |ii TTT _ reg] immediate register | 


immediate to accumulator OOTTT10w | immediate data 


immediate to memory - 100000sw |mod TTT r/m| immediate data 


Instruction                        TTT
INC = Increment                    000
DEC = Decrement                    001

  reg                           1111111w | 11 TTT reg
  or                            01TTT reg
  memory                        1111111w | mod TTT r/m

Instruction                        TTT
NOT = Logical Complement           010
NEG = Negate                       011

  reg                           1111011w | 11 TTT reg
  memory                        1111011w | mod TTT r/m


CMP = Compare
  reg1 with reg2                0011100w | 11 reg1 reg2
  reg2 with reg1                0011101w | 11 reg1 reg2
  memory with register          0011100w | mod reg r/m
  register with memory          0011101w | mod reg r/m
  immediate with register       100000sw | 11 111 reg | immediate data
  immediate with acc.           0011110w | immediate data
  immediate with memory         100000sw | mod 111 r/m | immediate data

TEST = Logical Compare
  reg1 and reg2                 1000010w | 11 reg1 reg2
  memory and register           1000010w | mod reg r/m
  immediate and register        1111011w | 11 000 reg | immediate data
  immediate and acc.            1010100w | immediate data
  immediate and memory          1111011w | mod 000 r/m | immediate data
MUL = Multiply (unsigned)
  acc. with register            1111011w | 11 100 reg
    Multiplier-Byte                                                   MN/MX, 3
    Word                                                              MN/MX, 3
    Dword                                                             MN/MX, 3
  acc. with memory              1111011w | mod 100 r/m
    Multiplier-Byte                                                   MN/MX, 3
    Word                                                              MN/MX, 3
    Dword                                                             MN/MX, 3

IMUL = Integer Multiply (signed)
  acc. with register            1111011w | 11 101 reg
    Multiplier-Byte                                                   MN/MX, 3
    Word                                                              MN/MX, 3
    Dword                                                             MN/MX, 3
  acc. with memory              1111011w | mod 101 r/m
    Multiplier-Byte                                                   MN/MX, 3
    Word                                                              MN/MX, 3
    Dword                                                             MN/MX, 3
  reg1 with reg2                00001111 | 10101111 | 11 reg1 reg2
    Multiplier-Byte                                                   MN/MX, 3
    Word                                                              MN/MX, 3
    Dword                                                             MN/MX, 3
  register with memory          00001111 | 10101111 | mod reg r/m
    Multiplier-Byte                                                   MN/MX, 3
    Word                                                              MN/MX, 3
    Dword                                                             MN/MX, 3
  reg1 with imm. to reg2        011010s1 | 11 reg1 reg2 | immediate data
    Multiplier-Byte                                                   MN/MX, 3
    Word                                                              MN/MX, 3
    Dword                                                             MN/MX, 3
  mem. with imm. to reg.        011010s1 | mod reg r/m | immediate data
    Multiplier-Byte                                                   MN/MX, 3
    Word                                                              MN/MX, 3
    Dword                                                             MN/MX, 3

DIV = Divide (unsigned)
  acc. by register              1111011w | 11 110 reg
    Divisor-Byte
    Word
    Dword
  acc. by memory                1111011w | mod 110 r/m
    Divisor-Byte
    Word
    Dword

IDIV = Integer Divide (signed)
  acc. by register              1111011w | 11 111 reg
    Divisor-Byte
    Word
    Dword
  acc. by memory                1111011w | mod 111 r/m
    Divisor-Byte                                             20
    Word                                                     28
    Dword                                                    44

CBW = Convert Byte to Word      10011000                     3

CWD = Convert Word to Dword     10011001                     3
Instruction                                   TTT
ROL = Rotate Left                             000
ROR = Rotate Right                            001
RCL = Rotate through Carry Left               010
RCR = Rotate through Carry Right              011
SHL/SAL = Shift Logical/Arithmetic Left       100
SHR = Shift Logical Right                     101
SAR = Shift Arithmetic Right                  111

Not Through Carry (ROL, ROR, SAL, SAR, SHL, and SHR)
  reg by 1                      1101000w | 11 TTT reg                            3
  memory by 1                   1101000w | mod TTT r/m                           4      6
  reg by CL                     1101001w | 11 TTT reg                            3
  memory by CL                  1101001w | mod TTT r/m                           4      6
  reg by immediate count        1100000w | 11 TTT reg | immediate 8-bit data     2
  mem by immediate count        1100000w | mod TTT r/m | immediate 8-bit data

Through Carry (RCL and RCR)
  reg by 1                      1101000w | 11 TTT reg
  memory by 1                   1101000w | mod TTT r/m                           4      6
  reg by CL                     1101001w | 11 TTT reg                            8/30          MN/MX, 4
  memory by CL                  1101001w | mod TTT r/m                           9/31          MN/MX, 5
  reg by immediate count        1100000w | 11 TTT reg | immediate 8-bit data     8/30          MN/MX, 4
  mem by immediate count        1100000w | mod TTT r/m | immediate 8-bit data    9/31          MN/MX, 5
Instruction                        TTT
SHLD = Shift Left Double           100
SHRD = Shift Right Double          101

  register with immediate       00001111 | 10TTT100 | 11 reg2 reg1 | imm 8-bit data     2
  memory by immediate           00001111 | 10TTT100 | mod reg r/m | imm 8-bit data      3      6
  register by CL                00001111 | 10TTT101 | 11 reg2 reg1                      3
  memory by CL                  00001111 | 10TTT101 | mod reg r/m                       4

BSWAP = Byte Swap               00001111 | 11001 reg                                    1

XADD = Exchange and Add
  reg1, reg2                    00001111 | 1100000w | 11 reg2 reg1                      3
  memory, reg                   00001111 | 1100000w | mod reg r/m                       4      6/2

CMPXCHG = Compare and Exchange
  reg1, reg2                    00001111 | 1011000w | 11 reg2 reg1                      6
  memory, reg                   00001111 | 1011000w | mod reg r/m                       7/10   2
CONTROL TRANSFER (within segment)

NOTE: Times are jump taken/not taken

Jcccc = Jump on cccc
  8-bit displacement            0111tttn | 8-bit disp.                          T/NT, 23
  full displacement             00001111 | 1000tttn | full displacement         T/NT, 23

SETcccc = Set Byte on cccc (Times are cccc true/false)
  reg                           00001111 | 1001tttn | 11 000 reg
  memory                        00001111 | 1001tttn | mod 000 r/m

Mnemonic cccc   Condition                                  tttn
O               Overflow                                   0000
NO              No Overflow                                0001
B/NAE           Below/Not Above or Equal                   0010
NB/AE           Not Below/Above or Equal                   0011
E/Z             Equal/Zero                                 0100
NE/NZ           Not Equal/Not Zero                         0101
BE/NA           Below or Equal/Not Above                   0110
NBE/A           Not Below or Equal/Above                   0111
S               Sign                                       1000
NS              Not Sign                                   1001
P/PE            Parity/Parity Even                         1010
NP/PO           Not Parity/Parity Odd                      1011
L/NGE           Less Than/Not Greater or Equal             1100
NL/GE           Not Less Than/Greater or Equal             1101
LE/NG           Less Than or Equal/Not Greater Than        1110
NLE/G           Not Less Than or Equal/Greater Than        1111
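
The tttn field can be read directly out of a short conditional jump opcode (0111tttn). The following is a minimal sketch of that lookup, using the condition table above; the function name is hypothetical and introduced only for illustration.

```python
# Condition mnemonics indexed by the 4-bit tttn field.
CONDITIONS = [
    "O", "NO", "B/NAE", "NB/AE", "E/Z", "NE/NZ", "BE/NA", "NBE/A",
    "S", "NS", "P/PE", "NP/PO", "L/NGE", "NL/GE", "LE/NG", "NLE/G",
]

def decode_short_jcc(opcode: int):
    """Return (mnemonic, negated) for an 8-bit-displacement Jcccc opcode."""
    if (opcode >> 4) != 0b0111:
        raise ValueError("not a short conditional jump opcode (0111tttn)")
    tttn = opcode & 0x0F
    negated = bool(tttn & 1)  # low bit n: 0 = condition asserted, 1 = negated
    return CONDITIONS[tttn], negated

print(decode_short_jcc(0x74))  # opcode 01110100: Equal/Zero, asserted
print(decode_short_jcc(0x75))  # opcode 01110101: Not Equal/Not Zero, negated
```

The same table applies to the full-displacement form (00001111 1000tttn) and to SETcccc (00001111 1001tttn); only the surrounding opcode bits differ.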


LOOP = LOOP CX Times                    11100010 | 8-bit disp.              L/NL, 23

LOOPZ/LOOPE = Loop with Zero/Equal      11100001 | 8-bit disp.              L/NL, 23

LOOPNZ/LOOPNE = Loop while Not Zero     11100000 | 8-bit disp.              L/NL, 23

JCXZ = Jump on CX Zero                  11100011 | 8-bit disp.              T/NT, 23

JECXZ = Jump on ECX Zero                11100011 | 8-bit disp.              T/NT, 23
(Address Size Prefix Differentiates JCXZ from JECXZ)

JMP = Unconditional Jump (within segment)
  Short                         11101011 | 8-bit disp.
  Direct                        11101001 | full displacement
  Register Indirect             11111111 | 11 100 reg
  Memory Indirect               11111111 | mod 100 r/m

CALL = Call (within segment)
  Direct                        11101000 | full displacement
  Register Indirect             11111111 | 11 010 reg
  Memory Indirect               11111111 | mod 010 r/m

RET = Return from CALL (within segment)
                                11000011
  Adding Immediate to SP        11000010 | 16-bit disp.
ENTER = Enter Procedure         11001000 | 16-bit disp., 8-bit level
  Level = 0
  Level = 1
  Level (L) > 1

LEAVE = Leave Procedure         11001001

MULTIPLE-SEGMENT INSTRUCTIONS
MOV = Move
  reg. to segment reg.          10001110 | 11 sreg3 reg
  memory to segment reg.        10001110 | mod sreg3 r/m
  segment reg. to reg.          10001100 | 11 sreg3 reg
  segment reg. to memory        10001100 | mod sreg3 r/m

PUSH = Push
  segment reg. (ES, CS, SS, or DS)      000sreg2110
  segment reg. (FS or GS)               00001111 | 10 sreg3 000

POP = Pop
  segment reg. (ES, SS, or DS)          000sreg2111
  segment reg. (FS or GS)               00001111 | 10 sreg3 001

LDS = Load Pointer to DS        11000101 | mod reg r/m
LES = Load Pointer to ES        11000100 | mod reg r/m
LFS = Load Pointer to FS        00001111 | 10110100 | mod reg r/m
LGS = Load Pointer to GS        00001111 | 10110101 | mod reg r/m
LSS = Load Pointer to SS        00001111 | 10110010 | mod reg r/m

CALL = Call
  Direct intersegment           10011010 | unsigned full offset, selector
    to same level
    thru Gate to same level
    to inner level, no parameters
    to inner level, x parameter (d) words
    to TSS
    thru Task Gate
  Indirect intersegment         11111111 | mod 011 r/m
    to same level
    thru Gate to same level
    to inner level, no parameters
    to inner level, x parameter (d) words
    to TSS
    thru Task Gate

RET = Return from CALL
  intersegment                  11001011
    to same level
    to outer level
  intersegment adding imm. to SP        11001010 | 16-bit disp.
    to same level
    to outer level
JMP = Unconditional Jump
  Direct intersegment           11101010 | unsigned full offset, selector
    to same level
    thru Call Gate to same level
    thru TSS
    thru Task Gate
  Indirect intersegment         11111111 | mod 101 r/m
    to same level
    thru Call Gate to same level
    thru TSS
    thru Task Gate

BIT MANIPULATION
BT = Test bit
  register, immediate           00001111 | 10111010 | 11 100 reg | imm. 8-bit data
  memory, immediate             00001111 | 10111010 | mod 100 r/m | imm. 8-bit data
  reg1, reg2                    00001111 | 10100011 | 11 reg2 reg1
  memory, reg                   00001111 | 10100011 | mod reg r/m

Instruction                        TTT
BTS = Test Bit and Set             101
BTR = Test Bit and Reset           110
BTC = Test Bit and Complement      111

  register, immediate           00001111 | 10111010 | 11 TTT reg | imm. 8-bit data
  memory, immediate             00001111 | 10111010 | mod TTT r/m | imm. 8-bit data
  reg1, reg2                    00001111 | 10TTT011 | 11 reg2 reg1
  memory, reg                   00001111 | 10TTT011 | mod reg r/m

BSF = Scan Bit Forward
  reg1, reg2                    00001111 | 10111100 | 11 reg2 reg1              MN/MX, 12
  memory, reg                   00001111 | 10111100 | mod reg r/m               MN/MX, 13

BSR = Scan Bit Reverse
  reg1, reg2                    00001111 | 10111101 | 11 reg2 reg1              MN/MX, 14
  memory, reg                   00001111 | 10111101 | mod reg r/m               MN/MX, 15

STRING INSTRUCTIONS
CMPS = Compare Byte/Word                    1010011w
LODS = Load Byte/Word to AL/AX/EAX          1010110w
MOVS = Move Byte/Word                       1010010w
SCAS = Scan Byte/Word                       1010111w
STOS = Store Byte/Word from AL/AX/EAX       1010101w
XLAT = Translate String                     11010111
REPEATED STRING INSTRUCTIONS
Repeated by Count in CX or ECX (C = Count in CX or ECX)

REPE CMPS = Compare String (Find Non-Match)         11110011 | 1010011w
  C = 0
  C > 0

REPNE CMPS = Compare String (Find Match)            11110010 | 1010011w
  C = 0
  C > 0

REP LODS = Load String                              11110010 | 1010110w
  C = 0
  C > 0

REP MOVS = Move String                              11110010 | 1010010w
  C = 0
  C = 1
  C > 1

REPE SCAS = Scan String (Find Non-AL/AX/EAX)        11110011 | 1010111w
  C = 0
  C > 0

REPNE SCAS = Scan String (Find AL/AX/EAX)           11110010 | 1010111w
  C = 0
  C > 0

REP STOS = Store String                             11110010 | 1010101w
  C = 0
  C > 0

FLAG CONTROL
CLC = Clear Carry Flag                  11111000
STC = Set Carry Flag                    11111001
CMC = Complement Carry Flag             11110101
CLD = Clear Direction Flag              11111100
STD = Set Direction Flag                11111101
CLI = Clear Interrupt Enable Flag       11111010
STI = Set Interrupt Enable Flag         11111011
LAHF = Load AH into Flag                10011111
SAHF = Store AH into Flags              10011110
PUSHF = Push Flags                      10011100
POPF = Pop Flags                        10011101

DECIMAL ARITHMETIC
AAA = ASCII Adjust for Add              00110111
AAS = ASCII Adjust for Subtract         00111111
AAM = ASCII Adjust for Multiply         11010100 | 00001010
AAD = ASCII Adjust for Divide           11010101 | 00001010
DAA = Decimal Adjust for Add            00100111
DAS = Decimal Adjust for Subtract       00101111

PROCESSOR CONTROL INSTRUCTIONS
HLT = Halt                              11110100

MOV = Move To and From Control/Debug/Test Registers
  CR0 from register             00001111 | 00100010 | 11 000 reg
  CR2/CR3 from register         00001111 | 00100010 | 11 eee reg
  Reg from CR0-3                00001111 | 00100000 | 11 eee reg
  DR0-3 from register           00001111 | 00100011 | 11 eee reg
  DR6-7 from register           00001111 | 00100011 | 11 eee reg
  Register from DR6-7           00001111 | 00100001 | 11 eee reg
  Register from DR0-3           00001111 | 00100001 | 11 eee reg
  TR3 from register             00001111 | 00100110 | 11 011 reg
  TR4-7 from register           00001111 | 00100110 | 11 eee reg
  Register from TR3             00001111 | 00100100 | 11 011 reg
  Register from TR4-7           00001111 | 00100100 | 11 eee reg

CLTS = Clear Task Switched Flag                 00001111 | 00000110
INVD = Invalidate Data Cache                    00001111 | 00001000
WBINVD = Write-Back and Invalidate Data Cache   00001111 | 00001001

INVLPG = Invalidate TLB Entry
  INVLPG memory                 00001111 | 00000001 | mod 111 r/m

PREFIX BYTES
Address Size Prefix             01100111
LOCK = Bus Lock Prefix          11110000
Operand Size Prefix             01100110
Segment Override Prefix
  CS:                           00101110
  DS:                           00111110
  ES:                           00100110
  FS:                           01100100
  GS:                           01100101
  SS:                           00110110
PROTECTION CONTROL
ARPL = Adjust Requested Privilege Level
  From register                 01100011 | 11 reg1 reg2
  From memory                   01100011 | mod reg r/m

LAR = Load Access Rights
  From register                 00001111 | 00000010 | 11 reg1 reg2
  From memory                   00001111 | 00000010 | mod reg r/m

LGDT = Load Global Descriptor
  Table register                00001111 | 00000001 | mod 010 r/m

LIDT = Load Interrupt Descriptor
  Table register                00001111 | 00000001 | mod 011 r/m

LLDT = Load Local Descriptor
  Table register from reg.      00001111 | 00000000 | 11 010 reg
  Table register from mem.      00001111 | 00000000 | mod 010 r/m

LMSW = Load Machine Status Word
  From register                 00001111 | 00000001 | 11 110 reg
  From memory                   00001111 | 00000001 | mod 110 r/m

LSL = Load Segment Limit
  From register                 00001111 | 00000011 | 11 reg1 reg2
  From memory                   00001111 | 00000011 | mod reg r/m

LTR = Load Task Register
  From Register                 00001111 | 00000000 | 11 011 reg
  From Memory                   00001111 | 00000000 | mod 011 r/m

SGDT = Store Global Descriptor Table
                                00001111 | 00000001 | mod 000 r/m

SIDT = Store Interrupt Descriptor Table
                                00001111 | 00000001 | mod 001 r/m

SLDT = Store Local Descriptor Table
  To register                   00001111 | 00000000 | 11 000 reg
  To memory                     00001111 | 00000000 | mod 000 r/m

SMSW = Store Machine Status Word
  To register                   00001111 | 00000001 | 11 100 reg
  To memory                     00001111 | 00000001 | mod 100 r/m

STR = Store Task Register
  To register                   00001111 | 00000000 | 11 001 reg
  To memory                     00001111 | 00000000 | mod 001 r/m

VERR = Verify Read Access
  Register                      00001111 | 00000000 | 11 100 reg
  Memory                        00001111 | 00000000 | mod 100 r/m

VERW = Verify Write Access
  To register                   00001111 | 00000000 | 11 101 reg
  To memory                     00001111 | 00000000 | mod 101 r/m
INTERRUPT INSTRUCTIONS
INT n = Interrupt Type n                11001101                    INT + 4/0       RV/P, 21
INT 3 = Interrupt Type 3                11001100                    INT + 0         21

INTO = Interrupt 4 if Overflow Flag Set 11001110
  Taken
  Not Taken

BOUND = Interrupt 5 if Detect Value Out Range   01100010 | mod reg r/m
  If in range                                                       7
  If out of range                                                   INT + 24

IRET = Interrupt Return                 11001111
  Real Mode/Virtual Mode
  Protected Mode
    To same level
    To outer level                                                  36
    To nested task (EFLAGS.NT = 1)                                  TS + 32

External Interrupt                                                  INT + 11
NMI = Non-Maskable Interrupt                                        INT + 3
Page Fault                                                          INT + 24

VM86 Exceptions
  CLI                                                               INT + 8
  STI                                                               INT + 8
  INT n                                                             INT + 9
  PUSHF                                                             INT + 9
  POPF                                                              INT + 8
  IRET                                                              INT + 9
  IN
    Fixed Port                                                      INT + 50
    Variable Port                                                   INT + 51
  OUT
    Fixed Port                                                      INT + 50
    Variable Port                                                   INT + 51
  INS                                                               INT + 50
  OUTS                                                              INT + 50
  REP INS                                                           INT + 51
  REP OUTS                                                          INT + 51

Task Switch Clock Counts Table

Method                                  Value for TS
                                        Cache Hit       Miss Penalty
VM/486 CPU/286 TSS To 486 CPU TSS       162             55
VM/486 CPU/286 TSS To 286 TSS           143             31
VM/486 CPU/286 TSS To VM TSS            140             37
Interrupt Clock Counts Table

Method                                  Value for INT
                                        Cache Hit       Miss Penalty    Notes
Real Mode                                               2
Protected Mode
  Interrupt/Trap gate, same level                       6
  Interrupt/Trap gate, different level
  Task Gate
Virtual Mode
  Interrupt/Trap gate, different level
  Task gate

Abbreviations   Definition
16/32           16/32 bit modes
U/L             unlocked/locked
MN/MX           minimum/maximum
L/NL            loop/no loop
RV/P            real and virtual mode/protected mode
R               real mode
P               protected mode
T/NT            taken/not taken
H/NH            hit/no hit

NOTES:
1. Assuming that the operand address and stack address fall in different cache sets.
2. Always locked, no cache hit case.
3. Clocks = 10 + max(log2(|m|), n)
   m = multiplier value (min clocks for m = 0)
   n = 3/5 for ±m
4. Clocks = {quotient(count/operand length)}*7 + 9
   = 8 if count < operand length (8/16/32)
5. Clocks = {quotient(count/operand length)}*7 + 9
   = 9 if count < operand length (8/16/32)
6. Equal/not equal cases (penalty is the same regardless of lock).
7. Assuming that addresses for memory read (for indirection), stack push/pop, and branch fall in different cache sets.
8. Penalty for cache miss: add 6 clocks for every 16 bytes copied to new stack frame.
9. Add 11 clocks for each unaccessed descriptor load.
10. Refer to task switch clock counts table for value of TS.
11. Add 4 extra clocks to the cache miss penalty for each 16 bytes.
For notes 12-13: (b = 0-3, non-zero byte number);
                 (i = 0-1, non-zero nibble number);
                 (n = 0-3, non-zero bit number in nibble);
12. Clocks = 8 + 4(b+1) + 3(i+1) + 3(n+1)
    = 6 if second operand = 0
13. Clocks = 9 + 4(b+1) + 3(i+1) + 3(n+1)
    = 7 if second operand = 0
For notes 14-15: (n = bit position 0-31).
14. Clocks = 7 + 3(32-n)
    = 6 if second operand = 0
15. Clocks = 8 + 3(32-n)
    = 7 if second operand = 0
16. Assuming that the two string addresses fall in different cache sets.
17. Cache miss penalty: add 6 clocks for every 16 bytes compared. Entire penalty on first compare.
18. Cache miss penalty: add 2 clocks for every 16 bytes of data. Entire penalty on first load.
19. Cache miss penalty: add 4 clocks for every 16 bytes moved.
    (1 clock for the first operation and 3 for the second)
20. Cache miss penalty: add 4 clocks for every 16 bytes scanned.
    (2 clocks each for first and second operations)
21. Refer to interrupt clock counts table for value of INT.
22. Clock count includes one clock for using both displacement and immediate.
23. Refer to assumption 6 in the case of a cache miss.
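
The bit-scan formulas in notes 12 through 15 can be evaluated mechanically. The sketch below is an illustration only, under stated assumptions: the operand is treated as 32 bits wide, b/i/n in notes 12-13 are taken to locate the lowest set bit (byte, nibble within the byte, bit within the nibble), and n in notes 14-15 is taken to be the position of the highest set bit. The function names are hypothetical.

```python
def bsf_clocks(operand: int, memory: bool = False) -> int:
    """Clock estimate for BSF per note 12 (register) or note 13 (memory)."""
    operand &= 0xFFFFFFFF                      # assumption: 32-bit operand
    if operand == 0:
        return 7 if memory else 6
    pos = (operand & -operand).bit_length() - 1  # index of lowest set bit
    b = pos // 8          # byte number, 0-3
    i = (pos % 8) // 4    # nibble number within that byte, 0-1
    n = pos % 4           # bit number within that nibble, 0-3
    base = 9 if memory else 8
    return base + 4 * (b + 1) + 3 * (i + 1) + 3 * (n + 1)

def bsr_clocks(operand: int, memory: bool = False) -> int:
    """Clock estimate for BSR per note 14 (register) or note 15 (memory)."""
    operand &= 0xFFFFFFFF                      # assumption: 32-bit operand
    if operand == 0:
        return 7 if memory else 6
    n = operand.bit_length() - 1               # position of highest set bit, 0-31
    base = 8 if memory else 7
    return base + 3 * (32 - n)

for value in (0, 1, 0x80000000):
    print(hex(value), bsf_clocks(value), bsr_clocks(value))
```

Under these assumptions the register forms range from 6 clocks (zero operand) up to 42 clocks for BSF and 103 clocks for BSR, which is the kind of spread the MN/MX notation in the table refers to.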
Table 10.2. i486™ Microprocessor I/O Instructions Clock Count Summary

INSTRUCTION | FORMAT | Real Mode | Protected Mode (CPL<IOPL) | Protected Mode (CPL>IOPL) | Virtual 86 Mode | Notes

I/O INSTRUCTIONS
IN = Input from:
  Fixed Port                    1110010w
  Variable Port                 1110110w

OUT = Output to:
  Fixed Port                    1110011w
  Variable Port                 1110111w

INS = Input Byte/Word from DX Port      0110110w
OUTS = Output Byte/Word to DX Port      0110111w

REP INS = Input String          11110010 | 0110110w
REP OUTS = Output String        11110010 | 0110111w

NOTES:
1. Two clock cache miss penalty in all cases.
2. c = count in CX or ECX.
3. Cache miss penalty in all modes: Add 2 clocks for every 16 bytes. Entire penalty on second operation.
Table 10.3. i486™ Microprocessor Floating Point Clock Count Summary

INSTRUCTION | FORMAT | Cache Hit: Avg (Lower Range...Upper Range) | Penalty if Cache Miss | Concurrent Execution: Avg (Lower Range...Upper Range) | Notes

DATA TRANSFER
FLD = Real Load to ST(0)
  32-bit memory
  64-bit memory
  80-bit memory
  ST(i)

FILD = Integer Load to ST(0)
  16-bit memory                 14.5(13-16)     4
  32-bit memory                 11.5(9-12)      4(2-4)
  64-bit memory                 16.8(10-18)     7.8(2-8)

FBLD = BCD Load to ST(0)        75(70-103)      7.7(2-8)

FST = Store Real from ST(0)
  32-bit memory
  64-bit memory
  ST(i)

FSTP = Store Real from ST(0) and Pop
  32-bit memory
  64-bit memory
  80-bit memory
  ST(i)

FIST = Store Integer from ST(0)
  16-bit memory                 33.4(29-34)
  32-bit memory                 32.4(28-34)

FISTP = Store Integer from ST(0) and Pop
  16-bit memory                 33.4(29-34)
  32-bit memory                 33.4(29-34)
  64-bit memory                 33.4(29-34)

FBSTP = Store BCD from ST(0) and Pop    175(172-176)

FXCH = Exchange ST(0) and ST(i)

COMPARISON INSTRUCTIONS
FCOM = Compare ST(0) with Real
  32-bit memory                 11011 000 | mod 010 r/m | s-i-b/disp.
  64-bit memory                 11011 100 | mod 010 r/m | s-i-b/disp.
  ST(i)                         11011 000 | 11010 ST(i)

FCOMP = Compare ST(0) with Real and Pop
  32-bit memory                 11011 000 | mod 011 r/m
  64-bit memory                 11011 100 | mod 011 r/m
  ST(i)                         11011 000 | 11011 ST(i)
FCOMPP = Compare ST(0) with ST(1) and Pop Twice

FICOM = Compare ST(0) with Integer
  16-bit memory                 11011 110 | mod 010 r/m | s-i-b/disp.     18(16-20)
  32-bit memory                 11011 010 | mod 010 r/m | s-i-b/disp.     16.5(15-17)

FICOMP = Compare ST(0) with Integer and Pop
  16-bit memory                 11011 110 | mod 011 r/m | s-i-b/disp.     18(16-20)
  32-bit memory                 11011 010 | mod 011 r/m | s-i-b/disp.     16.5(15-17)

FTST = Compare ST(0) with 0.0

FUCOM = Unordered compare ST(0) with ST(i)

FUCOMP = Unordered compare ST(0) with ST(i) and Pop

FUCOMPP = Unordered compare ST(0) with ST(1) and Pop Twice    11011 010 | 1110 1001

FXAM = Examine ST(0)            11011 001 | 1110 0101

CONSTANTS
FLDZ = Load +0.0 into ST(0)
FLD1 = Load +1.0 into ST(0)
FLDPI = Load π into ST(0)
FLDL2T = Load log2(10) into ST(0)
FLDL2E = Load log2(e) into ST(0)
FLDLG2 = Load log10(2) into ST(0)       11011 001
FLDLN2 = Load loge(2) into ST(0)        11011 001
ARITHMETIC
FADD = Add Real with ST(0)
  ST(0) ← ST(0) + 32-bit memory         11011 000 | mod 000 r/m | s-i-b/disp.     10(8-20)    7(5-17)
  ST(0) ← ST(0) + 64-bit memory         11011 100 | mod 000 r/m | s-i-b/disp.     10(8-20)    7(5-17)
  ST(d) ← ST(0) + ST(i)                 11011 d00 | 11000 ST(i)                   10(8-20)    7(5-17)

FADDP = Add real with ST(0) and Pop (ST(i) ← ST(0) + ST(i))                       10(8-20)    7(5-17)

FSUB = Subtract real from ST(0)
  ST(0) ← ST(0) - 32-bit memory         s-i-b/disp.                               10(8-20)    7(5-17)
  ST(0) ← ST(0) - 64-bit memory         s-i-b/disp.                               10(8-20)    7(5-17)
  ST(d) ← ST(0) - ST(i)                                                           10(8-20)    7(5-17)

FSUBP = Subtract real from ST(0) and Pop (ST(i) ← ST(0) - ST(i))                  10(8-20)    7(5-17)
FSUBR = Subtract real reversed (Subtract ST(0) from real)
  ST(0) ← 32-bit memory - ST(0)                                                   10(8-20)    7(5-17)
  ST(0) ← 64-bit memory - ST(0)                                                   10(8-20)    7(5-17)
  ST(d) ← ST(i) - ST(0)                                                           10(8-20)    7(5-17)

FSUBRP = Subtract real reversed and Pop (ST(i) ← ST(i) - ST(0))                   10(8-20)    7(5-17)

FMUL = Multiply real with ST(0)
  ST(0) ← ST(0) × 32-bit memory         11011 000 | mod 001 r/m
  ST(0) ← ST(0) × 64-bit memory         11011 100 | mod 001 r/m
  ST(d) ← ST(0) × ST(i)                 11011 d00 | 11001 ST(i)

FMULP = Multiply ST(0) with ST(i) and Pop (ST(i) ← ST(0) × ST(i))       11011 110

FDIV = Divide ST(0) by Real
  ST(0) ← ST(0)/32-bit memory           11011 000 | mod 110 r/m | s-i-b/disp.
  ST(0) ← ST(0)/64-bit memory           11011 100 | mod 110 r/m | s-i-b/disp.
  ST(d) ← ST(0)/ST(i)                   11011 d00

FDIVP = Divide ST(0) by ST(i) and Pop (ST(i) ← ST(0)/ST(i))             11011 110

FDIVR = Divide real reversed (Real/ST(0))
  ST(0) ← 32-bit memory/ST(0)           11011 000 | mod 111 r/m | s-i-b/disp.
  ST(0) ← 64-bit memory/ST(0)           11011 100 | mod 111 r/m | s-i-b/disp.
  ST(d) ← ST(i)/ST(0)                   11011 d00 | 11110 ST(i)

FDIVRP = Divide real reversed and Pop (ST(i) ← ST(i)/ST(0))             11011 110 | 11110 ST(i)

FIADD = Add Integer to ST(0)
  ST(0) ← ST(0) + 16-bit memory         11011 110 | mod 000 r/m | s-i-b/disp.     24(20-35)      7(5-17)
  ST(0) ← ST(0) + 32-bit memory         11011 010 | mod 000 r/m | s-i-b/disp.     22.5(19-32)    7(5-17)

FISUB = Subtract Integer from ST(0)
  ST(0) ← ST(0) - 16-bit memory         11011 110 | mod 100 r/m | s-i-b/disp.     24(20-35)      2    7(5-17)
  ST(0) ← ST(0) - 32-bit memory         11011 010 | mod 100 r/m | s-i-b/disp.     22.5(19-32)    7(5-17)

FISUBR = Integer Subtract Reversed
  ST(0) ← 16-bit memory - ST(0)         11011 110 | mod 101 r/m | s-i-b/disp.     24(20-35)      7(5-17)
  ST(0) ← 32-bit memory - ST(0)         11011 010 | mod 101 r/m | s-i-b/disp.     22.5(19-32)    7(5-17)

FIMUL = Multiply Integer with ST(0)
  ST(0) ← ST(0) × 16-bit memory         11011 110 | mod 001 r/m | s-i-b/disp.     25(23-27)
  ST(0) ← ST(0) × 32-bit memory         11011 010 | mod 001 r/m | s-i-b/disp.     23.5(22-24)

FIDIV = Integer Divide
  ST(0) ← ST(0)/16-bit memory           11011 110 | mod 110 r/m | s-i-b/disp.     87(85-89)
  ST(0) ← ST(0)/32-bit memory           11011 010 | mod 110 r/m | s-i-b/disp.     85.5(84-86)
FIDIVR = Integer Divide Reversed
  ST(0) ← 16-bit memory/ST(0)                                           87(85-89)
  ST(0) ← 32-bit memory/ST(0)           11011 010                       85.5(84-86)

FSQRT = Square Root                     11011 001                       85.5(83-87)
FSCALE = Scale ST(0) by ST(1)           11011 001                       31(30-32)
FXTRACT = Extract components of ST(0)   11011 001                       19(16-20)       4(2-4)
FPREM = Partial Remainder               11011 001                       84(70-138)      2(2-8)
FPREM1 = Partial Remainder (IEEE)       11011 001                       94.5(72-167)    5.5(2-18)
FRNDINT = Round ST(0) to integer        11011 001 | 1111 1100           29.1(21-30)     7.4(2-8)
FABS = Absolute value of ST(0)          11011 001 | 1110 0001           3
FCHS = Change sign of ST(0)             11011 001 | 1110 0000           6

TRANSCENDENTAL
FCOS = Cosine of ST(0)                  11011 001 | 1111 1111           241(193-279)
FPTAN = Partial tangent of ST(0)        11011 001 | 1111 0010           244(200-273)
FPATAN = Partial arctangent             11011 001 | 1111 0011           289(218-303)    5(2-17)
FSIN = Sine of ST(0)                    11011 001 | 1111 1110           241(193-279)
FSINCOS = Sine and cosine of ST(0)      11011 001 | 1111 1011           291(243-329)
F2XM1 = 2^ST(0) - 1                     11011 001 | 1111 0000           242(140-279)
FYL2X = ST(1) × log2(ST(0))             11011 001 | 1111 0001           311(196-329)
FYL2XP1 = ST(1) × log2(ST(0) + 1.0)     11011 001 | 1111 1001           313(171-326)


PROCESSOR CONTROL
FINIT = Initialize FPU                  11011 011 | 1110 0011
FSTSW AX = Store status word into AX    11011 111 | 1110 0000
FSTSW = Store status word into memory   11011 101 | mod 111 r/m | s-i-b/disp.
FLDCW = Load control word               11011 001 | mod 101 r/m | s-i-b/disp.
FSTCW = Store control word              11011 001 | mod 111 r/m | s-i-b/disp.
FCLEX = Clear exceptions                11011 011 | 1110 0010

FSTENV = Store environment              11011 001 | mod 110 r/m | s-i-b/disp.
  Real and Virtual modes 16-bit Address
  Real and Virtual modes 32-bit Address
  Protected mode 16-bit Address
  Protected mode 32-bit Address

FLDENV = Load environment               11011 001 | mod 100 r/m | s-i-b/disp.
  Real and Virtual modes 16-bit Address
  Real and Virtual modes 32-bit Address
  Protected mode 16-bit Address
  Protected mode 32-bit Address
FSAVE = Save state                      11011 101 | mod 110 r/m | s-i-b/disp.
  Real and Virtual modes 16-bit Address
  Real and Virtual modes 32-bit Address
  Protected mode 16-bit Address
  Protected mode 32-bit Address

FRSTOR = Restore state                  11011 101 | mod 100 r/m | s-i-b/disp.
  Real and Virtual modes 16-bit Address
  Real and Virtual modes 32-bit Address
  Protected mode 16-bit Address
  Protected mode 32-bit Address

FINCSTP = Increment Stack Pointer       11011 001 | 1111 0111
FDECSTP = Decrement Stack Pointer       11011 001 | 1111 0110
FFREE = Free ST(i)                      11011 101 | 11000 ST(i)
FNOP = No operations                    11011 001 | 1101 0000
WAIT = Wait until FPU ready (Minimum/Maximum)   10011011

NOTES:
1. If operand is 0 clock counts = 27.
2. If operand is 0 clock counts = 28.
3. If CW.PC indicates 24 bit precision then subtract 38 clocks.
   If CW.PC indicates 53 bit precision then subtract 11 clocks.
4. If there is a numeric error pending from a previous instruction add 17 clocks.
5. If there is a numeric error pending from a previous instruction add 18 clocks.
6. The INT pin is polled several times while this instruction is executing to assure short interrupt latency.
7. If ABS(operand) is greater than π/4 then add n clocks. Where n = (operand/(π/4)).
10.2 Instruction Encoding


10.2.1 OVERVIEW 


All instruction encodings are subsets of the general
instruction format shown in Figure 10.1. Instructions
consist of one or two primary opcode bytes, possibly
an address specifier consisting of the "mod r/m"
byte and "scaled index" byte, a displacement if
required, and an immediate data field if required.


Within the primary opcode or opcodes, smaller
encoding fields may be defined. These fields vary
according to the class of operation. The fields define
such information as direction of the operation, size
of the displacements, register encoding, or sign
extension.


Almost all instructions referring to an operand in
memory have an addressing mode byte following
the primary opcode byte(s). This byte, the mod r/m
byte, specifies the address mode to be used. Certain
encodings of the mod r/m byte indicate that a second
addressing byte, the scale-index-base byte, follows
the mod r/m byte to fully specify the addressing
mode.


Addressing modes can include a displacement
immediately following the mod r/m byte, or scaled
index byte. If a displacement is present, the possible
sizes are 8, 16 or 32 bits.


If the instruction specifies an immediate operand,
the immediate operand follows any displacement
bytes. The immediate operand, if specified, is always
the last field of the instruction.


Figure 10.1 illustrates several of the fields that can
appear in an instruction, such as the mod field and
the r/m field, but the Figure does not show all fields.
Several smaller fields also appear in certain
instructions, sometimes within the opcode bytes
themselves. Table 10.4 is a complete list of all fields
appearing in the 486 Microprocessor instruction set.
Further ahead, following Table 10.4, are detailed
tables for each field.
TTTTTTTT | TTTTTTTT | mod TTT r/m | ss index base | d32 | 16 | 8 | none | data32 | 16 | 8 | none
                      7 6 5 3 2 0   7 6 5 3 2 0

  opcode (one or two bytes; T represents an opcode bit)
  "mod r/m" byte and "s-i-b" byte: register and address mode specifier
  address displacement (4, 2, 1 bytes or none)
  immediate data (4, 2, 1 bytes or none)

Figure 10.1. General Instruction Format
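
The bit positions marked in Figure 10.1 (7 6 | 5 3 | 2 0) correspond to a 2-3-3 split of both the "mod r/m" byte and the "s-i-b" byte. The following is a minimal sketch of extracting those fields; the function names are hypothetical and are not taken from the datasheet.

```python
def split_modrm(byte: int):
    """Split a "mod r/m" byte into (mod, reg or TTT, r/m)."""
    mod = (byte >> 6) & 0b11    # bits 7-6
    reg = (byte >> 3) & 0b111   # bits 5-3: register or opcode-extension (TTT) field
    rm = byte & 0b111           # bits 2-0
    return mod, reg, rm

def split_sib(byte: int):
    """Split an "s-i-b" byte into (ss, index, base)."""
    ss = (byte >> 6) & 0b11       # bits 7-6: scale factor field
    index = (byte >> 3) & 0b111   # bits 5-3: index register
    base = byte & 0b111           # bits 2-0: base register
    return ss, index, base

print(split_modrm(0b11000011))  # mod = 11 selects a register operand in r/m
print(split_sib(0b10001000))
```

A mod value of 11 means the r/m field names a general register rather than a memory operand, which is why the register-to-register formats in Tables 10.1-10.3 are written as "11 reg1 reg2".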


Table 10.4. Fields within i486™ Microprocessor Instructions

Field Name   Description                                                              Number of Bits
w            Specifies if Data is Byte or Full Size (Full Size is either 16 or 32 Bits)   1
d            Specifies Direction of Data Operation                                    1
s            Specifies if an Immediate Data Field Must be Sign-Extended               1
reg          General Register Specifier                                               3
mod r/m      Address Mode Specifier (Effective Address can be a General Register)     2 for mod; 3 for r/m
ss           Scale Factor for Scaled Index Address Mode                               2
index        General Register to be used as Index Register                            3
base         General Register to be used as Base Register                             3
sreg2        Segment Register Specifier for CS, SS, DS, ES                            2
sreg3        Segment Register Specifier for CS, SS, DS, ES, FS, GS                    3
tttn         For Conditional Instructions, Specifies a Condition Asserted             4
             or a Condition Negated

NOTE:
Tables 10.1-10.3 show encoding of individual instructions.


10.2.2 32-BIT EXTENSIONS OF THE
INSTRUCTION SET


With the 486 Microprocessor, the 8086/80186/80286
instruction set is extended in two orthogonal
directions: 32-bit forms of all 16-bit instructions are
added to support the 32-bit data types, and 32-bit
addressing modes are made available for all
instructions referencing memory. This orthogonal
instruction set extension is accomplished by having a
Default (D) bit in the code segment descriptor, and by
having 2 prefixes to the instruction set.


Whether the instruction defaults to operations of 16 
bits or 32 bits depends on the setting of the D bit in 
the code segment descriptor, which gives the de- 
fault length (either 32 bits or 16 bits) for both oper- 
ands and effective addresses when executing that 
code segment. In the Real Address Mode or Virtual
8086 Mode, no code segment descriptors are used,
but a D value of 0 is assumed internally by the 486
Microprocessor when operating in those modes (for
16-bit default sizes compatible with the
8086/80186/80286).


Two prefixes, the Operand Size Prefix and the Effec- 
tive Address Size Prefix, allow overriding individually 
the Default selection of operand size and effective 
address size. These prefixes may precede any op- 
code bytes and affect only the instruction they pre- 
cede. If necessary, one or both of the prefixes may 
be placed before the opcode bytes. The presence of
the Operand Size Prefix or the Effective Address
Size Prefix toggles the operand size or the effective
address size, respectively, to the value “opposite”
from the Default setting. For example, if the default
operand size is for 32-bit data operations, then pres- 
ence of the Operand Size Prefix toggles the instruc- 
tion to 16-bit data operation. As another example, if 
the default effective address size is 16 bits, pres- 
ence of the Effective Address Size prefix toggles the 
instruction to use 32-bit effective address computa- 
tions. 
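
The toggling rule described above can be sketched in a few lines of Python. This is an illustrative model only, not part of the data sheet: the function name `effective_sizes` is hypothetical, and the prefix byte values 66H (Operand Size) and 67H (Effective Address Size) are the standard x86 encodings, which this section does not itself state.

```python
# Standard x86 prefix byte values (assumed; not given in this section).
OPERAND_SIZE_PREFIX = 0x66
ADDRESS_SIZE_PREFIX = 0x67


def effective_sizes(d_bit, prefixes):
    """Return (operand_size, address_size) in bits for one instruction.

    d_bit    -- D bit of the code segment descriptor (0 is assumed in
                Real Address Mode and Virtual 8086 Mode)
    prefixes -- iterable of prefix bytes preceding the opcode
    """
    default = 32 if d_bit else 16
    opposite = 16 if d_bit else 32
    prefixes = set(prefixes)  # order of prefixes is unimportant
    operand = opposite if OPERAND_SIZE_PREFIX in prefixes else default
    address = opposite if ADDRESS_SIZE_PREFIX in prefixes else default
    return operand, address
```

For example, with D = 1 an Operand Size Prefix selects 16-bit data while addresses stay 32 bits wide; with D = 0 both prefixes together select 32-bit operands and 32-bit addressing.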
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These 32-bit extensions are available in all 486 Mi- 
croprocessor modes, including the Real Address 
Mode or the Virtual 8086 Mode. In these modes the 
default is always 16 bits, so prefixes are needed to 
specify 32-bit operands or addresses. For instructions
with more than one prefix, the order of prefixes
is unimportant.


Unless specified otherwise, instructions with 8-bit 
and 16-bit operands do not affect the contents of 
the high-order bits of the extended registers. 


10.2.3 ENCODING OF INTEGER 
INSTRUCTION FIELDS 


Within the instruction are several fields indicating 
register selection, addressing mode and so on. The 
exact encodings of these fields are defined immedi- 
ately ahead. 


10.2.3.1 Encoding of Operand Length (w) Field 


For any given instruction performing a data operation,
the instruction is executing as a 32-bit operation
or a 16-bit operation. Within the constraints of the
operation size, the w field encodes the operand size
as either one byte or the full operation size, as
shown in the table below.

w Field   Operand Size During        Operand Size During
          16-Bit Data Operations     32-Bit Data Operations
0         8 Bits                     8 Bits
1         16 Bits                    32 Bits
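
As a minimal sketch of the w field rule above (the function name `operand_size_bits` is hypothetical and chosen only for illustration):

```python
def operand_size_bits(w, operation_size):
    """Operand size in bits selected by the w field.

    operation_size is 16 or 32, as determined by the D bit and any
    Operand Size Prefix.
    """
    if operation_size not in (16, 32):
        raise ValueError("operation size must be 16 or 32 bits")
    if w not in (0, 1):
        raise ValueError("w is a 1-bit field")
    # w = 0 always selects a byte; w = 1 selects the full operation size.
    return 8 if w == 0 else operation_size
```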


10.2.3.2 Encoding of the General 
Register (reg) Field 


The general register is specified by the reg field, 
which may appear in the primary opcode bytes, or as 
the reg field of the “mod r/m” byte, or as the r/m
field of the “mod r/m” byte.


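Encoding of reg Field When w Field
is not Present in Instruction

reg Field   Register Selected          Register Selected
            During 16-Bit              During 32-Bit
            Data Operations            Data Operations
000         AX                         EAX
001         CX                         ECX
010         DX                         EDX
011         BX                         EBX
100         SP                         ESP
101         BP                         EBP
110         SI                         ESI
111         DI                         EDI

Encoding of reg Field When w Field
is Present in Instruction

Register Specified by reg Field
During 16-Bit Data Operations:

reg         Function of w Field
            (when w = 0)   (when w = 1)
000         AL             AX
001         CL             CX
010         DL             DX
011         BL             BX
100         AH             SP
101         CH             BP
110         DH             SI
111         BH             DI

Register Specified by reg Field
During 32-Bit Data Operations:

reg         Function of w Field
            (when w = 0)   (when w = 1)
000         AL             EAX
001         CL             ECX
010         DL             EDX
011         BL             EBX
100         AH             ESP
101         CH             EBP
110         DH             ESI
111         BH             EDI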
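
The reg field lookup can be expressed as a small table-driven helper. This is a sketch under stated assumptions: the register orderings are the standard x86 encodings rather than values read from this copy of the tables, and `decode_reg` is a hypothetical name.

```python
# Standard x86 register orderings for the 3-bit reg field (assumed).
REG8 = ("AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH")
REG16 = ("AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI")
REG32 = tuple("E" + name for name in REG16)


def decode_reg(reg, operation_size, w=None):
    """Name the register selected by a 3-bit reg field.

    w is None when the instruction has no w field; otherwise w = 0
    selects a byte register and w = 1 a full-size register.
    """
    if not 0 <= reg <= 7:
        raise ValueError("reg is a 3-bit field")
    if w == 0:
        return REG8[reg]
    return REG32[reg] if operation_size == 32 else REG16[reg]
```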
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10.2.3.3 Encoding of the Segment 
Register (sreg) Field 


The sreg field in certain instructions is a 2-bit field 
allowing one of the four 80286 segment registers to 
be specified. The sreg field in other instructions is a 
3-bit field, allowing the 486 Microprocessor FS and 
GS segment registers to be specified. 


2-Bit sreg2 Field

2-Bit          Segment Register
sreg2 Field    Selected
00             ES
01             CS
10             SS
11             DS

3-Bit sreg3 Field

3-Bit          Segment Register
sreg3 Field    Selected
000            ES
001            CS
010            SS
011            DS
100            FS
101            GS
110            do not use
111            do not use
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10.2.3.4 Encoding of Address Mode 


Except for special instructions, such as PUSH or 
POP, where the addressing mode is pre-determined, 
the addressing mode for the current instruction is 
specified by addressing bytes following the primary 
opcode. The primary addressing byte is the “mod
r/m” byte, and a second byte of addressing information,
the “s-i-b” (scale-index-base) byte, can be
specified.


The s-i-b byte (scale-index-base byte) is specified
when using 32-bit addressing mode and the “mod
r/m” byte has r/m = 100 and mod = 00, 01 or 10.
When the s-i-b byte is present, the 32-bit addressing
mode is a function of the mod, ss, index, and base
fields.
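
The s-i-b presence rule above can be sketched as follows. The helper names `split_modrm` and `has_sib` are hypothetical; the bit positions follow Figure 10.1 (mod in bits 7-6, TTT/reg in bits 5-3, r/m in bits 2-0).

```python
def split_modrm(byte):
    """Split a "mod r/m" byte into its (mod, reg, rm) fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111


def has_sib(modrm, address_size):
    """True when an s-i-b byte follows the "mod r/m" byte.

    The s-i-b byte exists only in 32-bit addressing mode, with
    r/m = 100 and mod = 00, 01 or 10.
    """
    mod, _reg, rm = split_modrm(modrm)
    return address_size == 32 and rm == 0b100 and mod != 0b11
```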


The primary addressing byte, the “mod r/m” byte, 
also contains three bits (shown as TTT in Figure 
10.1) sometimes used as an extension of the pri- 
mary opcode. The three bits, however, may also be 
used as a register field (reg). 


When calculating an effective address, either 16-bit 
addressing or 32-bit addressing is used. 16-bit ad- 
dressing uses 16-bit address components to calcu- 
late the effective address while 32-bit addressing 
uses 32-bit address components to calculate the
effective address. When 16-bit addressing is used, the
“mod r/m” byte is interpreted as a 16-bit addressing
mode specifier. When 32-bit addressing is used, the
“mod r/m” byte is interpreted as a 32-bit addressing
mode specifier.


The following tables define all encodings of the
16-bit addressing modes and 32-bit addressing modes.

Encoding of 16-bit Address Mode with “mod r/m” Byte

mod r/m   Effective Address      mod r/m   Effective Address
00 000    DS:[BX + SI]           10 000    DS:[BX + SI + d16]
00 001    DS:[BX + DI]           10 001    DS:[BX + DI + d16]
00 010    SS:[BP + SI]           10 010    SS:[BP + SI + d16]
00 011    SS:[BP + DI]           10 011    SS:[BP + DI + d16]
00 100    DS:[SI]                10 100    DS:[SI + d16]
00 101    DS:[DI]                10 101    DS:[DI + d16]
00 110    DS:d16                 10 110    SS:[BP + d16]
00 111    DS:[BX]                10 111    DS:[BX + d16]

01 000    DS:[BX + SI + d8]      11 000    register (see below)
01 001    DS:[BX + DI + d8]      11 001    register (see below)
01 010    SS:[BP + SI + d8]      11 010    register (see below)
01 011    SS:[BP + DI + d8]      11 011    register (see below)
01 100    DS:[SI + d8]           11 100    register (see below)
01 101    DS:[DI + d8]           11 101    register (see below)
01 110    SS:[BP + d8]           11 110    register (see below)
01 111    DS:[BX + d8]           11 111    register (see below)

Register Specified by r/m
During 16-Bit Data Operations

mod r/m   Function of w Field
          (when w = 0)   (when w = 1)
11 000    AL             AX
11 001    CL             CX
11 010    DL             DX
11 011    BL             BX
11 100    AH             SP
11 101    CH             BP
11 110    DH             SI
11 111    BH             DI

Register Specified by r/m
During 32-Bit Data Operations

mod r/m   Function of w Field
          (when w = 0)   (when w = 1)
11 000    AL             EAX
11 001    CL             ECX
11 010    DL             EDX
11 011    BL             EBX
11 100    AH             ESP
11 101    CH             EBP
11 110    DH             ESI
11 111    BH             EDI
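Encoding of 32-bit Address Mode with “mod r/m” byte (no “s-i-b” byte present):

mod r/m   Effective Address      mod r/m   Effective Address
00 000    DS:[EAX]               10 000    DS:[EAX + d32]
00 001    DS:[ECX]               10 001    DS:[ECX + d32]
00 010    DS:[EDX]               10 010    DS:[EDX + d32]
00 011    DS:[EBX]               10 011    DS:[EBX + d32]
00 100    s-i-b is present       10 100    s-i-b is present
00 101    DS:d32                 10 101    SS:[EBP + d32]
00 110    DS:[ESI]               10 110    DS:[ESI + d32]
00 111    DS:[EDI]               10 111    DS:[EDI + d32]

01 000    DS:[EAX + d8]          11 000    register (see below)
01 001    DS:[ECX + d8]          11 001    register (see below)
01 010    DS:[EDX + d8]          11 010    register (see below)
01 011    DS:[EBX + d8]          11 011    register (see below)
01 100    s-i-b is present       11 100    register (see below)
01 101    SS:[EBP + d8]          11 101    register (see below)
01 110    DS:[ESI + d8]          11 110    register (see below)
01 111    DS:[EDI + d8]          11 111    register (see below)

Register Specified by reg or r/m
during 16-Bit Data Operations:

mod r/m   Function of w field
          (when w = 0)   (when w = 1)
11 000    AL             AX
11 001    CL             CX
11 010    DL             DX
11 011    BL             BX
11 100    AH             SP
11 101    CH             BP
11 110    DH             SI
11 111    BH             DI

Register Specified by reg or r/m
during 32-Bit Data Operations:

mod r/m   Function of w field
          (when w = 0)   (when w = 1)
11 000    AL             EAX
11 001    CL             ECX
11 010    DL             EDX
11 011    BL             EBX
11 100    AH             ESP
11 101    CH             EBP
11 110    DH             ESI
11 111    BH             EDI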
Encoding of 32-bit Address Mode (“mod r/m” byte and “s-i-b” byte present):

mod base   Effective Address
00 000     DS:[EAX + (scaled index)]
00 001     DS:[ECX + (scaled index)]
00 010     DS:[EDX + (scaled index)]
00 011     DS:[EBX + (scaled index)]
00 100     SS:[ESP + (scaled index)]
00 101     DS:[d32 + (scaled index)]
00 110     DS:[ESI + (scaled index)]
00 111     DS:[EDI + (scaled index)]

01 000     DS:[EAX + (scaled index) + d8]
01 001     DS:[ECX + (scaled index) + d8]
01 010     DS:[EDX + (scaled index) + d8]
01 011     DS:[EBX + (scaled index) + d8]
01 100     SS:[ESP + (scaled index) + d8]
01 101     SS:[EBP + (scaled index) + d8]
01 110     DS:[ESI + (scaled index) + d8]
01 111     DS:[EDI + (scaled index) + d8]

10 000     DS:[EAX + (scaled index) + d32]
10 001     DS:[ECX + (scaled index) + d32]
10 010     DS:[EDX + (scaled index) + d32]
10 011     DS:[EBX + (scaled index) + d32]
10 100     SS:[ESP + (scaled index) + d32]
10 101     SS:[EBP + (scaled index) + d32]
10 110     DS:[ESI + (scaled index) + d32]
10 111     DS:[EDI + (scaled index) + d32]

NOTE:
Mod field in “mod r/m” byte; ss, index, base fields in
“s-i-b” byte.

ss    Scale Factor
00    x1
01    x2
10    x4
11    x8

index   Index Register
000     EAX
001     ECX
010     EDX
011     EBX
100     no index reg**
101     EBP
110     ESI
111     EDI

**IMPORTANT NOTE:
When index field is 100, indicating “no index register,” then
ss field MUST equal 00. If index is 100 and ss does not
equal 00, the effective address is undefined.
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10.2.3.5 Encoding of Operation 
Direction (d) Field 


In many two-operand instructions the d field is pres- 
ent to indicate which operand is considered the 
source and which is the destination. 


d   Direction of Operation

0   Register/Memory <-- Register
    “reg” Field Indicates Source Operand;
    “mod r/m” or “mod ss index base” Indicates
    Destination Operand

1   Register <-- Register/Memory
    “reg” Field Indicates Destination Operand;
    “mod r/m” or “mod ss index base” Indicates
    Source Operand


10.2.3.6 Encoding of Sign-Extend (s) Field 


The s field occurs primarily in instructions with
immediate data fields. The s field has an effect only if
the size of the immediate data is 8 bits and is being
placed in a 16-bit or 32-bit destination.

s   Effect on Immediate Data8       Effect on Immediate Data 16/32
0   None                            None
1   Sign-Extend Data8 to Fill       None
    16-Bit or 32-Bit Destination
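
The effect of s = 1 on an 8-bit immediate can be illustrated with a short helper (the name `sign_extend_imm8` is hypothetical):

```python
def sign_extend_imm8(imm8, dest_bits):
    """Sign-extend an 8-bit immediate to a 16-bit or 32-bit destination."""
    if dest_bits not in (16, 32):
        raise ValueError("destination must be 16 or 32 bits")
    imm8 &= 0xFF
    # Bit 7 is the sign bit of the 8-bit immediate.
    value = imm8 - 0x100 if imm8 & 0x80 else imm8
    # Return the two's complement bit pattern of the destination width.
    return value & ((1 << dest_bits) - 1)
```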


10.2.3.7 Encoding of Conditional 
Test (tttn) Field 


For the conditional instructions (conditional jumps
and set on condition), tttn is encoded with n indicating
whether to use the condition (n = 0) or its negation (n = 1),
and ttt giving the condition to test.

Mnemonic   Condition                                tttn
O          Overflow                                 0000
NO         No Overflow                              0001
B/NAE      Below/Not Above or Equal                 0010
NB/AE      Not Below/Above or Equal                 0011
E/Z        Equal/Zero                               0100
NE/NZ      Not Equal/Not Zero                       0101
BE/NA      Below or Equal/Not Above                 0110
NBE/A      Not Below or Equal/Above                 0111
S          Sign                                     1000
NS         Not Sign                                 1001
P/PE       Parity/Parity Even                       1010
NP/PO      Not Parity/Parity Odd                    1011
L/NGE      Less Than/Not Greater or Equal           1100
NL/GE      Not Less Than/Greater or Equal           1101
LE/NG      Less Than or Equal/Not Greater Than      1110
NLE/G      Not Less or Equal/Greater Than           1111
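
A minimal decoding sketch for the tttn field, assuming the standard x86 condition ordering (O, B, E, BE, S, P, L, LE for ttt = 000 through 111); `decode_tttn` is a hypothetical name:

```python
# Conditions selected by the 3-bit ttt value (standard x86 ordering, assumed).
CONDITIONS = ("O", "B", "E", "BE", "S", "P", "L", "LE")


def decode_tttn(tttn):
    """Return the condition mnemonic for a 4-bit tttn field."""
    if not 0 <= tttn <= 0b1111:
        raise ValueError("tttn is a 4-bit field")
    ttt, n = tttn >> 1, tttn & 1
    # n = 0 uses the condition; n = 1 uses its negation.
    return ("N" if n else "") + CONDITIONS[ttt]
```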


10.2.3.8 Encoding of Control or Debug 
or Test Register (eee) Field 


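The eee field is used in instructions that load and store
the Control, Debug and Test registers.

When Interpreted as Control Register Field

eee Code   Reg Name
000        CR0
010        CR2
011        CR3

Do not use any other encoding

When Interpreted as Debug Register Field

eee Code   Reg Name
000        DR0
001        DR1
010        DR2
011        DR3
110        DR6
111        DR7

Do not use any other encoding

When Interpreted as Test Register Field

eee Code   Reg Name
011        TR3
100        TR4
101        TR5
110        TR6
111        TR7

Do not use any other encoding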
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10.2.4 ENCODING OF FLOATING POINT 
INSTRUCTION FIELDS


Instructions for the FPU assume one of the five 
forms shown in the following table. In all cases, in- 
structions are at least two bytes long and begin with 
the bit pattern 11011B. 
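
Since every FPU instruction begins with the bit pattern 11011B, the first opcode byte falls in the range D8H to DFH. A one-line check (the function name `is_fpu_opcode` is hypothetical, and any instruction prefixes are assumed to have been skipped already):

```python
def is_fpu_opcode(first_opcode_byte):
    """True when the upper five bits of the opcode byte are 11011B."""
    return (first_opcode_byte & 0xFF) >> 3 == 0b11011
```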


OP = Instruction opcode, possibly split into two
     fields OPA and OPB.

MF = Memory Format
     00 = 32-bit real
     01 = 32-bit integer
     10 = 64-bit real
     11 = 16-bit integer

P = Pop
     0 = Do not pop stack
     1 = Pop stack after operation

d = Destination
     0 = Destination is ST(0)
     1 = Destination is ST(i)

R XOR d = 0: Destination (op) Source
R XOR d = 1: Source (op) Destination

ST(i) = Register stack element i
     000 = Stack top
     001 = Second stack element
      .
      .
      .
     111 = Eighth stack element

mod (Mode field) and r/m (Register/Memory specifi- 
er) have the same interpretation as the correspond- 
ing fields of the integer instructions. 


s-i-b (Scale Index Base) byte and disp (displace- 
ment) are optionally present in instructions that have 
mod and r/m fields. Their presence depends on the 
values of mod and r/m, as for integer instructions. 


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11.0 DIFFERENCES BETWEEN THE 
i486™ MICROPROCESSOR AND 
THE 386™ MICROPROCESSOR 
PLUS THE 387™ MATH 
COPROCESSOR EXTENSION 


The differences between the 486 microprocessor 
and the 386 microprocessor are due to performance 
enhancements. The differences between the
microprocessors are listed below.


1. Instruction clock counts have been reduced to 
achieve higher performance. See Section 10. 


2. The 486 microprocessor bus is significantly faster 
than the 386 microprocessor bus. Differences in- 
clude a 1X clock, parity support, burst cycles, 
cacheable cycles, cache invalidate cycles and 8- 
bit bus support. The Hardware Interface and Bus 
Operation Sections (Sections 6 and 7) of the data 
sheet should be carefully read to understand the 
486 microprocessor bus functionality. 


3. To support the on-chip cache new bits have been 
added to control register 0 (CD and NW) (Section 
2.1.2.1), new pins have been added to the bus 
(Section 6) and new bus cycle types have been 
added (Section 7). The on-chip cache needs to
be enabled after reset by clearing the CD and
NW bits in CR0.


4. The complete 387 math coprocessor instruction 
set and register set have been added. No I/O 
cycles are performed during Floating Point in- 
structions. The instruction and data pointers are 
set to 0 after FINIT/FSAVE. Interrupt 9 can no 
longer occur, interrupt 13 occurs instead. 


5. The 486 microprocessor supports new floating
point error reporting modes to guarantee DOS
compatibility. These new modes required a new
bit in control register 0 (NE) (Section 2.1.2.1) and
new pins (FERR# and IGNNE#) (Sections 6.2.13
and 7.2.14).


6. In some cases FERR# is asserted when the next
floating point instruction is encountered and in
other cases it is asserted before the next floating
point instruction is encountered, depending upon
the execution state of the instruction causing the
exception (see Sections 6.2.13 and 7.2.14). For
both of these cases, the 387 Math Coprocessor
asserts ERROR# when the error occurs and does not
wait for the next floating point instruction to be
encountered.


7. Six new instructions have been added:
Byte Swap (BSWAP)
Exchange-and-Add (XADD)
Compare and Exchange (CMPXCHG)
Invalidate Data Cache (INVD)
Write-back and Invalidate Data Cache (WBINVD)
Invalidate TLB Entry (INVLPG)


8. There are two new bits defined in control regis- 
ter 3, the page table entries and page directory 
entries (PCD and PWT) (Section 4.5.2.5). 


9. A new page protection feature has been added.
This feature required a new bit in control register 
0 (WP) (Section 2.1.2.1 and 4.5.3). 


10. A new Alignment Check feature has been add- 
ed. This feature required a new bit in the flags 
register (AC) (Section 2.1.1.3) and a new bit in 
control register 0 (AM) (Section 2.1.2.1). 


11. The replacement algorithm for the translation 
lookaside buffer has been changed from a ran- 
dom algorithm to a pseudo least recently used 
algorithm like that used by the on-chip cache. 
See Section 5.5 for a description of the algo- 
rithm. 


12. Three new testability registers, TR3, TR4 and 
TR5, have been added for testing the on-chip 
cache. TLB testability has been enhanced. See 
Section 8. 


13. The prefetch queue has been increased from 16 
bytes to 32 bytes. A jump always needs to exe- 
cute after modifying code to guarantee correct 
execution of the new instruction. 


14. After reset, the ID in the upper byte of the DX 
register is 04. The contents of the base regis- 
ters including the floating point registers may be 
different after reset. 
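
Several items in the list above refer to new control register 0 bits. The sketch below models item 3 (enabling the on-chip cache by clearing CD and NW) on a plain integer. It is illustrative only: the bit positions are taken from the standard CR0 layout (NE bit 5, WP bit 16, AM bit 18, NW bit 29, CD bit 30), not from this section, and `enable_cache` is a hypothetical helper. Real code would read and write CR0 with privileged MOV instructions.

```python
# CR0 bit masks (standard layout; assumed, see Section 2.1.2.1 of the data sheet).
CR0_NE = 1 << 5    # Numeric Error reporting mode
CR0_WP = 1 << 16   # Write Protect
CR0_AM = 1 << 18   # Alignment Mask
CR0_NW = 1 << 29   # Not Write Through
CR0_CD = 1 << 30   # Cache Disable


def enable_cache(cr0):
    """Return the CR0 value with CD and NW cleared (cache enabled)."""
    return cr0 & ~(CR0_CD | CR0_NW) & 0xFFFFFFFF
```

For example, `enable_cache(0x60000010)` takes a value with both CD and NW set and returns `0x00000010`.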




12.0 ELECTRICAL DATA


The following sections describe recommended electrical
connections for the 486 microprocessor, and
its electrical specifications.


12.1 Power and Grounding 


12.1.1 POWER CONNECTIONS 


The 486 microprocessor is implemented in CHMOS
IV technology and has modest power requirements.
However, its high clock frequency output buffers can
cause power surges as multiple output buffers drive
new signal levels simultaneously. For clean on-chip
power distribution at high frequency, 24 Vcc and 28
Vss pins feed the 486 microprocessor.


Power and ground connections must be made to all 
external Vcc and GND pins of the 486 microproces- 
sor. On the circuit board, all Vcc pins must be con- 
nected on a Vcc plane. All Vss pins must be like- 
wise connected on a GND plane. 


12.1.2 POWER DECOUPLING 
RECOMMENDATIONS 


Liberal decoupling capacitance should be placed 
near the 486 microprocessor. The 486 microproces- 
sor driving its 32-bit parallel address and data bus- 
ses at high frequencies can cause transient power 
surges, particularly when driving large capacitive 
loads. 


Low inductance capacitors and interconnects are 
recommended for best high frequency electrical per- 
formance. Inductance can be reduced by shortening 
circuit board traces between the 486 microproces- 
sor and decoupling capacitors as much as possible. 
Capacitors specifically for PGA packages are also 
commercially available. 


12.1.3 OTHER CONNECTION 
RECOMMENDATIONS 


N.C. pins should always remain unconnected. 


For reliable operation, always connect unused in- 
puts to an appropriate signal level. Active LOW in- 
puts should be connected to Vcc through a pullup 
resistor. Pullups in the range of 20 kΩ are
recommended. Active HIGH inputs should be connected to
GND. 


12.2 Maximum Ratings 


Table 12.1 is a stress rating only, and functional
operation at the maximums is not guaranteed. Functional
operating conditions are given in 12.3 D.C.
Specifications and 12.4 A.C. Specifications.


Extended exposure to the Maximum Ratings may
affect device reliability. Furthermore, although the 486
microprocessor contains protective circuitry to resist
damage from static electric discharge, always take
precautions to avoid high static voltages or electric
fields.

Table 12.1. Absolute Maximum Ratings

Case Temperature under Bias ........ −65°C to +110°C
Storage Temperature ................ −65°C to +150°C
Voltage on Any Pin with
  Respect to Ground ................ −0.5 to Vcc + 0.5V
Supply Voltage with
  Respect to Vss ................... −0.5V to +6.5V


12.3 D.C. Specifications 


Functional Operating Range: Vcc = 5V ±5%; Tcase = 0°C to +85°C


Table 12.2. DC Parametric Values

Parameter                          Notes
Input Low Voltage
Input High Voltage
Output Low Voltage                 (Note 1)
Output High Voltage                (Note 2)
Power Supply Current (25 MHz)      (Note 3)
Power Supply Current (33 MHz)      (Note 3)
Input Leakage Current              (Note 4)
Input Leakage Current              (Note 5)
Input Leakage Current              (Note 6)
Output Leakage Current
Input Capacitance                  Fc = 1 MHz (Note 7)

NOTES:
1. This parameter is measured at:
     Address, Data, BEn      4.0 mA
     Definition, Control     5.0 mA
2. This parameter is measured at:
     Address, Data, BEn      −1.0 mA
     Definition, Control     −0.9 mA
3. Typical supply current:
     550 mA @ 25 MHz
     700 mA @ 33 MHz
4. This parameter is for inputs without internal pullups or pulldowns and 0 < Vin < Vcc.
5. This parameter is for inputs with internal pulldowns and Vih = 2.4V.
6. This parameter is for inputs with internal pullups and Vil = 0.45V.
7. Not 100% tested.


12.4 A.C. Specifications 


The A.C. specifications, given in Table 12.3, consist 
of output delays, input setup requirements and input 
hold requirements. All A.C. specifications are rela- 
tive to the rising edge of the CLK signal. 


A.C. specifications measurement is defined by Fig- 
ures 12.1-12.3. Inputs must be driven to the voltage 
levels indicated by Figure 12.3 when A.C. specifications
are measured. 486 microprocessor output delays
are specified with minimum and maximum limits,
measured as shown. The minimum 486 microprocessor
delay times are hold times provided to external
circuitry. 486 microprocessor input setup and
hold times are specified as minimums, defining the
smallest acceptable sampling window. Within the
sampling window, a synchronous input signal must
be stable for correct 486 microprocessor operation.

Table 12.3. 25 MHz i486 Microprocessor A.C. Characteristics
Vcc = 5V ±5%; Tcase = 0°C to +85°C; CL = 50 pF unless otherwise specified

Parameters listed (numeric limits not reproduced here):

Frequency (1X Clock Driven to 486)
CLK Period Stability (Adjacent Clocks)
CLK Fall Time
CLK Rise Time
A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#,
  LOCK#, FERR#, BREQ, HLDA Valid Delay
A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#,
  LOCK# Float Delay
PCHK# Valid Delay
BLAST#, PLOCK# Valid Delay
BLAST#, PLOCK# Float Delay
RDY#, BRDY# Hold Time
HOLD, AHOLD, BOFF# Setup Time
HOLD, AHOLD, BOFF# Hold Time
RESET, FLUSH#, A20M#, NMI, INTR, IGNNE# Setup Time
RESET, FLUSH#, A20M#, NMI, INTR, IGNNE# Hold Time
D0-D31, DP0-3, A4-A31 Read Setup Time
D0-D31, DP0-3, A4-A31 Read Hold Time

NOTE:
1. Not 100% tested. Guaranteed by design characterization.

Table 12.3. 33 MHz i486 Microprocessor A.C. Characteristics
Vcc = 5V ±5%; Tcase = 0°C to +85°C; CL = 50 pF unless otherwise specified

Parameters listed (numeric limits not reproduced here):

Frequency (1X Clock Driven to 486)
CLK Fall Time
CLK Rise Time
A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#,
  LOCK#, FERR#, BREQ, HLDA Valid Delay
A2-A31, PWT, PCD, BE0-3#, M/IO#, D/C#, W/R#, ADS#,
  LOCK# Float Delay
PCHK# Valid Delay
BLAST#, PLOCK# Valid Delay
t10   D0-D31, DP0-3 Write Data Valid Delay
t11   D0-D31, DP0-3 Write Data Float Delay (Note 1)
RESET, FLUSH#, A20M#, NMI, INTR, IGNNE# Setup Time
RESET, FLUSH#, A20M#, NMI, INTR, IGNNE# Hold Time
D0-D31, DP0-3, A4-A31 Read Setup Time
D0-D31, DP0-3, A4-A31 Read Hold Time

NOTE:
1. Not 100% tested. Guaranteed by design characterization.

Figure 12.2. Input Setup and Hold Timing

Figure 12.3. Input Setup and Hold Timing

Figure 12.4. PCHK# Valid Delay Timing


Figure 12.5. Output Valid Delay Timing

Figure 12.6. Maximum Float Delay Timing

12.4.1 Typical Output Valid Delay versus Load
Capacitance Under Worst Case
Conditions

[Graph 240440-75: typical output delay (ns), relative to nom, versus CL (picofarads) from 25 pF to 150 pF.]

NOTE:
This graph will not be linear outside of the CL range shown.
nom = nominal value given in A.C. Characteristics table.


12.4.2 Typical Output Rise Time versus Load
Capacitance Under Worst-Case
Conditions

[Graph 240440-76: rise time (ns), 0.8V to 2.0V, versus CL (picofarads) up to 150 pF.]

NOTE:
This graph will not be linear outside of the CL range shown.

12.5 Designing for ICD-486 
(Advance Information) 


The ICD-486 (In-Circuit Debugger) is a hardware
assisted debugger for the 486 CPU. To use the
ICD-486, the 486 CPU component must be removed
from its socket and replaced with the ICD-486 module.
Because of the high operating frequency of 486 CPU 
systems, there is no buffering of signals between the 
486 CPU in the ICD-486 and the target system. A 
direct result of the non-buffered interconnect is that 
the ICD-486 shares the address and data bus of the 
target system. In order for the ICD-486 to function 
properly (without the Optional Isolation Board in- 
stalled), the design of the target system must meet 
the following restrictions: 


1. The bus controller must only enable data trans- 
ceivers onto the data bus during valid read cycles 
of the 486 CPU, other local devices, or other bus 
masters. 


2. Before another bus master drives the local proc- 
essor address bus, the other bus master must 
gain access to the address bus through the use 
of HOLD-HLDA, AHOLD, or BOFF #. 


In addition to the above restrictions, the ICD-486 has
several electrical and mechanical characteristics 
that should be taken into consideration when de- 
signing the 486 CPU system. 


Capacitive Loading: ICD-486 adds up to 30 pF to the 
CLK signal, and up to 20 pF to each of the other 486 
CPU signals. 


DC Loading: ICD-486 adds ±15 µA loading to the
CLK and data bus signals and ±5 µA loading to the
address and control signals.


Power Requirements: For noise immunity and 
CMOS latch-up protection the ICD-486 is powered 
by the target system through the power and ground 
pins of the 486 CPU socket. The circuitry on the 
ICD-486 draws up to 1.3A excluding the 486 CPU Icc.


No Connects: Pins specified as N.C. in the 486 CPU 
pin description must be left unconnected. Connec- 
tion of any of these pins to power, ground, or any 
other signal may cause the processor or the ICD- 
486 to malfunction. 


486 CPU Location and Orientation: The ICD-486
may require lateral clearance. Figure 12.4 shows the
clearance requirements of the ICD-486.

Optional Isolation Board (OIB)


Due to its unbuffered design, the ICD-486 is susceptible
to errors on the target system’s bus. The OIB
installs between the ICD-486 and 486 CPU socket in 
the target system and allows the ICD-486 to function 
in systems with faults (i.e., shorted signals). After 
electrical verification the OIB may be removed. The 
OIB has the following electrical and mechanical 
characteristics: 


Buffer Characteristics: The OIB buffers the address 
and data busses as well as the byte enables, ADS#, 
W/R#, M/IO#, BLAST#, and HLDA. The buffers 
are advanced CMOS devices and have the following 
DC drive specifications: Ioh = −15 mA, Iol =
64 mA. The propagation delay of each buffer is 5 ns
max driving a 50 pF load. To guarantee proper
operation with the OIB, the clock period should be
increased by the round trip buffer delay (10 ns) unless
the target system design already has enough timing
margin.
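
The clock derating implied by the round trip buffer delay can be estimated as follows. This is an illustrative calculation, assuming the worst-case 5 ns propagation delay applies in each direction; `derated_clock_mhz` is a hypothetical helper.

```python
def derated_clock_mhz(nominal_mhz, buffer_delay_ns=5.0):
    """Clock frequency after lengthening the period by the OIB round trip delay."""
    # Period in ns, plus one buffer delay in each direction.
    period_ns = 1000.0 / nominal_mhz + 2 * buffer_delay_ns
    return 1000.0 / period_ns
```

Under these assumptions a 25 MHz target (40 ns period) would be run at 20 MHz (50 ns period), and a 33 MHz target at a little under 25 MHz, unless the design already has enough timing margin.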


Unbuffered Signals: Signals not listed above as buff- 
ered are passed through the OIB and will have addi- 
tional capacitive loading due to the connectors and 
circuit board of up to 10 pF. 


Power Requirements: The OIB is also powered by 
the target system through the 486 CPU socket and 
requires 0.5A in addition to the ICD-486 and 486 
CPU requirements. 


OIB Clearance Requirements: The OIB requires an
extra 0.55” of vertical clearance in the target system
above the 486 CPU socket.

Figure 12.4a. ICD-486™ Probe Dimensions

Figure 12.4b. ICD-486™ Probe Dimensions

[Drawings 240440-77, 240440-44, 240440-78 and 240440-79: Processor Module Board Dimensions; Processor Module Assembly Dimensions, Top View; Side View; Side View, OIB installed.]
13.0 MECHANICAL DATA

[Package drawing: seating plane, pin diameter B (all pins), swaged pin detail, 45° chamfer (index corner).]

Family: Ceramic Pin Grid Array Package

Symbol   Millimeters                    Inches
         Min     Max     Notes          Min     Max     Notes
A        3.56    4.57                   0.140   0.180
A1       0.64    1.14    SOLID LID      0.025   0.045   SOLID LID
A2       2.8     3.5     SOLID LID      0.110   0.140   SOLID LID
A3       1.14    1.40                   0.045   0.055
B        0.43    0.51                   0.017   0.020
D        44.07
D1       40.51   40.77                  1.595   1.605
e1       2.29    2.79
L                                       0.100   0.130
S1       1.52    2.54                   0.060   0.100

Figure 13.1. 168 Lead Ceramic PGA Package Dimensions

Table 13.1 Ceramic PGA Package Dimension Symbols

Letter or
Symbol     Description of Dimensions
A          Distance from seating plane to highest point of body
A1         Distance between seating plane and base plane (lid)
A2         Distance from base plane to highest point of body
B          Diameter of terminal lead pin
D          Largest overall package dimension of length
e1         Linear spacing between true lead position centerlines
L          Distance from seating plane to end of lead
S1         Other body dimension, outer lead center to edge of body

NOTES:
1. Controlling dimension: millimeter.
2. Dimension “e1” (“e”) is non-cumulative.
3. Seating plane (standoff) is defined by P.C. board hole size: 0.0415-0.0430 inch.
4. Dimensions “B”, “B1” and “C” are nominal.
5. Details of Pin 1 identifier are optional.


13.1 Package Thermal Specifications

The 486 microprocessor is specified for operation
when TC (the case temperature) is within the range
of 0°C-85°C. TC may be measured in any environment
to determine whether the 486 microprocessor
is within specified operating range. The case
temperature should be measured at the center of the
top surface opposite the pins.

The ambient temperature (TA) is guaranteed as long
as TC is not violated. The ambient temperature can
be calculated from θJC and θJA from the following
equations.

TJ = TC + P * θJC
TA = TJ − P * θJA
TC = TA + P * [θJA − θJC]

where TJ, TA, TC = Junction, Ambient and Case
Temperature respectively. θJC, θJA = Junction-to-Case
and Junction-to-Ambient Thermal Resistance,
respectively.

P = Maximum Power Consumption

The values for θJA and θJC are given in Table 13.2
for the 1.75 sq. in., 168-pin, ceramic PGA.

Table 13.3 shows the TA allowable (without exceeding
TC) at various airflows and operating frequencies
(fCLK).

Note that TA is greatly improved by attaching “fins”
or a “heat sink” to the package. P (the maximum
power consumption) is calculated by using the maximum
Icc at 5V as tabulated in the DC Characteristics
of Section 12.
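
The thermal relations TJ = TC + P * θJC and TC = TA + P * (θJA − θJC) can be evaluated numerically as in the sketch below. The function names and the numbers in the usage note are illustrative assumptions, not data sheet values; substitute the maximum Icc from the DC Characteristics and the θ values from Table 13.2.

```python
def max_ambient_temp(tc_max, power_w, theta_ja, theta_jc):
    """Highest ambient temperature (°C) that keeps the case at or below tc_max.

    Rearranged from TC = TA + P * (theta_JA - theta_JC).
    """
    return tc_max - power_w * (theta_ja - theta_jc)


def junction_temp(tc, power_w, theta_jc):
    """Junction temperature (°C) from TJ = TC + P * theta_JC."""
    return tc + power_w * theta_jc
```

With a hypothetical P of 4.0 W, θJA = 12.0 °C/W and θJC = 2.0 °C/W, a case limit of 85°C gives a maximum ambient temperature of 45°C and a junction temperature of 93°C.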


Table 13.2. Thermal Resistance (°C/W) θJC and θJA

                          θJA vs Airflow, ft/min (m/sec)
                   θJC    0      200      400      600      800      1000
                          (0)    (1.04)   (2.03)   (3.04)   (4.06)   (5.07)
With Heat Sink*    2.0    12     8.0      6.0      5.0      4.5      4.25

*0.350” high unidirectional heat sink (Al alloy 6063, 40 mil fin width, 155 mil
center-to-center fin spacing).

[Figure 240440-81: Heat Sink Dimensions]

[Table 13.3: TA with Heat Sink and TA without Heat Sink.]
14.0 SUGGESTED SOURCES FOR
i486 ACCESSORIES

Following are some suggested sources of accessories
for the i486. They are not an endorsement of
any kind, nor a warranty of the performance of any
of the listed products and/or companies.

Sockets

1. McKenzie Technology
   44370 Old Palmspring Blvd.
   Fremont, CA 94538
   Tel: (415) 651-2700

2. E-CAM Technology, Inc.
   14455 North Hayden Rd.
   Suite 208
   Scottsdale, AZ 85260
   Tel: (602) 443-1949

3. Augat Inc. (for sockets with decaps)
   Interconnection Products Group
   33 Perry Ave.
   P.O. Box 779
   Attleboro, MA 02703
   Tel: (508) 222-2202

Heat Sinks/Fins

1. Thermalloy Inc.
   2021 West Valley View Lane
   Dallas, TX 75381-0839
   Tel: (214) 243-4321

2. EG & G Division
   60 Audubon Road
   Wakefield, MA 01880
   Tel: (617) 245-5900

TTL Crystals/Oscillators

1. NEL Frequency Controls, Inc.
   357 Beloit Street
   Burlington, WI 53105
   Tel: (414) 763-3591

2. M-Tron
   P.O. Box 630
   Yankton, SD 57078
   Tel: (605) 665-9321

Debugging Tools

1. Emulation Technology
   2344 Walsh Ave., Building F
   Santa Clara, CA 95051
   Tel: (408) 982-0664
15.0 REVISION HISTORY

Revision -003 of the i486 CPU data sheet contains many updates and improvements to the original version. A revision summary of major changes is listed below:

The sections significantly revised since version -001 are:

Section 2.1.2    The polarity and names of the two cache control bits in Control Register 0 (CR0) have been modified. The Cache Enable (CE) and Writes Transparent (WR) have been renamed Cache Disable (CD) and Not Write Through (NW). The value of CR0 after RESET has been changed to reflect the polarity change.

Section 6.2.15   The discussion of A20M# has been clarified. During the falling edge of RESET, A20M# should be high for proper operation of the CPU.

Section 6.5      The value of CR0 after RESET has been modified.

Section 6.5.1    Figure 6.3, "Pin State during RESET", is added. This figure is a general reference for Reset issues. Previous Figures 8.1, 8.2, and 8.8 have been deleted, since Figure 6.3 now contains Reset information.

Section 7.2.10   A discussion of addresses and byte enables driven during INTA cycles has been added.

Section 10.1     Clock counts and opcodes have been clarified and corrected.

Section 10.1     The opcode slot for the CMPXCHG instruction has been moved from 0FA6/A7 to 0FB0/B1.

Section 12.2     Table 12.1 has been enhanced. The "Case Temperature under Bias" spec was improved. The "Supply Voltage with Respect to VSS" spec was added.

Section 12.3     Maximum ICC values have been improved to 700 mA at 25 MHz and 900 mA at 33 MHz.

Section 12.3     Typical ICC values have been modified to 550 mA at 25 MHz and 700 mA at 33 MHz.

Section 12.3     CIN, CO, and CCLK values have been changed to 20 pF. Testing parameters and Note 7 were added.

Section 12.4     The A.C. Specifications have been improved. Float delays were improved at both 25 MHz and 33 MHz. Note 1 was added to the float delays. Maximum valid delays were reduced at 33 MHz.

Section 12.5     The ICD section was enhanced.

Section 13.1     Thermal resistance θCA values of the 168-pin ceramic package have been corrected.

Section 13.1     Maximum ambient temperatures have been corrected to use the max ICC values.


The sections significantly revised since version -002 are:

Section 2.1.2.1  Spec change for PCD and PWT bits.

Table 2.16       Value of Intel Reserved Interrupt Vector assignment corrected to '18-31'.

Section 3.1      Added CMPXCHG, XADD instructions in the table.

Section 3.5      Added explanation about NMI not being able to bring the processor out of shutdown under certain conditions.

Section 4.4.6    Value of task switching time corrected to 10 ms.

Section 4.5.4    Specification change for PCD and PWT bits.

Section 5.6      Specification change for PCD and PWT bits.

Section 5.7      Cache flushing procedure explained, when FLUSH# is applied synchronously or asynchronously.

Section 6.2.5    Specification change for PLOCK cycle.

Section 6.2.8    Added explanation for warm boot-up.

Section 6.2.12   Specification change for PCD and PWT bits.

Section 6.2.13   Explanation added for FERR# behavior.

Section 6.2.14   Explanation added for IGNNE# behavior.

Section 6.2.15   Explanation added for A20M# behavior in protected mode and during RESET.

Section 6.3      Simplified example for read reordering in write buffers.

Section 6.3.1    Corrected REP OUTS instruction.

Section 6.3.2    Added explanation about cache update on read-modify-write cycle.

Section 6.5      Added RESET pulse length requirement with or without BIST.

Section 6.5      Added table for i486 revision ID.

Table 6.2        Corrected CR0 value after Reset.

Figure 6.3       Corrected pin state diagram during RESET. RESET pulse length changed to 15 CLKs.

Section 7.2.2.3  Added explanation to terminate burst cycle.

Section 7.2.3.4  Clarified text on changing KEN# during cache line fill.

Figure 7.12      Corrected timing diagram to show A4-A31, M/IO#, D/C#, W/R# do not change during burst.

Figure 7.13      Corrected timing diagram to show A4-A31, M/IO#, D/C#, W/R# do not change during burst.

Figure 7.14      Corrected timing diagram to show A4-A31, M/IO#, D/C#, W/R# do not change during burst.

Section 7.2.4.2  Added cases that follow burst order.

Section 7.2.6    Added explanation for read-modify-write for un-aligned transfers.

Section 7.2.7    HOLD latency decreased by providing window in PLOCK cycle (specification change).

Section 7.2.8    Added explanation about EADS# timing.

Section 7.2.8    Added the case of invalidation with BOFF or HOLD.

Figure 7.22      Change in Timing Diagram for BREQ.

Figure 7.23      Change in Timing Diagram for BREQ.

Figure 7.25      Change in Timing Diagram for RDY#/BRDY#.

Section 7.2.9    Added explanation about HOLD getting recognized during un-aligned writes.

Section 7.2.11   Added status of address and data busses during special bus cycles.

Section 7.2.11   Added sections on Halt and Shutdown cycles.

Figure 7.30      Corrected state diagram by ANDing BRDY# and BLAST# for the last transfer of the burst cycle.

Section 7.2.14   Difference in FERR# and ERROR# explained.

Section 8.1      Changed Reset width to 15 CLKs.

Section 8.4      Added explanation on tri-state status.

Table 10.1       Corrected value in format.

Section 11.0     Added Note 6 on FERR# and ERROR# difference.

Section 11.0     Added TLB replacement algorithm for 386 DX.

Section 12.3     Corrected values in Note 2.

Section 12.3     Added "internal" for pullup and pulldown resistors.

Figures 12.2 & 12.3  Waveforms for input and output signals have been re-drawn to show details about set-up, hold and float times.

Section 13.1     Added details about TA calculation from θJC and θJA.

Section 14.0     Added new section on suggested sources of i486 accessories like sockets, debugging tower, heat sinks, etc.
485TURBOCACHE MODULE
i486™ MICROPROCESSOR CACHE UPGRADE
82485MA (64k Module)
82485MB (128k Module)

■ High Performance
  - Zero Waitstate Access
  - One Clock Bursting
  - Two-Way Set Associative
  - BIOS ROM Cacheing
  - 25/33 MHz Operation

■ Range Of Price/Performance
  - 0, 64k, 128k Cache With Single Socket
  - Cascadable With Multiple Sockets

■ High Integration
  - Seven Square Inch Area
  - Includes Tag, Data, Parity, and Controller

■ Easy To Use
  - Software Transparent
  - End User/Dealer Installation
  - Write-Through Memory Update
  - Same Timing as i486™ CPU
  - Same Invalidation Mechanism as i486 CPU


The 485Turbocache Module is a performance upgrade for 25 MHz or 33 MHz i486™ Microprocessor systems. It provides up to 128k bytes of external cache memory in a single, end-user installable module. Support for the cache module upgrade is provided by a 113-pin socket in the i486 CPU system. A single socket allows three price/performance configurations: no cache, a 64k byte cache, or a 128k byte cache. Additional modules may be cascaded for larger cache sizes. No jumpers, configuration software, or BIOS/applications/operating system support is required to get a 5-30% (15% average) performance boost after installing the cache. Cache data integrity is monitored by a parity bit per byte.
Figure 0.1. 485Turbocache Module Internal Block Diagram

Figure 0.2. 485Turbocache Module 64k/128k Pin Configuration

0.2 PIN DESCRIPTION OVERVIEW

Pin Name | Type | Active | Description

CONTROL SIGNALS

CLK - CLOCK is the timing reference from which the 485Turbocache Module monitors and generates events. CLK must be connected to the i486 CPU CLK pin.

RESET High RESET CACHE forces the 485Turbocache Module to begin execution in a known state and must be connected to the i486 CPU RESET pin. It also causes all cache lines to be invalidated. Setup and hold times t23 and t24 must be met for recognition in any specific clock.

ADS# Low ADDRESS STROBE is generated by the i486 Microprocessor. It is used to determine that a new cycle has been started. Setup time t7 must be met for proper operation.

M/IO# MEMORY/IO is an i486 CPU generated cycle definition signal that indicates a Memory (M/IO# high) or I/O (M/IO# low) access. Setup time t7 must be met for proper operation.

W/R# WRITE/READ is an i486 CPU generated cycle definition signal used to indicate a Write (W/R# high) or Read (W/R# low) access. Setup time t7 must be met for proper operation.

START# Low MEMORY START indicates that a cache read miss or a write has occurred and that the current access must be serviced by the memory system. START# is not activated for I/O cycles, and is not asserted if CS# is inactive.

BRDYO# Low BURST READY OUT is a burst ready signal driven by the 485Turbocache Module to the i486 CPU. It is activated when a read hit occurs to the 485Turbocache Module and should be a term in the BRDY# input to the i486 CPU.

CBRDY# Low CACHE BURST READY IN is the burst ready input from the memory system. It is applied to both the 485Turbocache Module and the i486 CPU BRDY# pin in parallel. CBRDY# is ignored during T1 and idle cycles. BLAST# determines the length of the transfer. All cacheable read cycles are 4 dword transfers. Setup and hold times t9 and t10 must be met for proper operation.

CRDY# Low CACHE READY IN is the non-burst ready input from the system. Like CBRDY#, it is applied to both the cache and i486 CPU RDY# pin in parallel. CRDY# is ignored during T1 and idle cycles. Setup and hold times t9 and t10 must be met for proper operation.

BLAST# Low BURST LAST is output by the i486 CPU and is sampled by the 485Turbocache Module to determine when the end of a cycle occurs. Setup and hold times t8 and t8a must be met for proper operation.

BOFF# Low BACKOFF is an i486 CPU input sampled by the 485Turbocache Module to indicate that a cycle be immediately terminated. If BOFF# is sampled active, the 485Turbocache Module will float its data bus. The 485Turbocache Module will ignore all cycles, except invalidation cycles, until BOFF# is deactivated. Setup and hold times t17 and t18 must be met for proper operation.

PRSN# Low PRESENCE is an active low output always asserted by the 485Turbocache Module. It may be used as a 485Turbocache Module presence indicator and should be connected via a 10K pullup resistor.
0.2 PIN DESCRIPTION OVERVIEW (Continued)

ADDRESS SIGNALS

A2-A31 PROCESSOR ADDRESS LINES A2-A31 are the i486 CPU address lines used by the 485Turbocache Module. Address lines A2 and A3 are used as burst address bits. In the 64k 485Turbocache Module, A4-A14 comprise the set address inputs to the 485Turbocache Module and A15-A31 are used as the tag address. In the 128k 485Turbocache Module, A4 becomes a line select input, A5-A15 is the set address input and A16-A31 is used as the tag address. Setup time t5 must be met for proper operation.

BE0#-BE3# BYTE ENABLE inputs are connected to the i486 CPU byte enable outputs. They are specifically used for completing partial reads from and writes to the 485Turbocache Module during hit cycles. During miss cycles, transfers are ignored if all the byte enables are not asserted since the 485Turbocache Module only caches 32-bit transfers. Setup time t6 must be met for proper operation.

CS# Low CHIP SELECT is used to cascade 485Turbocache Modules. Address bits may be decoded in order to cascade multiple devices or be decoded to selectively cache portions of memory. Setup and hold times t30 and t31 must be met for proper operation.

DATA SIGNALS

D0-D31 I/O PROCESSOR DATA LINES D0-D31 are connected to the i486 CPU data bus. D0-D7 define the least significant byte while D24-D31 define the most significant byte. Setup and hold times t13 and t14 must be met for proper operation.

DP0-DP3 I/O DATA PARITY are the parity bits associated with the data on the data bus. They are connected to the i486 CPU pins with the same name. Parity is treated by the 485Turbocache Module as additional data bits to be stored. Setup and hold times t13 and t14 must be met for proper operation.

CACHEABILITY SIGNALS

CKEN# CACHE ENABLE TO CPU is the KEN# term generated by the 485Turbocache Module to the i486 Microprocessor. CKEN# is activated twice: first during T1 to enable a cache line fill, and second on the clock before the last BRDY# or RDY# to validate the line fill. CKEN# is ALWAYS active in T1, but will not validate a line fill if the line fill is a write protected line and WPSTRP# is low, or if the cycle is a read miss.

SKEN# SYSTEM CACHE ENABLE is an input from the main memory system to indicate whether the current line fill is cacheable in the 485Turbocache Module. It is sampled by the 485Turbocache Module exactly like KEN# is sampled by the i486 Microprocessor. Setup and hold times t11 and t12 must be met for proper operation.

FLUSH# FLUSH CACHE causes the 485Turbocache Module to invalidate its entire cache contents regardless of CS#. Any line fill in progress will continue, but will be invalidated immediately. The i486 CPU flush instruction does not affect the 485Turbocache Module. Setup and hold times t23 and t24 must be met for recognition in any specific clock.
0.2 PIN DESCRIPTION OVERVIEW (Continued)

Pin Name | Type | Active | Description

CACHEABILITY SIGNALS (Continued)

WP WRITE PROTECT defines a line as write protected. WP is sampled during the third transfer of a line fill and is maintained internally as a state bit. Any subsequent writes to a write protected line will have no effect. Setup and hold times t15 and t16 must be met for proper operation.

WPSTRP# WRITE PROTECT STRAPPING OPTION changes the behavior of CKEN#. CKEN# is always asserted in T1 to indicate a cacheable line transfer but is deasserted on the next clock. During read hit cycles, CKEN# is asserted again for the duration of the transfer to indicate a cacheable line fill. If WPSTRP# is strapped low, and a write protected line is being transferred, CKEN# is not activated again for the transfer. This prevents the i486 CPU from cacheing write protected lines during read hit cycles. WPSTRP# must be valid and not change two clocks before and after the falling edge of RESET.

INVALIDATE SIGNALS

EADS# Low VALID EXTERNAL ADDRESS STROBE indicates that an invalidation address is present on the i486 CPU address bus. The 485Turbocache Module will invalidate this address, if present, but will only do so if CS# is active. The 485Turbocache Module is capable of accepting an EADS# every other clock. The 485Turbocache Module EADS# should be connected to the i486 CPU EADS# pin. Setup and hold times t19 and t20 must be met for proper operation.


1.0 FUNCTIONAL DESCRIPTION 


1.1 Introduction 


The 485Turbocache Module is a complete 2-way set-associative 64k or 128k cache housed in a 113-pin module. It contains 4 or 8 custom data SRAMs and the Intel 82485 cache controller. The cache module was designed to be cascadable to a maximum of 512k with the addition of more modules. The module was also designed so the system may easily detect a cache's presence and reconfigure itself accordingly. The 485Turbocache Module is a plug-in option that is an ideal i486™ Microprocessor cache solution.


The cache module interfaces directly to the i486 Microprocessor. Designing with the cache module is easy because it directly supports the timing of 25 MHz and 33 MHz systems. It is capable of reading and writing data in 0 waitstates, and performing 1 clock bursting. Because the 485Turbocache Module was designed exclusively for the i486 Microprocessor, it recognizes i486 CPU invalidations, use of BOFF#, and prematurely terminated cycles. The cache module is write-through so it supports the same i486 CPU consistency mechanisms, stores data parity, can cache BIOS in modes where the i486 CPU cannot, is software transparent, and may be an end-user installable upgrade.


Below are the order codes for the 485Turbocache Module:

64k     82485MA-25    82485MA-33
128k    82485MB-25    82485MB-33

The following Functional Description describes the cache module's base architecture, its operation, features, and deviations from the i486 CPU specification.
1.2 Base Architecture


The 485Turbocache Module contains an 82485 
cache controller and 4 (82485MA) or 8 (82485MB) 
SRAMs for a complete 64k or 128k cache. In either 
configuration, the 485Turbocache Module is 2-way 
set-associative with a 16 byte line size. 


Figure 1.1 outlines the 82485 cache controller which is the heart of the 485Turbocache Module. Each WAY contains 2k tags with 17 bits per tag so it may store the complete 4G real address space. The tags also reference 2 valid bits and a write-protect bit. When the 82485 is configured as a 64k cache, as in the 64k 485Turbocache Module, each tag references a single, 16 byte line. When the 82485 is configured as a 128k cache, as in the 128k 485Turbocache Module, each tag is forced to reference two consecutive 16 byte lines; this is called sectoring. A 128k 485Turbocache Module contains 2 sectors per tag. The LS input (address bit A4) determines which sector of each tag is being selected.
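
The address split described above can be sketched in a few lines. This is an illustrative model only: the names `CacheAddress` and `split_address` are hypothetical, and the bit ranges are the ones given in the pin description (A2-A3 burst address, A4-A14 set and A15-A31 tag for 64k; A4 line select, A5-A15 set and A16-A31 tag for 128k).

```python
from typing import NamedTuple


class CacheAddress(NamedTuple):
    tag: int          # compared against the stored tag
    set_index: int    # selects one of the 2k sets in each way
    line_select: int  # sector within a tag (128k only; always 0 for 64k)
    dword: int        # burst address bits A3-A2 (dword within the 16 byte line)


def split_address(addr: int, size_k: int) -> CacheAddress:
    """Split a 32-bit physical address the way a 64k or 128k module would."""
    if size_k not in (64, 128):
        raise ValueError("size_k must be 64 or 128")
    dword = (addr >> 2) & 0x3
    if size_k == 64:
        # A4-A14 = set address (11 bits), A15-A31 = tag (17 bits)
        return CacheAddress((addr >> 15) & 0x1FFFF, (addr >> 4) & 0x7FF, 0, dword)
    # A4 = line select, A5-A15 = set address (11 bits), A16-A31 = tag (16 bits)
    return CacheAddress((addr >> 16) & 0xFFFF, (addr >> 5) & 0x7FF,
                        (addr >> 4) & 0x1, dword)


print(split_address(0x12345678, 64))
print(split_address(0x12345678, 128))
```

As a capacity check, 2 ways x 2k sets x 16 bytes gives 64k, and doubling the lines per tag through sectoring gives 128k.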


The control units of the 82485 are responsible for three main functions: controlling the data SRAMs, controlling the tagram structure, and interfacing to the i486 CPU. Since these are independent units, the 82485 is capable of updating its tagram while data is being bursted into SRAM, or invalidating during a line fill to a different address. Special address registers in the 485Turbocache Module allow the i486 Microprocessor to drop its address in the first T2 (in response to AHOLD) and the system to issue an invalidate address with an i486 CPU hold time.


Figure 1.1. 82485 Cache Controller

The 82485 uses the "Least Recently Used" algorithm to determine which tag should be invalidated on cache misses. A single LRU bit per tag is used to point to the tag that will be replaced should a replacement be required.
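
A minimal sketch of the single-bit LRU policy for one two-way set follows. The class and method names are hypothetical, and the model tracks tags only (no data, valid-bit pairs, or write protection); it simply shows how one bit can point at the way to replace.

```python
class TwoWaySet:
    """One set of a 2-way set-associative cache with a single LRU bit."""

    def __init__(self):
        self.tags = [None, None]  # None marks an invalid (free) way
        self.lru = 0              # index of the way to replace next

    def access(self, tag):
        """Return True on a hit; on a miss, fill a free way or the LRU way."""
        for way, stored in enumerate(self.tags):
            if stored == tag:
                self.lru = 1 - way       # point at the way that was not used
                return True
        # Miss: prefer a free way, otherwise replace the way the LRU bit selects.
        victim = self.tags.index(None) if None in self.tags else self.lru
        self.tags[victim] = tag
        self.lru = 1 - victim
        return False


s = TwoWaySet()
for t in (1, 2, 1, 3):
    print(t, "hit" if s.access(t) else "miss", s.tags)
```

In this run tag 1 is touched again before tag 3 arrives, so tag 2 is the one replaced.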


The data memory portion of the 485Turbocache 
Module is composed of a set of SRAMs that operate 
at fast 33 MHz speeds. They are capable of 0 wait- 
state reads and writes, and single clock bursting, 
and have minimized capacitive loading on the i486 
CPU clock and data lines. 


1.3 Cache Operation 


To operate at high speeds, the 485Turbocache Module must begin its tag lookup to determine a cache hit or miss as soon as possible. During normal operation, this is done as soon as the i486 CPU generates an address. SRAM reads, SRAM writes, and system signals cannot be generated until a hit or miss has been determined. The following sections will discuss read miss, read hit, write, invalidate, and BOFF# cycles.

1.3.1 READ MISS

Figure 1.2 shows 485Turbocache Module activity during a normal read miss cycle. In T1, the 485Turbocache Module begins its tag lookup to see if the read cycle is a hit. Once it has been determined that the address is not present in the cache (a miss), START# is issued to indicate to the memory system that it must service the current cycle. The cache is then idle until SKEN#, the cache's KEN# input, is seen active. Should SKEN# be inactive and the burst line transfer from memory begin, the line is non-cacheable and is ignored.


Figure 1.2. Normal Read Miss Cycle

Once SKEN# has been asserted, the 485Turbocache Module invalidates a line in the cache (or chooses a free line) in preparation for the bursted data (see section 4.3.1). The data is bursted into the cache and back to the i486 Microprocessor simultaneously. If an SKEN# preceded the last bursted item, then the line was cacheable, and the 485Turbocache Module updates its valid bit to indicate so. If the line is invalid, or aborted for any reason (BLAST#, BOFF#), the line is left invalid.

During a read miss cycle, the 485Turbocache Module cannot accept the data from memory in zero waitstates. The earliest data may be returned is the clock after START# is sampled active. START# is the signal that indicates that the memory system must complete the current cycle.

The 485Turbocache Module is also capable of handling non-burst and interrupted burst line fills. Refer to the section "4.0. Performance Considerations" for improving 485Turbocache Module performance during line fills. Note that the 485Turbocache Module only caches 32-bit transfers. The 485Turbocache Module does not input the i486 CPU inputs BS8# or BS16#. All transfers are assumed to be 32-bit transfers with valid data on all 32 data lines.

1.3.2 READ HIT

During Read Hit cycles, the 485Turbocache Module responds directly to the i486 Microprocessor with a line of data in 5 clocks. The 485Turbocache Module asserts CKEN# (its KEN# output to the i486 CPU) in both T1 and the third T2 to indicate this as a cacheable transfer. Should the bursted line be write-protected, AND WPSTRP# is strapped low, CKEN# is high for the third T2 and the line is not cached by the i486 CPU. The only updating the 485Turbocache Module needs to perform during read hit cycles is to update the LRU bit to point to the WAY that was not transferred.

1.3.3 WRITE CYCLES

Since the 485Turbocache Module is a write-through cache, all write cycles are written by the i486 CPU to main memory. Figure 1.3 shows a write hit where the tag lookup in T1 is found to be a hit so the data is updated by the cache in T2. Write misses do not affect cache contents, nor do writes to write protected lines. Write hits will alter the LRU bit in the same way as a read hit.

Figure 1.3. Write Hit Cycle

1.3.4 INVALIDATION CYCLES

The 485Turbocache Module allows invalidation cycles to occur at any time by asserting AHOLD and EADS#. Self-invalidations, where AHOLD is not asserted, are allowed at any time except on the clock edge of the last transfer of a line fill. EADS# assertion allows both the CPU cache and 485Turbocache Module to be invalidated at the same time. Regardless of what the 485Turbocache Module is doing, EADS# causes the address present on the address inputs of the 485Turbocache Module to be invalidated. This includes read hit, read miss, write, and BOFF# cycles.
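
The read, write and invalidation rules above can be summarized in a small behavioral sketch. Everything here is an assumption made for illustration: the class name, the use of a dictionary keyed by line address, unlimited capacity, and a plain dictionary standing in for main memory (so ROM behavior is not modeled). It is not the 82485's implementation.

```python
class WriteThroughCache:
    """Behavioral sketch of the write-through rules, ignoring capacity and timing."""

    def __init__(self):
        self.lines = {}         # line address -> cached data
        self.protected = set()  # line addresses marked write protected

    def read(self, memory, line, write_protect=False):
        """Return (data, hit). A miss fills the line from memory."""
        if line in self.lines:
            return self.lines[line], True
        data = memory[line]
        self.lines[line] = data
        if write_protect:       # WP is sampled during the line fill
            self.protected.add(line)
        return data, False

    def write(self, memory, line, data):
        """All writes go to memory; only hits to unprotected lines update the cache."""
        memory[line] = data
        if line in self.lines and line not in self.protected:
            self.lines[line] = data
        # A write miss does not allocate a line.

    def invalidate(self, line):
        """EADS#-style invalidation of a single line address."""
        self.lines.pop(line, None)
        self.protected.discard(line)


mem = {0x100: "old"}
cache = WriteThroughCache()
print(cache.read(mem, 0x100))   # miss, line filled
cache.write(mem, 0x100, "new")  # write hit updates cache and memory
print(cache.read(mem, 0x100))   # hit with the updated data
```

Keeping memory always up to date is what lets an invalidation simply drop a line without any write-back step.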


There may be a performance penalty, however, if EADS# is asserted at a time when the tag memory of the 485Turbocache Module is in use. Since the 485Turbocache Module tags are single-ported, only one tag access per clock is allowed.

Figure 1.4 shows a read miss cycle with an invalidation lookup occurring in the third transfer of a line fill. Under normal conditions, the 485Turbocache Module would, on the next clock, validate the current line that is being filled. Since the EADS# occurred, the tagram is occupied on the next clock with a tag lookup to see if the invalidate is a hit, and the current line is not yet validated. If it is a hit, the next cycle is used to perform the actual invalidation. The following clock is spent validating the current line fill. Should the i486 Microprocessor begin a cycle immediately, the 485Turbocache Module is not able to perform its tag lookup until one clock cycle later when the tag memory is free. This causes START# to be delayed, and ultimately delays a memory read cycle from beginning.


For greatest performance, EADS# should not be is- 
sued in the second, third or fourth transfer of a 
cache line fill. 


Self-invalidations (EADS# asserted without AHOLD) are not allowed at the clock edge of the last T2 of a cycle (the first T1 clock edge of the next cycle). If a self-invalidation occurs in T1, so that ADS# and EADS# are sampled at the same time, the 485Turbocache Module will invalidate the line and assert START# as in a normal read miss cycle. If EADS# is asserted at any other time, START# is not asserted.


1.3.5 BOFF# CYCLES

When BOFF# is asserted, the 485Turbocache Module, like the i486 Microprocessor, will relinquish the bus in the next clock cycle. While BOFF# is asserted, as at any other time, the 485Turbocache Module monitors EADS# to perform any invalidate cycles.


If BOFF# is asserted during a cache read hit (data is being transferred from cache to CPU), the 485Turbocache Module invalidates the line being transferred. Once BOFF# has been released and the cycle resumes, the 485Turbocache Module sees this as a cache miss and the memory system must supply the remaining data. If BOFF# is asserted during a cache read miss (memory is transferring to cache and CPU), the 485Turbocache Module will treat the line fill like an aborted fill, and the line will remain invalid. Once BOFF# is released and the cycle is restarted, the remainder of the line fill is treated like another aborted fill, and remains invalid.

Figure 1.5 is an example of an aborted line fill. Since the line transfer is interrupted before the transfer completes, it stays invalidated. Once the transfer resumes, the 485Turbocache Module sees a new cycle begin with ADS#, but it completes with BLAST# after three transfers. It treats this as an aborted line fill cycle, and the cycle is never validated.


Asserting BOFF# in the same clock as ADS# will cause the i486 CPU to float its bus in the next clock and leave ADS# floating low. Since ADS# is floating low, a peripheral device may think that a new bus cycle has begun even though the cycle was aborted. The 82485 handles this circumstance in most cases since an active ADS# in the clock BOFF# is deasserted is ignored. The only circumstance that must be handled by the system is as follows:

BOFF# is asserted in T1, and before BOFF# is deasserted, HOLD is asserted and remains asserted after BOFF# is deasserted (see Figure 1.6). In this circumstance it is necessary for the system to assure that ADS# is either driven to a valid level or pulled high in the clock after BOFF# is deasserted (meeting the 82485 ADS# setup time).

Figure 1.4. Invalidation During Read Miss

Figure 1.5. Aborted Line Fill

Figure 1.6. BOFF# Asserted in T1


There are several ways to avoid this system restriction:

1. Do not assert BOFF# in T1.

2. Use a "two clock" backoff: in the first clock AHOLD is asserted and in the second clock BOFF# is asserted. This guarantees that ADS# will not be floating low.

3. Do not assert HOLD when BOFF# is asserted.


1.4 Incompatibilities 


Below is a list of some special design considerations that the 485Turbocache Module requires to be designed into an i486 CPU system. They have been summarized to point out any possible inconsistencies between the i486 CPU specification and the 485Turbocache Module specification:


1. Invalidation cycles may only be performed every two clocks. Unlike the i486 CPU, the 485Turbocache Module only allows EADS# assertion every other clock at most.

2. The minimum clock high voltage is slightly higher than the i486 CPU specification. It is still within TTL levels, however.

3. The i486 CPU will recognize HOLD during non-burst, non-cacheable code prefetches. These prefetches are cacheable by the 485Turbocache Module. Since the module does not see the HLDA signal, another bus master could hold the CPU in mid-cycle, begin its own transfer, and coincidentally complete the cacheable transfer. This is only possible in systems that have another bus master that can drive the module's ADS# pin. In these systems, the CPU's HLDA pin should be inverted and connected to the module's BOFF# input. This guarantees that cycles interrupted by HLDA will be aborted, and not cached, by the 485Turbocache Module.
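
The first restriction in this list is easy to check in a bus trace or simulation log. The helper below is a hypothetical sketch that assumes the trace has already been reduced to the clock numbers in which EADS# was sampled low; it only checks spacing, not setup and hold times.

```python
def eads_spacing_ok(eads_low_clocks):
    """Return True if no two EADS# assertions fall in adjacent clocks."""
    clocks = sorted(eads_low_clocks)
    return all(later - earlier >= 2 for earlier, later in zip(clocks, clocks[1:]))


print(eads_spacing_ok([3, 5, 9]))  # every other clock at most: acceptable
print(eads_spacing_ok([3, 4]))     # back-to-back assertions: not acceptable
```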


2.0 SYSTEM INTERFACE

The following section describes the basic connection of the 485Turbocache Module in an i486 CPU system. The section highlights the CPU bus connections, memory bus connections, and gives specifics about their related signals.

Figure 2.1. 485Turbocache Module Typical Configuration


A typical 485Turbocache Module connection to an i486 Microprocessor and memory subsystem is shown in Figure 2.1. All of the signals that the i486 CPU generates "feed around" the 485Turbocache Module; that is, they go to both the 485Turbocache Module and the memory controller. In turn, most memory generated signals feed around the 485Turbocache Module back to the CPU. This is what makes the 485Turbocache Module an optional cache. The following describes all the signals the 485Turbocache Module encounters.


2.1 i486™ Microprocessor Signals 


The following 485Turbocache Module signals con- 
nect directly to the corresponding i486 CPU signals. 
These pins have the same name and functionality as 
the i486 Microprocessor pins. 


2.1.1 ADDRESS LINES A2-A31 


A2-A31 are the address lines generated by the i486 CPU and used by the cache as set and tag addresses. A 64k 485Turbocache Module cache will use A4-A14 as set address inputs and the remaining address bits as tag address. A 128k 485Turbocache Module uses A5-A15 as set address inputs, A4 as the line select bit for sectoring, and the remaining bits as tag address. Address lines A2 and A3 are used as burst address inputs.


The address lines are also used as invalidate inputs.
At any time, if EADS# is asserted, the address that
is present at the address inputs will be invalidated.
The 485Turbocache Module will not invalidate unless
CS# is sampled active. Note that the address is
latched internally so that AHOLD assertion in T1 is
permitted.


2.1.2 DATA LINES D0-D31 AND PARITY
DP0-DP3


This is the processor data bus common to the i486
CPU, the 485Turbocache Module, and the memory bus.
The 485Turbocache Module transfers information to
the CPU on read hits, and stores data from memory
on read misses. The four parity bits, DP0-DP3, are
treated just like extra data bits.


2.1.3 ADS#, W/R#, M/IO# 


The processor control signals ADS#, W/R#, and
M/IO# are used by the 485Turbocache Module to
indicate the start of a new cycle, and identify the
type of cycle. ADS# assertion indicates a T1 cycle
and initiates the tag lookup process in the
485Turbocache Module. I/O cycles are ignored.


ADS# is the primary signal that activates the
485Turbocache Module. When ADS# goes low, the
module begins the hit/miss tag lookup regardless of
the state of Chip Select (CS#). For this reason, any
bus master that controls the ADS# input to the
module must meet the module address bus setup
and hold times, regardless of the state of CS#. Chip
Select, when inactive, disables the module outputs
only. (Note that CS# must be asserted for invalidation
cycles.)


2.1.4 BYTE ENABLES BE0#-BE3#


Byte enable inputs are used to complete partial byte
or word writes to the 485Turbocache Module on
cache write hit cycles. All other partial transfers are
ignored by the 485Turbocache Module.


2.1.5 BLAST#


BLAST# is used by the 485Turbocache Module to
indicate the end of a cycle. If BLAST# is asserted
early during a cache line fill from a read miss, that
transfer is left invalid by the 485Turbocache Module.


2.1.6 BOFF#


Once BOFF# is sampled by the 485Turbocache
Module, it relinquishes control of the data bus in the
next clock. If a read hit line transfer was in progress,
that transfer will not continue once BOFF# is released.
If a read miss transfer was interrupted by
BOFF#, the 485Turbocache Module will mark
the line as invalid even if the transfer continues once
BOFF# has been released. The 485Turbocache
Module will recognize invalidations during BOFF#,
but will only do so if CS# is active.


2.1.7 FLUSH#


The 485Turbocache Module FLUSH# input behaves
exactly like the i486 Microprocessor input.
Once asserted, FLUSH# will invalidate the entire
contents of its cache memory regardless of the
state of CS#. While FLUSH# is asserted, the
485Turbocache Module continues to track CPU bus
cycles and treats all accesses as cache misses,
activating START# appropriately.


FLUSH# may be used asynchronously with both the
i486 CPU and the 485Turbocache Module. If the
proper pulse widths are given, FLUSH# will be
recognized, but it is possible that FLUSH# will be
recognized on different clock edges by each device.
This may happen if FLUSH# assertion or deassertion
is near its setup and hold times, when one device
may recognize it and the other may not.


2.1.8 EADS#, AHOLD


EADS# assertion causes the 485Turbocache Module
to invalidate the address present on the address
bus if CS# is seen active. AHOLD need not be asserted,
nor is it even used as an input to the
485Turbocache Module. EADS# may be asserted
at most once every other clock, as that is the fastest
485Turbocache Module invalidation rate. The section
titled “invalidation cycles” describes where
EADS# may be asserted for maximum performance.


EADS# may not be asserted on the clock edge of
the last T2 of a cycle (the first T1 of the next cycle) if
AHOLD is not asserted.


2.1.9 RESET


RESET is an asynchronous input that causes the
485Turbocache Module to reset its internal machines
to a known state, with its entire cache contents
invalidated and the module expecting the start of a
new bus cycle. RESET must be high for at least 10 clocks for
the 485Turbocache Module to reset properly from a
warm boot. For a cold boot, RESET must remain
active for 3000 ns (100 clocks at 33 MHz, 75 clocks
at 25 MHz). There must be no bus activity for at least
4 clocks after the falling edge of RESET so the
485Turbocache Module can reset internally. The
falling edge of RESET causes the 485Turbocache
Module to sample its WPSTRP# strapping option.


2.2 CPU Bus Interface Signals 


These are signals generated by the 485Turbocache 
Module, or decoded from the i486 CPU that corre- 
spond to the CPU bus. 


2.2.1 CHIP SELECT CS# 


Chip Select is used to select the proper
485Turbocache Module if multiple modules are used;
otherwise, with one 485Turbocache Module, CS# may
be grounded. CS# is generated by decoding the
lowest order tag addresses coming into the module.
For example, two 128k cache modules would decode
A16 for their chip selects. A16 high would select
module 1, while A16 low would select module 2. The
following table summarizes the addresses used for
decoding:


Cache Configuration      Address Bit(s) to Decode

Two 64k modules          A15
Four 64k modules         A15, A16
Two 128k modules         A16
Four 128k modules        A16, A17


For compatibility, A16 and A17 may be decoded for
64k modules. Performance may be increased because
of increased granularity, however, if A15 and
A16 are used.
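

As a rough illustration of the decoding rule, the following Python sketch picks the lowest order tag bits for a given module size and count. The function names and the zero-based module index are assumptions made for this example; which socket a given bit pattern selects is a choice left to the decode logic.

```python
from typing import List


def chip_select_bits(module_size_k: int, module_count: int) -> List[int]:
    """Return the address bit numbers to decode for CS#."""
    lowest_tag_bit = {64: 15, 128: 16}[module_size_k]   # first bit above the set address
    bits_needed = {1: 0, 2: 1, 4: 2}[module_count]      # one module needs no decode
    return [lowest_tag_bit + i for i in range(bits_needed)]


def selected_module(addr: int, module_size_k: int, module_count: int) -> int:
    """Return a zero-based index built from the decoded bits (hypothetical numbering)."""
    index = 0
    for position, bit in enumerate(chip_select_bits(module_size_k, module_count)):
        index |= ((addr >> bit) & 1) << position
    return index
```

With a single module the bit list is empty, which corresponds to grounding CS#.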


With CS# inactive, invalidation cycles are ignored,
START# is inactive, and CKEN# is inactive.
CKEN# does, however, always activate in T1 as it is
not possible for the 485Turbocache Module to recognize
CS# before then.


If required, the LOCK# signal may be used as a
term in the creation of CS#. If locked cycles do not
generate CS#, START# must be generated externally
so memory may handle the cycle.


2.2.2 CPU CACHE ENABLE CKEN#


CKEN# is generated by the 485Turbocache Module
to indicate that its current transfer, during a read hit
cycle, is cacheable. It is always driven (not an open-
collector output) and must be used as one of the
terms that generates KEN# to the i486 Microprocessor.
CKEN# is always active in T1, but then goes
inactive and remains inactive unless the cycle is a
read hit cycle.


For read miss and write cycles, CKEN# goes inactive
in T2 and remains inactive until the next T1. It is
the responsibility of the system to generate the
KEN# signal to the i486 CPU in these cases.


In a read hit cycle, CKEN# goes active again in the
second T2 and remains active throughout the cycle.
This forces external KEN# logic to activate KEN#
and make the cycle cacheable to the i486 CPU.
However, if the line being transferred is write-protected
and the WPSTRP# pin is strapped low,
CKEN# stays inactive in T2 and remains inactive
throughout the cycle. This allows write-protected
lines in the 485Turbocache Module to be cacheable
only to the 485Turbocache Module.


2.2.3 BURST READY OUT BRDYO#


The 485Turbocache Module generates BRDYO#
when it is bursting data back to the i486 CPU during
read hit cycles. BRDYO# is always driven (not an
open-collector output) and should be used by external
logic to create the BRDY# input signal to the
i486 CPU. Since the 485Turbocache Module is a
zero waitstate, single clock burst cache, BRDYO# is
activated from the first T2 until the fourth T2 unless the
cycle is interrupted.


2.3 Memory Interface Signals


Memory Interface Signals are signals coming to or
from the main memory subsystem. The only signal
the 485Turbocache Module generates to the memory
system is START#, which is the only signal that
must be handled should the 485Turbocache Module
be designed as an option.


2.3.1 PRSN#


This signal is tied low inside the 485Turbocache
Module. If the system pulls this signal high with a
10K pullup resistor, cache presence will be indicated
by that line being pulled low. The PRSN# signal is used
to indicate that external logic should only start memory
cycles when START# goes active rather than
from ADS# active.


2.3.2 START#


START# is a signal asserted by the 485Turbocache
Module to indicate that the memory subsystem must
process the current cycle. START# is always driven
and valid and is asserted for all read miss cycles and
memory write cycles. START# is not activated for
I/O cycles, or if CS# is sampled inactive. START#
is normally active in the first T2, but may be delayed
if an invalidation cycle forced the previous cycle to
be elongated (see 1.3.4 Invalidation cycles).


2.3.3 WRITE PROTECT WP 


The Write Protect input is an active high input that
indicates to the 485Turbocache Module that the current
line transfer is write-protected. It is sampled on
the clock edge of the third BRDY# of a line transfer
of a read-miss cycle. The 485Turbocache Module
saves this information as a single bit in each tag
location. In 128k configurations where there is a single
tag for 2 consecutive lines, the write protect bit is
valid for both lines. If a location has been write-protected,
any writes to that location will be ignored.


WP is a synchronous input and must meet the
485Turbocache Module setup and hold times regardless
of whether it is being sampled or not.


2.3.4 WRITE PROTECT STRAPPING OPTION
WPSTRP#


WPSTRP# is a strapping option that is sampled during
RESET. It indicates whether write-protected
items in the 485Turbocache Module should be
cacheable in the i486 CPU cache. If WPSTRP# is
high, CKEN# will go active in T2 during all read hit
cycles to indicate that they are cacheable. If
WPSTRP# is low, CKEN# will be inactive in T2 for
read hit cycles to locations that are write-protected.
This allows write-protected items to be cached by
the 485Turbocache Module and not by the i486
CPU.
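

The interaction of read hits, write protection, and the WPSTRP# strap can be summarized as a small truth function. This Python sketch is a behavioral summary of the text above, with a hypothetical function name; it does not model the T1 state, in which CKEN# is always active.

```python
def cken_active_in_t2(read_hit: bool, write_protected: bool, wpstrp_high: bool) -> bool:
    """Return True if CKEN# is active (low) during the T2 states of a cycle."""
    if not read_hit:
        # Read misses and writes: CKEN# stays inactive and the system
        # is responsible for generating KEN# to the i486 CPU.
        return False
    if write_protected and not wpstrp_high:
        # WPSTRP# strapped low: the write-protected line is cacheable
        # only in the 485Turbocache Module, not in the CPU cache.
        return False
    return True
```

Only the combination of a write-protected line and WPSTRP# strapped low suppresses CKEN# on a read hit.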


2.3.5 SYSTEM CACHE ENABLE SKEN# 


The SKEN# input to the 485Turbocache Module is
like the KEN# input to the i486 Microprocessor. It is
sampled just like KEN#, the clock before the first
and last transfers of a line fill, to indicate whether the
line is cacheable. If the KEN# input to the i486 CPU
is connected to the SKEN# input of the
485Turbocache Module, the i486 CPU internal
cache and the 485Turbocache Module will cache
the same items. It is possible to control KEN# and
SKEN# separately so the 485Turbocache Module
and i486 CPU cache different areas of memory.


SKEN# is a synchronous input and must meet the
485Turbocache Module setup and hold times regardless
of whether it is being sampled or not.


2.3.6 CACHE READY AND BURST READY
CRDY#, CBRDY#


CRDY# and CBRDY# are the ready and burst
ready inputs to the 485Turbocache Module. They
should behave exactly like the i486 CPU RDY# and
BRDY# inputs. CBRDY# should be used in conjunction
with BRDYO# to generate the i486 CPU
BRDY# input. Likewise, CRDY# should be used to
form the i486 CPU RDY# input.


The 485Turbocache Module does not sample the
CBRDY# or CRDY# inputs during read hits, so it is
not possible to artificially add waitstates to the
485Turbocache Module’s burst transfer. The
CBRDY# and CRDY# inputs must follow
485Turbocache Module setup and hold times even
outside the sampling window.


3.0 SYSTEM CONFIGURATIONS 


Two of the most important features of the
485Turbocache Module are its cascadability and its
optionality. The following sections explain how to
design a system with a single 485Turbocache Module,
with multiple 485Turbocache Modules, and with a
socket for an optional 485Turbocache Module.


3.1 Single Cache 


In a single cache configuration, the addition of a
485Turbocache Module requires little or no extra
logic. Most of the signals are common to the i486
CPU, the memory bus controller, and the
485Turbocache Module. The others, such as KEN#,
SKEN#, and START#, will be discussed individually.


3.1.1 i486™ MICROPROCESSOR BUS 
INTERFACE 


As seen in Figure 2.1, the i486 CPU-related signals
are connected to both the 485Turbocache Module
and the memory controller. These are the address
bus, data and parity bus, ADS#, W/R#, M/IO#,
BE0#-BE3#, BLAST#, RESET, and CLK.


Since a single 485Turbocache Module resides on
the address bus, CS# may be tied low so the part is
always chip selected.


3.1.2 MEMORY BUS INTERFACE 


On the memory bus side, BOFF#, FLUSH#, and
EADS# are connected to the i486 CPU and the
485Turbocache Module in parallel. The memory
ready signals, CRDY# and CBRDY#, are connected
directly to the 485Turbocache Module, but are
combined with other system ready signals to form
the i486 CPU RDY# and BRDY# inputs. One of the
system ready signals is the 485Turbocache Module
BRDYO#, which must be ANDed with CBRDY# and
other burst ready signals to form BRDY# into the
CPU.
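

Because all of these signals are active low, ANDing their pin levels means BRDY# goes low whenever any one source is low. The Python sketch below models pin levels as integers (0 = asserted, 1 = inactive); the function name `brdy_n` is a hypothetical label for the external combining logic, not a part of the module.

```python
def brdy_n(cbrdy_n: int, brdyo_n: int, *other_burst_ready_n: int) -> int:
    """AND the active-low burst ready sources to form the CPU BRDY# level."""
    level = cbrdy_n & brdyo_n
    for pin in other_burst_ready_n:
        level &= pin          # any source at 0 pulls BRDY# to 0 (asserted)
    return level
```

When the cache bursts on a read hit, BRDYO# is low and the CPU sees BRDY# asserted even though the memory system's CBRDY# is inactive.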


The memory system must also generate the WP input.
If write-protection is not needed, WP may be
tied to Vss. To prevent write-protected lines in the
485Turbocache Module from being cached by the
i486 CPU, WPSTRP# should be tied
to Vss.


3.1.3 KEN# AND SKEN# GENERATION 


The KEN# input to the i486 Microprocessor is a result
of all the cache enable signals in the system.
Since the 485Turbocache Module activates CKEN#
only during a read hit cycle, the CKEN# output may
be ANDed with the system cache enable signal to
form KEN# to the i486 CPU.


If the 485Turbocache Module and i486 CPU internal
cache will cache the same areas of memory, the
KEN# input to the i486 CPU may be tied to the
SKEN# input of the 485Turbocache Module. Otherwise,
the memory system can generate 2 cache enable
signals: one that is ANDed with CKEN# to produce
KEN#, and another for the SKEN# input.
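

Both wiring options can be sketched in a few lines of Python, again treating each active-low pin as an integer level (0 = asserted). The function `cache_enables` and its parameter names are assumptions made for this illustration.

```python
from typing import Optional, Tuple


def cache_enables(cken_n: int, system_ken_n: int,
                  system_sken_n: Optional[int] = None) -> Tuple[int, int]:
    """Return the (KEN#, SKEN#) levels for one clock.

    If system_sken_n is None, SKEN# is tied to KEN# so both caches
    cache the same areas; otherwise SKEN# is driven separately.
    """
    ken_n = cken_n & system_ken_n            # active-low AND of the enable terms
    sken_n = ken_n if system_sken_n is None else system_sken_n
    return ken_n, sken_n
```

Passing a separate `system_sken_n` models the second option, where the module and the CPU cache different areas of memory.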


3.1.4 START# GENERATION 


START# goes low to indicate that the memory system
must complete the current cycle. This is true for
all memory writes and read misses. It is the memory
subsystem’s responsibility to recognize I/O cycles
and begin an I/O access without waiting for
START#.
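

The conditions under which the module hands a cycle to memory can be written as a single predicate. This Python sketch summarizes the rules stated in sections 2.3.2 and 3.1.4; the function and argument names are hypothetical, and timing (which T2 the signal appears in) is not modeled.

```python
def start_asserted(memory_cycle: bool, write: bool, cache_hit: bool,
                   cs_active: bool = True) -> bool:
    """Return True if the module asserts START# for the current cycle."""
    if not memory_cycle or not cs_active:
        return False              # I/O cycles and deselected modules never assert START#
    return write or not cache_hit  # all memory writes, plus read misses
```

A read hit is the only memory cycle that the selected module completes without involving the memory system.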


START# is asserted in T2, but may be delayed if
there was an invalidation in the previous cycle (see
1.3.4 Invalidation cycles). Because the assertion of
START# may be somewhat unpredictable, it is recommended
that START# be used either to begin a
DRAM RAS cycle or to enable DRAM output buffers.


Figure 3.1a shows that START# may be the indication
to DRAM control to begin a cycle. Once
START# is sampled active, a RAS and CAS cycle
begins. This will incur an extra waitstate on cache
read misses since the earliest a memory cycle will
begin is the first T2.


Figure 3.1b shows that START# may enable DRAM
data buffers. The actual DRAM cycle begins once
ADS# and M/IO# are sampled low, but will not
complete until the buffers have been gated, allowing
data to be written to the i486 CPU data bus. Should
the cycle be a 485Turbocache Module read hit, the
buffers are never enabled. Since the
485Turbocache Module takes 5 clock cycles to
complete the burst transfer, RAS precharge time
can easily be absorbed.


See 4.3.4 START# Predictability for detailed information
on how START# may be asserted in a predictable
manner.


3.2 Multiple Cache 


A multiple cache scheme is similar to the single
cache scheme because all of the i486 Microprocessor
bus interface signal connections remain the
same. Like the single cache example, only KEN#,
SKEN#, START#, and now CS#, need special
handling. Figure 3.2 is an example of a 512k multiple
cache configuration.


3.2.1 MEMORY BUS INTERFACE 


Like the i486 Microprocessor bus interface signals,
BOFF#, FLUSH#, and EADS# are connected to
the CPU, memory system, and all caches in parallel.
The ready and burst ready outputs from the memory
system connect to the CRDY# and CBRDY# inputs
to all 485Turbocache Module caches. The CBRDY#
signal is then ANDed with the BRDYO# outputs
from all 485Turbocache Modules to form BRDY# to
the i486 CPU.


3.2.2 START#


START# is activated by a single 485Turbocache
Module at a time because CS# is active for a single
485Turbocache Module at a time. START#, therefore,
may be ANDed with all other START# signals
to form a system start indication. See section 3.1
Single Cache for details on how START# may be used.


3.2.3 KEN#


Like START#, CKEN# is only activated for chip-selected
modules. Therefore, all CKEN# outputs may
be ANDed together to form the i486 CPU KEN# signal.
A system cache enable signal must also be included
in the AND terms since it is the system’s
responsibility to generate KEN# during read miss
cycles.


3.2.4 SKEN#


Since SKEN# is used during read miss cycles and
ignored otherwise, the system cache enable signal
can be connected to all 485Turbocache Modules’
SKEN# inputs. If multiple sources can create the
KEN# signal to the i486 CPU, KEN# may be fed
into all 485Turbocache Modules. If the i486 CPU
caches different memory locations than the second-
level cache, SKEN# must be generated separately
and then connected to all 485Turbocache Module
inputs.


Figure 3.1. Using START# in DRAM Control


3.2.5 CS# 


Chip select is used to identify which 485Turbocache
Module is being addressed. It is the result of decoding
the lowest order tag address bits. Figure 3.2
shows how a PLD chooses one of four
485Turbocache Modules. Anytime an address is
present on the address bus, including invalidation
cycles, one of the 485Turbocache Modules is selected.


3.3 Optional Cache 


The 485Turbocache Module is an optional cache.
However, its most powerful feature is allowing a
system to reconfigure itself easily once a
485Turbocache Module has been installed. To accomplish
this, the 485Turbocache Module is designed
as a write-through cache with all signals
feeding around to the memory subsystem whether
the 485Turbocache Module is present or not. There
are only a few considerations that need to be made
to allow the 485Turbocache Module to be fully optional.


3.3.1 SIGNAL CONSIDERATIONS: START#,
CKEN#, BRDYO#


If the 485Turbocache Module is not present in a system
that expects it to be, the START# signal will
never be asserted and memory will never begin a
cycle. A solution to this problem is to connect the
PRSN# presence pin into the memory controller
that accepts START#. If PRSN# is high, the
485Turbocache Module is not present, and all memory
cycles should begin with the assertion of ADS#.
Note that START# should have a pullup resistor to
ensure it is not left floating.


When the 485Turbocache Module is removed from
a system, the CKEN# and BRDYO# signals, which
are combined with external logic to form KEN# and
BRDY#, will be left floating. All CKEN#, BRDYO#,
START#, and PRSN# pins should have pullup resistors
tied to them. This assures an inactive state
when no 485Turbocache Module is present.
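

A memory controller that supports an empty socket therefore has two ways to begin a memory cycle. The Python sketch below illustrates that choice using pin levels (0 = low, 1 = high); the function name is hypothetical, and I/O cycles, which never wait for START#, are outside its scope.

```python
def memory_cycle_should_start(prsn_n: int, ads_n: int, start_n: int) -> bool:
    """Decide whether the memory controller begins a memory cycle.

    prsn_n is 1 when the pullup wins (no module installed) and 0 when
    an installed module ties the pin low.
    """
    if prsn_n == 1:
        return ads_n == 0     # no cache present: begin on ADS#
    return start_n == 0       # cache present: wait for START#
```

With the pullups in place, an empty socket presents PRSN# high and START# high, so the controller falls back to ADS# automatically.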


Figure 3.2 Multiple Cache Configuration


3.3.2 CONSIDERATIONS WITH MULTIPLE
CACHES


As long as all the START#, CKEN#, BRDYO#, and
PRSN# signals have pullup resistors tied to them,
all empty cache sockets will respond like inactive
caches. There is, however, a chip selecting problem
since CS# decoding varies with the number of
caches that are present.


Chip select decoding logic, as Figure 3.3 shows,
should have all PRSN# pins as inputs. From this
information, the correct chip select decoding can be
generated. The logic in Figure 3.3 is able to keep
CS1 asserted if one cache is detected, decode A16
if 2 caches are detected, or decode A16 and A17 if
all 4 caches are present.
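

The behavior described for Figure 3.3 can be approximated in Python as follows. This is a sketch under stated assumptions, not the contents of the PLD: it assumes four sockets populated in order starting with socket 1, and the mapping from A16/A17 values to a particular socket is a hypothetical choice.

```python
from typing import List


def decode_chip_selects(prsn_n: List[int], a16: int, a17: int) -> List[int]:
    """Return active-low CS# levels for four sockets from their PRSN# pins."""
    count = sum(1 for level in prsn_n if level == 0)   # PRSN# low = module present
    cs_n = [1, 1, 1, 1]                                # all inactive by default
    if count == 1:
        cs_n[0] = 0                                    # CS1 held asserted
    elif count == 2:
        cs_n[a16] = 0                                  # decode A16 only
    elif count == 4:
        cs_n[(a17 << 1) | a16] = 0                     # decode A16 and A17
    return cs_n
```

Population counts of zero or three leave every CS# inactive in this sketch; a real design would have to decide how to treat those cases.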


The most difficult problem to overcome when allow- 
ing an optional number of multiple caches is to ac- 
count for capacitive load changes. Since each 
cache has a capacitive load on the data bus and 
clock lines, some amount of design effort must be 
spent resolving capacitive loading. When designing 
with 4 caches, each cache will probably have to re- 
ceive a dedicated clock line. As well, the data bus 
will have to be buffered outside of the CPU and 
cache core. 


4.0 OPERATIONAL/PERFORMANCE
CONSIDERATIONS


The following sections provide more detailed information
about operating and designing-in the
485Turbocache Module. This includes testing the
cache, understanding sectoring, and making small
performance adjustments.


4.1 Testing and Data Integrity 


The 485Turbocache Module can monitor data integrity
using parity bits. The i486 Microprocessor has
the capability of outputting and checking data parity.
The memory subsystem must also support parity to
use the parity support on the 485Turbocache Module.
This data parity information is stored with every
byte inside the 485Turbocache Module, and is
checked by the i486 CPU during data reads. To be
able to identify data errors from memory or cache,
the parity error check output (PCHK#) of the i486
CPU can be sampled.


Power-up self-test programs test main memory functionality
on a cell-by-cell basis since parity logic is
not capable of detecting all memory failures. It is
also important to test cache memory. The following
algorithm will test any number of 64k
485Turbocache Modules or 128k 485Turbocache
Modules up to 512k of cache memory:


1. Flush or Reset the cache.
2. Write “1” to every bit of a 512k block of memory.
3. Read the 512k block twice; this fills the cache.
4. Disable CS# and write “0” to the 512k block; this
   fills memory.
5. Read the 512k block:
   • Repetitive assertions of START# indicate the
     cache boundary (size of cache)
   • Data ≠ 1 indicates bad tag or SRAM
6. Repeat with “0” in the cache and “1” in memory.


4.2 Sectored vs Non-Sectored Cache 


The 64k 485Turbocache Module was designed as a 
64k non-sectored cache; this means each tag of the 
cache points to 1 line of data in the cache memory. 
A 128k cache requires twice the number of tags to 
be non-sectored. This increases tag size, complexi- 
ty, and reduces tag lookup speed. For this reason, 
the 128k 485Turbocache Module is a sectored 
cache. Each tag in the 128k 485Turbocache Module 
points to 2 consecutive lines in the cache. A Line 
Select bit, address bit A4, determines which line is 
being referenced. 


Figure 3.3. Chip Select Decoding


Figure 4.1 is an example of one tag in a sectored
cache. If this tag points to address 2500h, then the
adjacent line is reserved for address 2510h (A4
high). If, for example, address 2510h had been written
first, the tag would still contain 25 and only address
2500h could be placed in the first line.
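

The relationship between the two addresses can be checked with a short Python sketch that applies the 128k bit assignments from section 2.1.1 (A4 as line select, A5-A15 as set address, remaining bits as tag). The function name is hypothetical, and the numeric tag it returns follows those pin assignments rather than the simplified “25” used in the figure.

```python
from typing import Tuple


def sector_fields(addr: int) -> Tuple[int, int, int]:
    """Return (tag, set_index, line_select) for a 128k sectored module."""
    line_select = (addr >> 4) & 1          # A4 picks one of the two lines
    set_index = (addr >> 5) & 0x7FF        # A5-A15
    tag = (addr & 0xFFFFFFFF) >> 16        # A16-A31
    return tag, set_index, line_select
```

Addresses 2500h and 2510h produce the same tag and set index and differ only in the line select bit, which is why they occupy the two lines under a single tag.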


Since the Line Select Bit is used for a sectored
architecture, all set and tag address bits are shifted
higher in the address space. The 128k
485Turbocache Module internally compensates for
this shift so pin-compatibility with the 64k
485Turbocache Module is maintained. This allows
either cache configuration, 64k 485Turbocache
Module or 128k 485Turbocache Module, to be
hardware-transparent.


Because a sectored cache references 2 consecutive
lines, the odds of filling both lines are reduced, and
thus so is the hit rate of the cache. A sectored cache will
have a slightly reduced hit rate compared to an
equivalent non-sectored cache, but simulations
have shown the performance penalties to be minimal
(1 to 2 percent). Simulations have also shown
that a two-way set associative sectored 128k cache
offers significantly better performance than a direct
mapped 128k non-sectored cache.


4.3 Performance Considerations 


The following section offers a few special considera- 
tions that will increase cache performance or ease 
hardware design. These considerations are simply 
design notes and are not deviations from the i486 
Microprocessor specification. 


4.3.1 SKEN# ASSERTION 


SKEN# is an input to the 485Turbocache Module to
indicate the cacheability of a line during a read miss
cycle. It is sampled exactly like KEN# in the i486
CPU, one clock before the first dword transfer of a
line fill, and one clock before the last dword.


During a line fill, the 485Turbocache Module loads
the dwords of the line directly into the appropriate
spot in cache memory. This means that once
SKEN# has been sampled active by the
485Turbocache Module, it must “commit” a line and
invalidate a location to prepare for the incoming line.
Once a line fill completes with a proper SKEN#, the
line can be validated.


A potential performance loss exists if a system designer
chooses, during non-cacheable cycles, to
keep SKEN# active, but inactivate SKEN# the
clock before the first transfer (see Figure 4.2). Once
the 485Turbocache Module sees SKEN# low in the
first T2, it commits a line in the cache by invalidating
an entry despite the fact that SKEN# was later
deasserted. The performance loss can be avoided if
SKEN# is held inactive until cacheability can be
determined.


4.3.2 INVALIDATION WINDOW 


When an invalidation is requested with the assertion
of EADS#, the 485Turbocache Module must immediately
invalidate the address present on the address
bus. If the tag portion of the 485Turbocache
Module is in use, the invalidation takes priority and
will suspend the other action. This may decrease
performance. To avoid this, EADS# should not be
issued in the second, third or fourth transfer of a
cache read miss cycle. Section 1.3.4 Invalidation Cycles
under Functional Description explains this in detail.


4.3.3 BOFF# ASSERTION


If BOFF# is asserted and the 485Turbocache Module
is in the middle of a cacheable read miss cycle,
the 485Turbocache Module treats the current line fill
as non-cacheable. Once BOFF# is released and
the cycle continues, the 485Turbocache Module will
treat the rest of the cycle as a non-cacheable cycle.


In most systems BOFF# is a rare occurrence, thus
the performance loss is negligible. If, however,
BOFF# is regular and predictable, system performance
can be increased by timing BOFF# so that the
four dword transfers of a line fill are never interrupted.
Section 1.3.5 BOFF# Cycles under Functional
Description explains aborted cycles in more detail.


Figure 4.1 Sectored Example


Figure 4.2 Method of SKEN# Generation Not Recommended


4.3.4 START# PREDICTABILITY 


START# is asserted in the first T2 of a read miss
cycle unless an invalidation occurred in the previous
cycle. The section titled “Invalidation Cycles” explains
why START# may be delayed. If START#
must be a predictable signal to the system, and
invalidation cycles cannot be timed to occur before
the second transfer of a read miss cycle, there is a
way to ensure the predictability of START#.


When EADS# is asserted towards the end of a read
miss cycle, there are 3 tag accesses that need to be
made before T1 of the next cycle: invalidate lookup,
the actual invalidation (if a hit), and validation of the
current line fill (if cacheable). Since there is no way
to predict the hit/miss possibility of an invalidation
request, it is assumed that 2 tag accesses will be
required to service it. One tag access can be saved,
then, by making the current line fill non-cacheable.


To do this, SKEN# to the 485Turbocache Module
may be deasserted if AHOLD is detected. If SKEN#
is deasserted the clock before the last CBRDY#,
the line is non-cacheable. Figure 4.3a shows how
assertion of EADS# during the third transfer of a
burst cycle incurs a 1 clock delay in START#. Figure
4.3b shows EADS# assertion in the fourth
clock, but since AHOLD will cause the CPU to delay
ADS# at least one extra clock, START# is delayed
only 1 clock as well. Assertion of EADS# in the second
transfer of a burst causes a 1 clock delay in
START# without deasserting SKEN# (see Figure
4.3c), so there is no advantage in dropping SKEN#
for EADS# assertion then.


In summary, if SKEN# is deasserted in response to
AHOLD during the third or fourth transfer of a line fill,
START# will be delayed at most 1 clock. This
makes START# predictable: it will always be valid in
the second T2 of a read miss cycle. Note that if
START# was not delayed, its value is retained in
the second T2.


[Figure 4.3: timing diagrams a, b, and c, each showing i486 CPU activity (T2, T2, T1, T2, T2) against module activity (invalidate lookup hit/miss, invalidate tag, validate line fill, begin tag lookup, issue START#), with the clock in which EADS# occurs marked.]


NOTES:
*Tag validation is not done since SKEN# was deasserted.
**Ti occurs because AHOLD assertion and deassertion cause ADS# to be delayed.


Figure 4.3 Predictable START# Delay


5.0 MECHANICAL SPECIFICATIONS


[Mechanical drawing, pin side view. Legible dimensions: 0.09" ±0.01"; pins 0.020" diameter, round; 1.90".]


Figures Not Drawn To Scale


6.0 ABSOLUTE MAXIMUM RATINGS*


Ambient Temperature under Bias ..... 0°C to +70°C
Storage Temperature ........... -55°C to +150°C
Voltage on Any Pin
  with Respect to Ground ... -0.5V to Vcc + 0.5V
Power Dissipation:
  64k 485Turbocache Module ................. 4W
  128k 485Turbocache Module ................ 6W


NOTICE: This data sheet contains preliminary information
on new products in production. The specifications
are subject to change without notice.


*WARNING: Stressing the device beyond the “Absolute
Maximum Ratings” may cause permanent damage.
These are stress ratings only. Operation beyond the
“Operating Conditions” is not recommended and extended
exposure beyond the “Operating Conditions”
may affect device reliability.


7.0 D.C. CHARACTERISTICS (Vcc = 5V ±5%)


[Table: D.C. characteristics, with separate entries for the 82485MA and 82485MB. Legible row labels: Input Leakage Current (D0-D31 and Parity; CLK; TAI15, TAI16, WPSTRP#; All Other Inputs); D0 Through D31; Clock Input Capacitance; I/O Capacitance.]


NOTES:
1. Measured at 4.5 mA.
2. Measured at 1.0 mA.
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8.0 A.C. CHARACTERISTICS (Vcc = 5V £5%) 


All A.C. timings are tested with a capacitive load of 50 pF unless otherwise specified.

[Table values (Min, ns) not recoverable from the source. Parameter names that remain legible:]

CLK High Time
CLK Low Time
CLK Rise Time
BLAST# Setup
CRDY#, CBRDY# Setup
CRDY#, CBRDY# Hold
SKEN# Setup
SKEN# Hold
D0-D31, DP0-DP3 Setup
D0-D31, DP0-DP3 Hold
WP Hold
BOFF# Setup
BOFF# Hold
RESET, FLUSH# Pulse Width (Asynchronous Use)
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8.0 A.C. CHARACTERISTICS (Vcc = 5V +5%) (Continued) 


All A.C. timings are tested with a capacitive load of 50 pF unless otherwise specified.

[Table values not recoverable from the source. Parameter names that remain legible: BRDYO# Valid, CKEN# Valid, CKEN# Hold, START# Valid.]

NOTES:
2. At 0.8V.
3. Setup to CLK edge of third BRDY# in line fill.
4. Setup to CLK edge where EADS# is valid.
5. Hold time from CLK edge in which CKEN# will be sampled.
6. Valid up to CL = 100 pF.
7. At the clock edge in which ADS# or EADS# is sampled.
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9.0 WAVEFORMS 


240722-20 
tx = input setup times 
ty = input hold times, output float, valid and hold times 


Figure 9.1. CLK Waveforms 


[Timing diagram signals: CLK, DATA, CBRDY#, BLAST#.]
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Figure 9.2. Write Protected Read Miss 
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CLK 


[Timing diagram signals: ADS#, START#, DATA, BRDYO#, CKEN# (1), CKEN# (2), AHOLD, EADS#, ADDR, CS#, BLAST#.]
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1. Normal CKEN # behavior. 
2. CKEN# behavior if line is Write Protected and WPSTRP # is low. 


FLUSH# 
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Figure 9.4. RESET and FLUSH # 
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CLK 


[Timing diagram signals: START#, DATA, CRDY#, CS#.]
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Figure 9.5. Multiple Cycle Line Fill 


[Timing diagram: Read Hit with Invalidation, followed by Delayed START# Assertion. Signals: CLK, START#, DATA, CBRDY#, BLAST#, CS#.]
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Figure 9.6. Invalidation Causing Delayed START # 
10.0 REVISION HISTORY

Revision -002 of the 485Turbocache Module Data
Sheet contains several updates and corrections to
the original version. A revision summary of major
changes is listed below:

Throughout      The name of the cache module has been
Document        changed from Turbocache 486 Module to
                485Turbocache Module.

Section 1.3.1   Clarified that all transfers seen by the
                485Turbocache Module are assumed to be
                32-bit transfers.

Section 1.4     Removed one incompatibility between the
                485Turbocache and the i486 CPU.

Section 2.1.9   Corrected RESET specifications.

Section 5.0     Made mechanical specifications more
                precise.

Section 6.0     Modified absolute maximum ratings.

Section 7.0     Modified VIH and VOL specifications.

Section 7.0     Added input and output leakage current
                specifications.

Section 8.0     Corrected and added AC specifications
                (parameter numbers illegible in the
                source).
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82485 
SECOND LEVEL CACHE CONTROLLER 
FOR i486™ MICROPROCESSOR

■ High Performance
— Zero Wait State Access on Cache Hit
— One Clock Bursting
— Two-Way Set Associative
— Write Protect Attribute Per Tag
— Start Memory Cycles in Parallel

■ Easy to Use
— Matches i486™ Microprocessor Bus Timing
— Supports Invalidation Cycles
— Maintains Memory on Writes

■ High Integration
— Single Chip Tag RAM and Controller
— No Logic Needed for CPU and Cache Connection
— Maps Full 4 Gigabyte Address Space

■ Flexible System Configurations
— Supports 64K or 128K Cache Memory Per Controller
— Allows Multiple Controllers for Larger Cache Size
— Supports Non-Cacheable Memory Areas


The 82485 is a second-level cache controller designed to improve the performance of i486™ Microprocessor
systems. One 82485 cache controller supports 64K or 128K bytes of second level cache memory that maps to 
the entire 4 Gigabytes of the i486 microprocessor address space. The controller is completely software 
transparent. Several controllers may be cascaded to provide larger cache sizes. One controller plus SRAMs 
provides a 64K or a 128K cache. External EPROM can be cached yet remain write protected. The 82485 is 
fully compatible with the i486 microprocessor. All i486 CPU bus cycles and timings are supported. 


A complete, optional second level cache controller using the 82485 is available as the 485 Turbocache Module 
from Intel (data sheet order number 240722).


i486 is a trademark of Intel Corporation. 


82485 Internal Block Diagram

[Block diagram 240831-1. Blocks and labels legible in the source: processor interface (processor signals), system interface (system signals), address register, snoop address register, set address out, tag RAM control and timing, SRAM control and timing (SRAM signals).]

For the complete data sheet on this device, contact Intel's Literature Distribution Dept., (800) 548-4725.
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1.0 INTRODUCTION 


The i486™ CPU contains several improvements over
its predecessor, the highly successful 386™ CPU. One 
of the most important of these is the processor’s data 
access rate. The i486 CPU can access instructions and 
data from its on-chip cache in the same clock cycle. To 
support the processor’s redesigned internal data path, 
the external bus has also been optimized and can access 
external memory at twice the rate of the 386 CPU. The 
internal cache requires rapid access to entire cache 
lines. Invalidation cycles must be supported to maintain 
consistency with external memory. All of these func- 
tions must be supported by the external memory sys- 
tem. Without them, the full performance potential of 
the CPU cannot be attained. 


The requirements of today's multitasking and multipro-
cessor operating systems also put increased demand on
the external memory system. OS support functions
such as paging and context switching can degrade refer- 
ence locality. Without efficient access to external mem- 
ory, the performance of these functions is reduced. 


Second level caching is a technique used to improve the
memory interface. Some applications, such as multiuser
office computers, require this feature to meet
performance goals. Single-user systems, on the other
hand, may not warrant the extra cost. Given the variety
of applications incorporating the i486 CPU, memory
system architecture will be very diverse.


In this application note, we will work with an example 
to discuss the details of memory system design. In the 
example, we have supported as many functions of the 
CPU as possible. An optional second-level cache is in- 
cluded. A write buffer is also implemented to reduce 
write latency. The cache supports zero wait state read 
cycles. The DRAM controller supports the following 
devices with the wait states shown in Table 2. The 
DRAM speed given in Table 1 is the RAS access time 
(tRAC). Table 2 summarizes the bus clocks required 
for each function.


Table 1

CPU Clock Freq.    DRAM Speed
25 MHz             100 ns
33 MHz             70 ns


Many of the functions and optimizations included here 
will not be required in every application. The example 
provides guidelines for the hardware designer but will 
not necessarily provide the optimal cost/performance 
solution for many applications. For example, 11 PLDs 
are required to implement the memory control logic 
partially due to the implementation of a back-off capa- 
bility. An address register must also be used to imple- 
ment this function. If this function is not used, the con- 
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trol logic can be substantially reduced. These and other 
optimizations will be discussed in the summary. 


Table 2

[Table body not recoverable from the source. Column headings that remain legible: DRAM Function; Subsequent Burst Accesses.]

NOTE:
*Write miss latencies occur only during cycles subsequent
to a write miss cycle.


The discussion assumes a working knowledge of com- 
puter system design. Items discussed but not explained 
include DRAM operation, PLD programming and op- 
eration, worst-case timing analysis and i486 CPU bus 
operation. The complete schematics and PLD equa- 
tions are in Appendix A. 


2.0 THE 485TURBOCACHE SECOND 
LEVEL CACHE MODULE 


Several different types of second level cache
architectures are possible candidates for use with the
i486 CPU. For single-CPU systems the different
architectures offer similar performance benefits in most
cases. The reason they are so similar is the mechanism
which improves performance. The primary benefit of the
second level cache is bus cycle latency reduction.


In most systems which incorporate a single i486 CPU,
bus traffic from other bus masters is minimal. With any 
reasonable memory system the CPU uses at most 50% 
to 70% of the bus. Therefore reduction of bus cycle 
latency is the only performance benefit external logic 
can offer. 


The second level cache used in this example is an
economical method of reducing read cycle latency. The
485Turbocache Module contains the control circuits,
data and tag RAM required to implement a 128-kbyte
cache. It is organized as a two-way set associative
cache. Modules can be cascaded to provide up to 512
kbytes of cache memory.


One of the most interesting aspects of this device is it 
can be a system option. To provide this capability the 
device is configured as a look-aside cache. It monitors 
the CPU address and control signals. When a cycle 
occurs in which the cache can supply data, it inter- 
venes. The cache module then supplies an entire 
16-byte line with no wait states. 


The performance improvement offered by this cache is
substantial in some environments. This performance
improvement is particularly obvious when executing
multitasking, multiuser operating systems such as
UNIX and OS/2. Some users, however, may not
require the performance improvement offered by the
cache. In these cases the cache as an option is
attractive.


By designing the cache subsystem as an option, both
users' requirements can be met. A single system design
can be manufactured for both customers. The UNIX or 
OS/2 user can add the cache module. Other users may 
or may not require the module. They can choose the 
system configuration which meets their price-perform- 
ance needs. 


When one or more 485Turbocache Modules are
connected to an i486 processor system, the
processor's internal cache should map the entire
address space, including that of the 485Turbocache
Modules, to provide the highest performance. This is
the most efficient configuration. The i486 CPU can
access a line from its internal cache in one clock and the
485Turbocache Module provides the next fastest access
in two clocks for the first doubleword and the
remaining three doublewords in three clocks.


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No matter how many 128-kbyte modules are cascaded, 
the set and tag addresses are connected to the same pins
on the 485Turbocache Module. The processor's address
bits A2-A31 are connected to A2-A31 on the
485Turbocache Module. Internally, address bits A4-A15
are sent to both sets, to select one of 4,096 locations.
Because the cache is two-way set associative, each
address points to information stored in two banks.
On each read or write cycle, the value of A16-A31 is
compared to the tags stored at the location addressed
by A4-A15. If they are equal, and if the valid bit is set,
then a hit occurs. If a read cycle is in progress, then the
485Turbocache Module returns data to the i486 CPU.
If the hit cycle is a write cycle, then the new data is
updated in the 485Turbocache Module.
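
The address split and hit test described above can be sketched in a few lines of Python. This is only a behavioral illustration, not the module's actual logic: the names `split_address` and `TwoWayCache`, the list-based storage, and the explicit `way` argument to `fill` are all assumptions made for the example, and the replacement policy is left out.

```python
LINE_BYTES = 16   # one cache line holds 16 bytes (four doublewords)
NUM_SETS = 4096   # A4-A15 select one of 4,096 locations
NUM_WAYS = 2      # two-way set associative


def split_address(addr):
    """Split a 32-bit address into (tag, set_index, dword_in_line)."""
    dword = (addr >> 2) & 0x3         # A2-A3: doubleword within the line
    set_index = (addr >> 4) & 0xFFF   # A4-A15: set address
    tag = (addr >> 16) & 0xFFFF       # A16-A31: tag address
    return tag, set_index, dword


class TwoWayCache:
    """Hypothetical model: each way keeps a valid bit and a tag per set."""

    def __init__(self):
        self.valid = [[False] * NUM_SETS for _ in range(NUM_WAYS)]
        self.tags = [[0] * NUM_SETS for _ in range(NUM_WAYS)]

    def fill(self, addr, way):
        """Record that the line containing addr now lives in the given way."""
        tag, set_index, _ = split_address(addr)
        self.valid[way][set_index] = True
        self.tags[way][set_index] = tag

    def is_hit(self, addr):
        """A hit needs a valid entry whose stored tag equals A16-A31."""
        tag, set_index, _ = split_address(addr)
        return any(
            self.valid[w][set_index] and self.tags[w][set_index] == tag
            for w in range(NUM_WAYS)
        )
```

With these constants, two ways of 4,096 lines of 16 bytes each give the 128 kbytes quoted for one module.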


When multiple 485Turbocache Modules are used, the 
chip select starts by decoding A16 onwards. For exam- 
ple, with a 256-kbyte cache A16 and A17 are decoded 
for generating the CS#. The set and tag addresses of a
system with four 485Turbocache Modules are shown in
Figure 1.


[Block diagram labels legible in the source: SYSTEM KEN#, SKEN#, SYSTEM START#, per-module START# outputs, 485Turbocache Module.]
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Figure 1. Multiple 485Turbocache Module Configuration 




The BRDYO# output and the CBRDY# input must
be used in forming the i486 CPU's BRDY# input.
Similarly, the CRDY# input must be used in forming
the i486 CPU's RDY# input. Signals that are common
to the i486 CPU and the 485Turbocache Module
include BOFF#, BLAST#, EADS#, BE0#-BE3#,
and DP0-DP3.


The memory system generates KEN# to the i486 CPU
when read data needs to be cached. The 485Turbocache
Module receives this signal as the SKEN# input and
produces CKEN# when appropriate. The
485Turbocache Module's CKEN# output can be used
in the formation of the KEN# input to the i486 CPU.
CKEN# can be used in conjunction with other logic
that can deassert KEN# to the CPU when the system
wants the current line fill to be cached by the
485Turbocache Module and not cached in the i486
CPU. The CKEN# signal is always asserted in T1, but
is then deasserted if CS# is inactive.


The 485Turbocache Module connects directly to the
i486 CPU's address lines A2-A31. The designer may
have to add external buffers to the address outputs,
depending upon the loading. Other signals connected to
the i486 CPU include the burst control signals, the bus
cycle definition signals, the byte enables, the ADS#
signal, and the data and parity signals. The
485Turbocache Module and CPU connections are
shown in Figure 2. The 485Turbocache Module main
memory controller and bus controller interface are
shown in Figure 3.


Read Hit Cycles 


A read hit cycle occurs when requested data is present
in the 485Turbocache Module. The i486 CPU attempts
to retrieve the entire line from the 485Turbocache
Module without incurring wait states. This may be
accomplished by activating the KEN# input at the end
of T1 (the clock in which ADS# becomes active).
There is very little time to decode the address, generate
the KEN# signal to the i486 CPU, and complete a
zero wait state read operation. Because KEN# is
sampled twice, it is possible to always assert KEN# in T1
and to wait until the end of a line fill to decide whether
the data is cacheable. (See Section 3.2.)


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Figure 2. 485Turbocache Module and i486™ CPU Connections 
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Figure 3. 485Turbocache Module and Main Memory Connections 


CKEN# is used in the formation of the KEN# signal
to the i486 CPU. Therefore, CKEN# is always
activated in T1 (see Figure 4 and Figure 5). If a read hit
occurs, data can be sent to the i486 CPU in zero wait
states and can still be cached in the processor's on-chip
cache. The 485Turbocache Module asserts CKEN#,
which remains asserted for the duration of the read hit
cycle (unless WPSTRP# is low and the line is write
protected). This means that the i486 CPU will cache
the entire line unless external logic is added to cause the
KEN# signal to be sampled high in the clock before
the last BRDYO# from the 485Turbocache Module.


If the CKEN# output from the 485Turbocache Module
is connected directly to the KEN# input of the i486
CPU, then the CPU will always sample KEN# active
at the end of T1. To deassert KEN# to the processor,
the system must create another signal that is used in the
formation of the i486 CPU's KEN# and the
485Turbocache Module's SKEN#. Using this
technique, a non-cacheable, non-burst cycle can be
performed.


The BRDY# signal to the i486 CPU can be generated
from many sources. Therefore, the various signals
should be logically "ORed" to generate the actual i486
BRDY# input.


On a cache read hit, the 485Turbocache Module
generates a BRDYO# signal for each of the doublewords it
transfers. The 485Turbocache Module asserts
BRDYO# in the first T2 cycle, and BRDYO# remains
asserted for the duration of the burst. If the i486 CPU
either terminates a burst early or fails to generate a
burst cycle as defined by BLAST#, the 485Turbocache
Module will deassert BRDYO# after the i486 CPU has
sampled the required data.


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Write Cycles and I/O Cycles 


The 485Turbocache Module is a write-through cache,
so main memory is updated with every write hit or
miss. The 485Turbocache Module is not required to
generate a ready signal to the i486 CPU for write
cycles. However, it does perform a comparison and
updates the cache memory when a write hit occurs
(provided the location isn't write protected). The
485Turbocache Module is not updated on write misses.
The timings for write operations are shown in Figure 4
and Figure 5.


[Timing diagram: WRITE MISS. Signals: ADDR, W/R#, M/IO#, BLAST#, CKEN#, START#.]
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Figure 4. Read Hit-Write 
[Timing diagram: READ MISS (1 CLK BURST).]
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Figure 5. Write-Read Miss 


Because the 485Turbocache Module is a write-through
cache, writes are immediately forwarded to the system.
If a processor write occurs on a valid entry that is not
write protected, the new data will be stored into the
memory in zero wait states. The 485Turbocache
Module will not generate a ready signal. It is the system's
responsibility to update the system memory on all
writes and to terminate all cycles with a ready signal.
Even after the 485Turbocache Module has completed
its internal write update, it remains idle until the system
returns a ready to the processor.


A cache location can be write protected by asserting the
WP input to the 485Turbocache Module. The WP
signal must be valid during the third BRDYO# or RDY#
of a cache line fill cycle. It sets a state bit within a
particular cache location and remains in effect until the
bit is invalidated. Tying WPSTRP# low prevents
the write protected entry from being cached by the i486
CPU in subsequent accesses. The entry can be
invalidated by any of the following: a flush operation, a
reset operation, an invalidation cycle, or an LRU
replacement.


When an i486 CPU cycle produces a write hit to a
write-protected 485Turbocache Module location, data
in the cache is not modified. Whether or not the write
hit location is write protected, the 485Turbocache
Module responds in the same way, by asserting the
START# signal. It is the designer's responsibility to
prevent inconsistencies between the 485Turbocache
Module and main memory when using the WP signal.
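
The write behavior described in this section can be summarized as a small Python sketch. It is a simplified model under stated assumptions, not the module's implementation: the function name `write_cycle` and its arguments are hypothetical, a single value stands in for the addressed data, and main memory is assumed to be updated by the system on every write.

```python
def write_cycle(cached, write_protected, hit, new_data):
    """Model one CPU write; returns (cached_after, memory_after).

    cached          -- value held by the cache for this address, or None
    write_protected -- True if the cached line has its WP bit set
    hit             -- True if the write address hits in the cache
    new_data        -- value written by the CPU
    """
    # Write-through: the system updates main memory on every write,
    # whether the cache sees a hit or a miss.
    memory = new_data

    # The cache copy is updated only on a hit to a line that is not
    # write protected. Misses do not allocate a line.
    if hit and not write_protected:
        cached = new_data

    return cached, memory
```

For example, `write_cycle(1, True, True, 9)` returns `(1, 9)`: the cache keeps the old value while memory takes the new one, which is the kind of inconsistency the designer has to prevent when WP is used on writable memory.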


The 485Turbocache Module ignores all I/O cycles.
When an I/O cycle is executed by the i486 processor,
the system responds and terminates the cycle. The
485Turbocache Module does not assert the START#
signal for I/O accesses, and the system should monitor
the M/IO# signal rather than wait for the assertion of
the START# signal.
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System Cacheability Indication 


The 485Turbocache Module uses the cache enable
scheme of the i486 CPU. A cache update to the
485Turbocache Module requires activating the
SKEN# signal. The signal is sampled twice, first on
the rising clock edge before the first ready signal from
BRDY# or RDY#, and again on the rising clock edge
before the last ready. If SKEN# was deasserted at
either of the specified sample times, then the access is
considered non-cacheable. SKEN# is ignored during
write cycles.
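
The two-sample rule can be written out as a short Python sketch. The function name and argument names are hypothetical, and pin levels are modeled as integers with the active-low convention (0 means SKEN# is asserted).

```python
def line_fill_cacheable(sken_n_before_first_ready, sken_n_before_last_ready):
    """Return True if a read line fill is treated as cacheable.

    Both arguments are SKEN# pin levels (active low: 0 = asserted,
    1 = deasserted), taken at the two sample points named in the text.
    """
    asserted_first = sken_n_before_first_ready == 0
    asserted_last = sken_n_before_last_ready == 0
    # Deassertion at either sample point makes the access non-cacheable.
    return asserted_first and asserted_last
```

This mirrors the i486 CPU's own KEN# scheme, which is why the same decode logic can often drive both inputs.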


Typically, the system will use the same logic to
generate the i486 CPU's KEN# signal and the
485Turbocache Module's SKEN# signal. However, it
is not necessary for both to be asserted during an
access. It is possible to use different caching maps for
the CPU cache and the 485Turbocache Module cache
because the i486 CPU and the 485Turbocache Module
each maintain their own cache contents via snooping.


Cascadable Cache 


The 485Turbocache Module can be cascaded to config- 
ure a deeper cache memory for the processor. Up to 
four can be used to provide as much as 512 kbyte of 
cache. 


System Control Signals and Cascadable 
Caches 


The START# signal used by memory is the logical OR
of the individual 485Turbocache Module START#
outputs. If any cache has information that is needed by
the processor, then its START# signal is at a high
level, and it inhibits the main memory START# signal
(as there is no need to access the main memory). If
needed data is not present in any of the 485Turbocache
Modules, then the START# signals are low, and main
memory data is accessed.
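
As a sketch of the combination just described, the following Python models the START# outputs as pin levels (1 = high, 0 = low) and ORs them. The function names are hypothetical and the model follows only the wording above; it does not describe how an unselected module drives its output.

```python
def system_start_level(module_start_levels):
    """OR the START# pin levels of all modules (1 = high, 0 = low)."""
    return int(any(module_start_levels))


def main_memory_accessed(module_start_levels):
    """Main memory is accessed only when every START# output is low."""
    return system_start_level(module_start_levels) == 0
```

With four modules, `main_memory_accessed([0, 0, 1, 0])` is `False` because one module holds its START# high, while `main_memory_accessed([0, 0, 0, 0])` is `True`.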


The KEN# input to the i486 processor should be a 
logical OR for each of the 485Turbocache Modules and 
for a memory controller output. The memory control- 
ler output can be asserted high to indicate that the in- 
formation to the i486 CPU is non-cacheable. 


The SKEN# signal is the cache input to the 
485Turbocache Module. The memory controllers must 
assert SKEN# when a transfer to the 485Turbocache 
Module is cacheable. The SKEN # inputs for all of the 
485Turbocache Modules must be tied together. The 
controller that has its CS# asserted determines which 
cache will receive the information. 
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The EADS# signal from the memory controller must 
be connected to the i486 CPU and to all of the 
485Turbocache Modules. In this way, invalidation cy- 
cles are executed in all the 485Turbocache Module de- 
vices simultaneously. 


The entire memory space is covered in a single cache or
a cascaded cache configuration. When multiple
485Turbocache Modules are used, only one
485Turbocache Module is selected by asserting the
CS# pin.


For example, TA0 through TA15 are always
connected to A16 to A31. In the configuration with one
485Turbocache Module, the chip select is grounded. In
the two 485Turbocache Module configuration, A16 is
used to decode between the two caches. In the four
485Turbocache Module configuration, A16 and A17
are used to generate the CS# signals.
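
A minimal Python sketch of this chip select decode follows. The function name `chip_selects` is hypothetical, and the assignment of decoded values to particular modules (module 0 for value 0, and so on) is an assumption; a real design would do this in a PLD.

```python
def chip_selects(addr, num_modules):
    """Return the CS# level for each module (active low: 0 = selected).

    One module:   CS# is always low (grounded).
    Two modules:  A16 selects between them.
    Four modules: A16 and A17 select one of four.
    """
    if num_modules not in (1, 2, 4):
        raise ValueError("expected 1, 2, or 4 modules")
    # Take as many address bits, starting at A16, as the module count needs.
    selected = (addr >> 16) & (num_modules - 1)
    return [0 if i == selected else 1 for i in range(num_modules)]
```

For instance, `chip_selects(0x00030000, 4)` returns `[1, 1, 1, 0]`, so exactly one module sees CS# low for any address.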


3.0 PROCESSOR FEATURE REVIEW 


The improvements made to the CPU bus interface obvi- 
ously impact the memory subsystem design. It is im- 
portant to understand the impact of these features be- 
fore attempting to define the system. This section is a 
review of the bus features which affect the memory
interface. The features and their impact on memory
system design are discussed.


3.1 The Burst Cycle 


The i486 CPU's burst bus cycle feature has more im-
pact on the memory logic than any other feature. It is 
the most significant departure from previous bus archi- 
tectures. A large portion of the control logic is dedicat- 
ed to supporting this feature. The second level cache is 
also primarily dedicated to supporting burst cycles. 


To understand why the logic is designed this way, we
must first understand the function of the burst cycle.
Burst cycles are generated by the CPU if, and only if,
two events occur. First, the CPU must request a cycle
which is longer in bytes than the data bus can
accommodate. Second, the BRDY# signal must be
activated to terminate the cycle. When these two events
occur a burst cycle will take place. Note that this cycle
will occur regardless of the state of the KEN# input.
The KEN# input's function is discussed in the next
section.
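
The two conditions can be expressed as a one-line predicate. The Python below is only an illustration; the function and parameter names are hypothetical.

```python
def is_burst_cycle(request_bytes, bus_width_bytes, terminated_with_brdy):
    """Return True when both conditions for a burst cycle hold.

    request_bytes        -- length of the cycle requested by the CPU
    bus_width_bytes      -- width of the data bus for this access
    terminated_with_brdy -- True if BRDY# (rather than RDY#) is returned
    """
    longer_than_bus = request_bytes > bus_width_bytes
    return longer_than_bus and terminated_with_brdy
```

A 16-byte request on the 4-byte bus answered with BRDY# bursts, and so does a 4-byte write to an 8-bit device answered with BRDY#. The same requests answered with RDY# do not.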


With this definition we see that several cases are
included as "burstable". Some examples of burstable
cycles are listed in Table 3. The length of each cycle is
shown in bytes to clarify the case listed.
Table 3

[Table body not recoverable from the source. Column heading that remains legible: Burst Bus Cycle.]


The last case shows that write cycles are burstable. In
this case a write cycle is transferred on an 8 or 16 bit
bus. If BRDY# is returned to terminate this cycle the
CPU will generate another without activating ADS#.


Using the burst write feature has debatable
performance benefit. Some systems may implement
special functions which benefit from the use of burst
writes. However, the i486 CPU does not write cache
lines. Therefore, all write cycles are 4 bytes long. Also,
most of the devices which use dynamic bus sizing are
read only. This fact further reduces the utility of burst
writes.


Due to these facts, the design example used here does 
not implement burst write cycles. In fact, the BRDY # 
input is only asserted during main memory read cycles 
and cache hit cycles. RDY# is used to terminate all 
memory write cycles. RDY # is also used for all cycles 
which are not in the memory subsystem or are not ca- 
pable of supporting burst cycles. The RDY# input is 
used, for example, to terminate an EPROM or I/O cy- 
cle. 


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3.2 The KEN # input 


The primary purpose of the KEN# input is to
determine whether a cycle is to be cached. Only read data
and code cycles can be cached. Therefore, these cycles
are the only cycles affected by the KEN# input.

Figure 6 shows a typical burst cycle. In this sequence
the value of KEN# is important in two different
places. First, to begin a cacheable cycle KEN# must be
active the clock before BRDY# is returned. Second,
KEN# is sampled the clock before BLAST# is active.
At this time the CPU determines whether this line will
be written to the cache.


The state of KEN# also determines when read cycles
can be bursted. Most read cycles are initiated as 4 bytes
long from the CPU's cache unit. When KEN# is
sampled active the clock before BRDY# or RDY# is
returned, the cycle is converted to a 16 byte cache line fill
by the bus unit. This way, a cycle which would not
have been bursted can now be bursted by activating
BRDY#.


Some read cycles can be bursted without activating 
KEN#. The most prevalent example of this type of 
read cycle is code fetches. All code fetches are generat- 
ed as 16-byte cycles from the CPU’s cache unit. So, 
regardless of the state of KEN#, code fetches are al- 
ways burstable. In addition, several types of data read 
cycles are generated as 8-byte cycles. These cycles, 
mentioned previously, are descriptor loads and floating 
point operand loads. These cycles can also be bursted at 
any time. 
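
The effect of the two KEN# sample points on a read cycle can be sketched as follows. This Python is a simplified model built from the description in this section, with a hypothetical function name; KEN# is passed as a boolean meaning "sampled active".

```python
def resolve_read(request_bytes, ken_at_first_sample, ken_at_last_sample):
    """Return (cycle_bytes, cached) for a read cycle.

    request_bytes       -- length requested by the cache unit (4, 8, or 16)
    ken_at_first_sample -- KEN# active the clock before the first ready
    ken_at_last_sample  -- KEN# active the clock before BLAST# is active
    """
    if ken_at_first_sample:
        # The bus unit converts the cycle to a 16-byte cache line fill.
        cycle_bytes = 16
        # The second sample decides whether the line is written to the cache.
        cached = ken_at_last_sample
    else:
        # No conversion: the cycle keeps its requested length, not cached.
        cycle_bytes = request_bytes
        cached = False
    return cycle_bytes, cached
```

A 16-byte code fetch with KEN# inactive stays 16 bytes long, so it can still be bursted with BRDY#, but it is not cached.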
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Figure 6. Typical Burst Cycle 
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It’s obvious that the use of the KEN# input affects 
performance. The design example used here illustrates 
one way to use this signal effectively. 


The primary concern when using KEN# is generating
it in time for zero wait state read cycles. Most main
memory cycles will be zero wait state if a second level
cache is implemented. In this example, the main
memory is one wait state during most read cycles. Any
cache access will take place with zero wait states. KEN#
must, therefore, be valid during the first T2 of any read
cycle.


Once this requirement is established, a problem arises.
Decode functions are inherently asynchronous.
Therefore, the decoded output which generates KEN# must
be synchronized. If not, the setup and hold times of the
CPU will be violated and internal metastability will
result. With synchronization, the delay required to
generate KEN# will be at least three clocks. In this example
4 clocks are required. In either case the KEN# signal
will not be valid before BRDY# is returned for zero or
one wait state cycles.


This problem is resolved if KEN# is made normally
active. Figure 7 illustrates this function. In this diagram
KEN# is active during the first two clocks of the burst
cycle. If this is a data read cycle, KEN# being active at
this time causes it to be converted to a 16 byte length.
The decode and synchronization of KEN# takes place
during the first two T2 states of the cycle. If the cycle
turns out to be non-cacheable, KEN# will be
deactivated in the third T2. Otherwise KEN# will be left
active and the data retrieved will be written to the
cache.


Some memory devices may be slow enough that 16-byte
cycles are undesirable. In this case more than three
wait states will exist. The KEN# signal can be
deactivated prior to returning RDY# or BRDY# if three or
more wait states are present. As a result these slow
cycles will not be converted to 16-byte cache line fills.


3.3 Bus Characteristics 


The internal cache causes other effects which impact
the memory subsystem design. Perhaps the most
obvious of these is the effect on bus traffic. The fact that the
internal cache uses the write-through policy
dramatically increases the number of write bus cycles. Figure 8
illustrates this effect. The top chart shows the bus cycle
mix for an application executed with the 386DX CPU.
The bottom chart shows the same application executed
with the i486 CPU. The percentage of write bus cycles
jumps to 70% from 30% when this application is
executed with the i486 CPU.


It seems intuitively obvious that many of these write
cycles would be consecutive. In fact, 70% of all write
cycles are consecutive. Furthermore, 50% of all write
cycles occur three in a row. It is obvious from these
statistics that optimizing the memory subsystem for
write cycles can improve performance. But it is
important to optimize the memory system for consecutive
write cycles. Improving individual write cycle latency
will not buy much performance if subsequent write
cycles suffer.


A technique called write posting proves ideal for this
purpose. This technique allows consecutive write cycles
to be overlapped. It also allows write cycles to be
overlapped with second level cache cycles and reduces
overall write miss latency.
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Figure 7. Burst Cycle KEN Normally Active 
Using the write posting technique adds complexity to
the system logic. It is therefore valid to ask what
performance improvement is gained by using this
technique. This question is especially pertinent when we
consider the logic already implemented in the i486
CPU to improve write performance. The internal i486
write buffers decouple the processor execution unit
from the external bus.


Analysis has shown that, in general, a 6% degradation in
performance can be expected for every additional wait
state added to write cycles. This analysis was
performed by measuring the CPU clocks required to
execute several applications.


The same analysis has shown that write posting reduces
average write latency to 2.5 clocks. Without write
posting, average write latency is 4 clocks. From this
data we can conclude that approximately 9% performance
improvement can be obtained by using write posting. This
improvement may increase due to other effects. These
effects, such as overlapping write cycles with cache
reads, are discussed in subsequent sections.
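
The 9% estimate follows from the two measurements quoted
above. The short Python check below reproduces it. The
function name and the assumption that the 6% figure scales
linearly with the clocks saved per write are illustrative;
the text gives only the per-wait-state figure and the two
latencies.

```python
# Rough check of the write-posting estimate from the figures in the text.
# Assumption: the 6%-per-wait-state degradation scales linearly with the
# average number of clocks saved per write cycle.

DEGRADATION_PER_WAIT_STATE = 0.06   # from the text
LATENCY_WITHOUT_POSTING = 4.0       # clocks, from the text
LATENCY_WITH_POSTING = 2.5          # clocks, from the text


def estimated_improvement(without, with_posting, per_wait_state):
    """Return the estimated fractional performance improvement."""
    clocks_saved = without - with_posting
    return clocks_saved * per_wait_state


improvement = estimated_improvement(
    LATENCY_WITHOUT_POSTING, LATENCY_WITH_POSTING, DEGRADATION_PER_WAIT_STATE
)
print(f"Estimated improvement: {improvement:.0%}")
```

Saving 1.5 clocks per write at 6% per clock gives the
approximately 9% quoted above.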

Figure 9. KEN# Logic for Second-Level Cache




4.0 DRAM INTERFACE OVERVIEW 


The i486 CPU bus interface unit integrates several
functions which improve the memory access rate.
These features must be supported by the memory
subsystem to provide the intended performance benefit.
The memory subsystem example described here supports
them. The example also includes logic support for a
second-level cache. An overview of the subsystem is
presented in this section. Details of the function and
logic design of this subsystem are presented in later
sections.


This subsystem follows a modular design. Only minor
changes to particular logic sections are needed to
implement variations. For instance, the PLD which
generates the CAS# signal needs only minor changes to
support Static Column mode DRAMs. It is also simple to
implement a non-interleaved DRAM controller based
on this design.


Other possible optimizations will be pointed out
throughout the discussion. This first section
summarizes the features and functions of the design
example.


4.1 Functional Blocks 


Two common design techniques are employed in
interfacing the i486 CPU to DRAMs. The first,
interleaving, is used to support the burst bus feature.
The second, write posting, is used to reduce write cycle
latency. Both techniques improve performance, and
without them, performance is degraded by the access
requirements of currently available DRAMs.

Interleaving can be implemented in several ways. Here,
alternate 32-bit DRAM banks are accessed. The bank
accessed is determined by the value of A2. In this
way, even DWORDs (A2=0) are stored in one bank
while odd DWORDs (A2=1) are stored in the other.
When data is retrieved from memory during a cache
line fill, cycles are overlapped to allow single clock
DWORD accesses. Timing of this operation is detailed
in the next section.
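
The bank selection rule can be expressed in a few lines of
Python. This is only a behavioral sketch; the function
names are hypothetical and are not part of the design
example.

```python
# Behavioral sketch of two-way interleaving on address bit A2.
# Function names are illustrative, not taken from the design example.


def bank_for_address(address):
    """Return 0 for even DWORDs (A2 = 0) and 1 for odd DWORDs (A2 = 1)."""
    return (address >> 2) & 1


def banks_for_line(line_base):
    """Banks holding the four DWORDs of a 16-byte line, in ascending order."""
    return [bank_for_address(line_base + 4 * i) for i in range(4)]


if __name__ == "__main__":
    for addr in (0x1000, 0x1004, 0x1008, 0x100C):
        print(hex(addr), "-> bank", bank_for_address(addr))
```

Because consecutive DWORDs always fall in different banks,
one bank can be accessed while the other is still
delivering data.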


A multiplexor alternates data flow between the DRAM 
banks and the appropriate data path is selected accord- 
ing to the value of A2. The multiplexor prevents bus 
contention. 


With write posting, bus cycles are again overlapped to
reduce latency. Figure 10 illustrates how this technique
is applied within the write cycle. The RDY# signal
terminates the cycle in the clock after ADS# becomes
active. This creates a zero-wait-state write cycle, the
fastest possible.


When the cycle terminates, however, data must still be
written to memory. The delay allows additional
DRAM access time. Figure 10 shows that data is
actually written to memory two clocks after RDY# is
returned to the CPU. The CAS# signal completes the
write cycle four clocks after it is started by the CPU.


Write data and address registers support the posted
write function by holding write data and address after
RDY# is returned to the CPU. These registers are
required to allow the CPU to start another cycle
immediately following the first (see Figure 10). ADS# is
activated in the clock after RDY# is returned to the CPU.
This cycle starts before the first is complete, and the
cycles overlap by two clocks.

Figure 10. Write Posting

In effect the write cycle completes in two clocks. Write
cycles can be overlapped in this manner indefinitely.
The timing and logic required to support this function
are described in Section 5.3.
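
The overlap of back-to-back posted writes can be sketched
as a simple timeline. The model below uses only the clock
counts given above (a two-clock cycle as seen by the CPU,
and a DRAM write that finishes four clocks after the cycle
starts). The names are hypothetical and the model ignores
refresh and page misses.

```python
# Timeline sketch of back-to-back posted writes, using the clock counts
# in the text. Names are illustrative; refresh and page misses are ignored.

CPU_CYCLE_CLOCKS = 2    # ADS# clock plus the clock in which RDY# is returned
DRAM_WRITE_CLOCKS = 4   # CAS# completes the write four clocks after the start


def posted_write_timeline(count):
    """Return (start, cpu_done, dram_done) clock numbers for each write."""
    timeline = []
    for n in range(count):
        start = n * CPU_CYCLE_CLOCKS
        timeline.append(
            (start, start + CPU_CYCLE_CLOCKS, start + DRAM_WRITE_CLOCKS)
        )
    return timeline


def overlap_clocks(first, second):
    """Clocks during which the second write runs before the first completes."""
    return max(0, first[2] - second[0])


if __name__ == "__main__":
    for entry in posted_write_timeline(3):
        print("start=%d cpu_done=%d dram_done=%d" % entry)
```

Each new write starts two clocks before the previous one
finishes at the DRAM, which is why the write data and
address must be held in registers.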


Address registers also support invalidation with the
AHOLD signal. They are required if AHOLD is activated
when bus cycles are in progress, to hold the current
address while the bus cycle completes.


The efficient CPU interface and invalidation support
make this DRAM subsystem well-suited for use with
an optional cache. The memory system includes specific
functions designed to support the optional
485Turbocache module. The subsystem supports 256K x 4
and 1Mbyte x 1 DRAM configurations. The minimum
memory configuration is 2 Mbytes with 256K x 4
devices; the maximum is 16 Mbytes with 1Mbyte x 1
devices. Additional banks can be added to increase the
memory capacity.


The control logic for this example is implemented with
EPLDs. The modular approach allows quick modification
so that the example can be tailored for specific
implementation requirements.


The control state machine is distributed among the var- 
ious EPLDs, and each functional block receives control 
input from other blocks. In addition most of the func- 
tional blocks are implemented as state machines. 


Figure 11a is a top-level block diagram of the memory
system. This diagram depicts the sections of logic that
will be described subsequently. We will first discuss the
address path logic.

Figure 11a. Memory Subsystem Block Diagram


4.2 Address Path Logic 


Unlike processors without on-chip caches, the address
bus of the i486 processor is bidirectional. The address
pins serve as inputs whenever external memory is
changed by DMA or another CPU. The address is driven
into the CPU to invalidate the corresponding cache
entry if present.


Invalidation of the i486 CPU's internal cache can be
performed in several different ways. This example
supports invalidation cycles during a memory access.


As described in the previous section, AHOLD is used
to perform the invalidation function. AHOLD tristates
the i486 address bus. Address registers must be used to
hold the address to allow the current bus cycle to be
completed. These registers hold the current address
when AHOLD is activated.


The registers shown in Figure 11b hold the entire row 
and column address, as well as the current byte enables 
and control definition. These signals are latched at the 
rising clock edge of the first T2 of a bus cycle. They 
must be held from this edge to allow zero wait state 
write cycles. 

Figure 11b. Address Path Logic


Registers with enable inputs are needed. The enable
input can select the CLK edge appropriate for latching
the address and control state. The control logic
generates the enable signal ALD, which disables the CLK
input of the registers during a bus cycle. When ALD is
active (high) the current row and column addresses are
held in the registers. 74AS823 registers have enable
inputs and are used in this example.


An additional address register is required for posted
write cycles. This register holds the write column
address. The address is latched only on write cycles and
is held until the write cycle completes at the DRAM.


Separate write and read address paths are implemented 
with a 3 to 1 address multiplexor. The read address 
path is required to meet the timing of a three CLK read 
cycle. In this case the read address must propagate 
through the address mux one CLK sooner than the 
write address. If the initial read access is 4 CLKs long 
the read and write address paths can be combined. See 
section 5.1 for a complete description of read cycle tim- 
ing. The third address path is for the row address. 


A delay line is used to meet the DRAM row address
hold time requirement (tRAH). The RAS# signal is
delayed 20 ns to create the DRAS# signal. This signal is
used as the multiplexor path select input. When
DRAS# is inactive (high), the multiplexor always
selects the row address path. When DRAS# is active
(low), the mux enable signal (MEN0# or MEN1#)
controls whether the read path or the write path is
selected.
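
The path selection just described can be summarized as a
small truth function. This is a behavioral sketch only.
Signals are modeled as booleans holding the pin level
(True = high), so an active-low signal is active when
False. The polarity used for the mux enable (active low
selects the write path) is taken from the later statement
that the read path is selected when the mux enable signals
are not active; the function name is hypothetical.

```python
# Behavioral sketch of the 3-to-1 address multiplexor select logic.
# Arguments hold pin levels: True = high, False = low.


def select_address_path(dras_n, mux_en_n):
    """Return which address path the multiplexor passes to the DRAM."""
    if dras_n:
        # DRAS# inactive (high): the row address is always selected.
        return "row"
    if not mux_en_n:
        # DRAS# active and mux enable active (low): write column address.
        return "write"
    # DRAS# active and mux enable inactive (high): read column address.
    return "read"


if __name__ == "__main__":
    for dras_n in (True, False):
        for mux_en_n in (True, False):
            print(dras_n, mux_en_n, select_address_path(dras_n, mux_en_n))
```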


The comparator and register combination is connected
to the row address path to generate the HIT# signal.
This signal indicates that the current cycle's address is
in the same DRAM row as that of the previous cycle
and also determines whether RAS# will be deactivated.


In this example a standard component designed
specifically for this purpose is used. This component
contains a register and a comparator. The register in
this component holds the previous row address. When a
bus cycle occurs to a new DRAM row, the new row address
is latched. The RALE signal enables the row address
latch.


The timing of this component meets the requirements
of a 33 MHz CPU clock. Discrete registers and
comparators can be used to improve the timing of the
HIT# signal, if desired.
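
The behavior of the register and comparator pair can be
modeled in software as follows. This is a sketch of the
function only, not of the component's timing; the class
and method names are hypothetical.

```python
# Behavioral model of the row-address register and comparator that
# produce the page-hit indication. Names are illustrative.


class RowComparator:
    """Remembers the last row address and reports page hits."""

    def __init__(self):
        self.latched_row = None  # nothing latched after reset

    def hit(self, row):
        """True when the row matches the previously latched row."""
        return self.latched_row is not None and row == self.latched_row

    def latch(self, row):
        """Model the row address latch enable: capture a new row address."""
        self.latched_row = row


def access(comparator, row):
    """Perform one access; latch the row on a miss and return the hit flag."""
    is_hit = comparator.hit(row)
    if not is_hit:
        comparator.latch(row)
    return is_hit


if __name__ == "__main__":
    cmp_ = RowComparator()
    print([access(cmp_, row) for row in (5, 5, 6, 6, 5)])
```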


The last important address logic component is the burst
address generator. This state machine generates A3 and
A2 during burst accesses and is needed to achieve zero
wait state performance during burst cycles. It predicts
the value of A2 and A3. Section 5.6 contains a
complete description of the burst cycle timing.

Note that because interleaving is used, A3 is the lowest
order DRAM address. Two A3 equivalent signals are
generated, one for Bank 0 (B0A0) and one for Bank 1
(B1A0). These signals are connected directly to the
DRAM devices to meet critical timing requirements.
The signals must also reflect the lowest order row
address during miss cycles. A13 is, therefore, an input
to this logic. It is the lowest order row address
when 1MBx1 DRAMs are used.


4.3 Data Path 


A2 must also be predicted during burst read accesses. 
For this purpose, the burst address logic creates the 
DATASEL signal. DATASEL reflects the value of A2 
for each access of a burst cycle and is used to control 
the data multiplexor as shown in Figure 12. 


During burst cycles, the data multiplexor alternates
between the bank 0 and bank 1 data paths. A2 must
alternate states each clock for interleaving to function
properly. The i486 CPU's burst address sequence is
defined such that A2 changes state on every access.
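
The property that A2 toggles on every access can be
checked against the i486 burst order. The sketch below
expresses that order as an XOR of the starting offset with
0, 4, 8 and 12, which is a compact way of writing the
sequences 0-4-8-C, 4-0-C-8, 8-C-0-4 and C-8-4-0. The
function names are hypothetical.

```python
# Sketch of the i486 burst address order for a 16-byte line fill and the
# resulting A2 pattern. Function names are illustrative.


def burst_sequence(start_address):
    """Return the four DWORD addresses of a burst, in transfer order."""
    base = start_address & ~0xF      # 16-byte line base
    offset = start_address & 0xC     # starting DWORD offset within the line
    return [base | (offset ^ step) for step in (0x0, 0x4, 0x8, 0xC)]


def a2_values(start_address):
    """Return the value of A2 for each access of the burst."""
    return [(addr >> 2) & 1 for addr in burst_sequence(start_address)]


if __name__ == "__main__":
    for start in (0x100, 0x104, 0x108, 0x10C):
        seq = [hex(a) for a in burst_sequence(start)]
        print(hex(start), seq, a2_values(start))
```

Whatever the starting address, A2 alternates on each
access, so consecutive transfers always come from
different banks.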


A2 also selects the bank to which data is written. Data
path logic is not involved in steering data during writes.
Figure 12 shows separate data registers for each bank.
Separate registers are only required to divide the data
paths. These registers hold the same write data on every
write cycle. The CAS# and WE# (write enable) signals
control doubleword and byte steering.


Because of write data timing, the data registers must
have the enable function. This function can be used to
select the clock upon which data is latched. The
processor clock can be used as the register clock input
to guarantee proper data setup and hold times.


As Figure 12 indicates, the MRDY# signal enables the
write data registers and terminates memory write
cycles. Data is therefore latched during the last clock
of any write cycle.


MRDY# is restricted to write cycles while the
MBRDY# signal is used for read cycles. The need for
these signals illustrates the convenience of the CPU's
dual-ready inputs. The MBRDY# signal enables the
output of the data path multiplexor to prevent bus
contention.


These ready signals are combined with similar system
logic signals to form the processor RDY# and
BRDY# inputs. I/O, peripheral and other non-burst
devices can use the RDY# input. Burst devices, such
as a second level cache controller, must also use the
BRDY# input. The MBRDY# and MRDY# signals
are, therefore, used only with the DRAM control logic.
They are isolated from the rest of the system by
combinatorial logic.

Figure 12. Data Path Logic


4.4 Second Level Cache Support


Second level cache strategies for the i486 CPU are
diverse and application dependent. The example
described illustrates a second level cache strategy that
is ideal for single CPU systems.


The 485Turbocache second level cache used in this
example is optional. It complements the i486 internal
cache to improve performance when running complex
applications and operating systems. Some users will not
require the extra performance. Since the cache is
optional, OEMs or end users can decide whether it
should be included. System board design and
manufacturing costs are thus eased since one system
board supports multiple performance requirements.


The 485Turbocache is a completely self-contained
cache module. Optionality is accomplished by including
control logic, tag RAM and data RAM in one package.
A socket is added to the system board in much the
same manner as a math coprocessor socket. In systems
which, for example, run UNIX, the cache module is
simply plugged in.

This option must, of course, be supported by the system
logic. Specifically, the memory control logic is directly
interfaced to the cache module. The DRAM controller
example described here is particularly well-suited for
this cache configuration.


The support included in the memory control logic for
the 485Turbocache module is illustrated in Figure 13.
Since the 485Turbocache is a write-through cache,
provision must be made for read cycles. When read data
is found in the second level cache, the cycle is called a
cache hit. At the time this cycle is determined to be a
cache hit, it has already been started in the DRAM
controller. This cycle must be aborted by the DRAM
controller.


The BRDYO# signal from the 485Turbocache module
provides a convenient cache hit indication. This signal
is included in the decoder function. When a cache hit
occurs, the DRAM controller aborts the cycle. The
memory chip select signal is not activated and the first
level control logic is reset, aborting the cycle. The
control logic then waits for another cycle to start. This
function is very similar to the back-off function.

Figure 13. Logic Required for Optional 485Turbocache Module

Like the i486 internal cache, the 485Turbocache module
supports non-cacheable memory by decoding. The
SKEN# input is analogous to the i486 CPU's KEN#
input. This function is also supported by the decode
logic. Note that, as with the KEN# signal, SKEN#
must be synchronized to the CPU clock.


Separate cache enable inputs also allow areas of memory
to be noncacheable in the i486 CPU internal cache
yet cacheable in the second level cache. This feature is
convenient for BIOS.


4.5 Control Logic 


Memory control logic generates the signals that control 
the memory devices, multiplexors, and registers de- 
scribed earlier. These control signals can be generated 
in a variety of ways. This example employs a distribut- 
ed state machine. 


Since this example is a prototype, PLDs were the logi- 
cal choice for the controller implementation. Because 
the number of terms in a PLD is limited, the state ma- 
chine implementation must be distributed. Function 
distribution was determined based on this constraint. 
Figure 14 shows a block diagram of the controller, with 
each block made up of one or two PLDs. 

There are two levels of logic in the controller shown in
Figure 14. The first is made up of two PLDs, one which
tracks bus cycles and another which generates the
MRDY# signal. The first level signals to PLDs in the
second level that a cycle has started. The second level is
made up of several PLDs which generate the actual
control signals such as RAS# and CAS#.


Implementing the controller in this manner has two
important advantages. First, more decode time is
allowed. The cycle start signal, CIP#, is used by the
second level logic to sample the decode output. CIP#
is valid in the first T2 of any bus cycle. As a result,
decode does not need to be valid until the end of this T2
bus state. Without this function, the decode output
would have to be valid at the end of every T1 bus state.
In that case, the time allowed for decode at 33 MHz is
very short. With 7-ns PLDs, the time allowed for decode
would be 7 ns. With 5-ns PLDs, this time is still only
9 ns. The advantage of the extra clock period is clear.
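
The text does not show how the 7 ns and 9 ns figures are
derived. One timing budget that reproduces them is
sketched below, purely as an illustration. It assumes a
30 ns clock period at 33 MHz, a CPU address valid delay of
16 ns, and a second-level PLD setup time equal to its
speed grade. Both the 16 ns figure and the setup-time
assumption are not from the text and should be checked
against the data sheets before use.

```python
# Illustrative decode-time budget. The address valid delay and the
# "setup time equals speed grade" rule are assumptions, not values
# taken from the application note.

CLOCK_PERIOD_NS = 30.0          # approximately one clock at 33 MHz
ADDRESS_VALID_DELAY_NS = 16.0   # assumed CPU address valid delay


def decode_time_ns(pld_setup_ns, extra_clocks=0):
    """Time left for address decode before the sampling clock edge."""
    available = (1 + extra_clocks) * CLOCK_PERIOD_NS
    return available - ADDRESS_VALID_DELAY_NS - pld_setup_ns


if __name__ == "__main__":
    for setup in (7.0, 5.0):
        print(
            "setup %.0f ns: end of T1 -> %.0f ns, end of first T2 -> %.0f ns"
            % (setup, decode_time_ns(setup), decode_time_ns(setup, 1))
        )
```

Under these assumptions, sampling decode one clock later
adds a full clock period to the time available for the
decoder.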


The second advantage of the two level approach is simi- 
larly clear. The AQO signal indicates the start of a bus 
cycle to all second-level PLDs. Without this signal 
ADS# would have to be connected to these devices, 
and the resulting load on ADS# would be prohibitive. 

Figure 14. Control Logic Overview

Invalidation within bus cycles is another case that
makes decode design difficult. The AHOLD signal
must be used to implement this function. As its name
implies, AHOLD can be active in any clock. If
AHOLD is active in the first clock (T1) of a bus cycle,
the CPU address lines are tristated in T2. Unless
decode is latched at the beginning of T2, it will not be
valid for the DRAM cycle.


The two-level approach allows decode to be a
transparent function. The decode circuit is shown in
Figure 15. The 85C508 shown here includes a
flow-through latch function. Using this function, the
decode outputs can be latched. The DALE signal is
generated at the beginning of the first T2 of any bus
cycle. This signal activates the latch input of the
85C508. In this manner, decode is held during T2. If
AHOLD is active in T1, the decode outputs may not be
valid in T2. In this case, the cycle must not be started
until the CPU address is redriven. The cycle-tracking
PLD handles this function. By delaying the cycle start
signal, the DRAM cycle is delayed. When AHOLD is
deasserted, the CPU redrives the address. At that time,
CIP# is activated and the cycle begins. If AHOLD is
active in any other clock, the bus cycle can continue
normally.


As the first level of interface with the memory
subsystem, the cycle-tracking PLD handles many other
functions, most of which relate to synchronization.
Refresh synchronization is one example, as is
determining the RAS# precharge duration. AQO# is not
the only signal which supports the AHOLD function. The
address registers are also controlled by this PLD, which
generates the ALD signal to disable the registers during
bus cycles. These and other functions of the control
logic are described completely in Section 5.11.


The PLDs in the next level of logic perform more
specific functions. RAS# and CAS# are generated at this
level, and the PLDs that generate these signals are
devoted solely to this function. The RAS# PLD
generates four RAS# signals, RAS0#-RAS3#. These
signals are identical but drive different DRAM modules to
reduce the load on the RAS# signal.


The RAS# function is designed to support page or
static column mode memory devices. To support these
devices, RAS# must be left active between accesses to
the same row. The RAS# state machine is designed so
that RAS# is deactivated only for a refresh or page miss
cycle. This module generates RAS# for both DRAM
banks.


For the CAS# function, the PLDs are responsible for
implementing burst accesses. During write cycles, the
CAS# signals determine which DRAM bank is written
to. All even doublewords (A2 = 0) are stored in bank 0
while odd doublewords (A2 = 1) are stored in bank 1.
When data is retrieved from memory, cycles can be
overlapped to allow zero wait state burst accesses.

Figure 15. Decode Logic

Address generation is another important consideration
in burst accesses. The addresses for the last three
accesses of a burst must be generated by logic because
the CPU cannot generate these addresses in time to allow
zero-wait-state accesses. The burst address logic shown
in Figure 14 is actually two PLDs which generate the
burst address for bank 0 and bank 1, respectively. The
burst address consists of two signals, the lowest order
DRAM addresses from each PLD.


Because of timing constraints, these signals are
connected directly to the DRAM devices. The burst
address PLD must generate the burst address, provide the
multiplexer function for row and column addresses and
generate the write address. The burst address signals
must, therefore, reflect the value of A13 during miss
cycles. During burst read and write cycles, these
signals reflect A3.


B00MA0 and B01MA0 are the burst address signals for
bank 0. Two identical signals are used to divide
loading. B10MA0 and B11MA0 are the burst address
signals for bank 1. A detailed description of the burst
address function is given in Sections 5.6 and 5.16.


The main function of the DSEL PLD is to generate the
data select signal. As described above, this signal is
used during a burst to switch the data path multiplexer.
It reflects the value of A2 during burst read cycles only
and is one component of the burst address. The DSEL
PLD also generates the RALE signal to control the row
address register described above.


BRDY# terminates all read cycles. MBRDY# is
generated by the MRDY PLD and is separated from the
RDY# signal to facilitate posted writes by preventing
data bus contention. When a write cycle is immediately
followed by a read, the read cycle must be delayed. This
delay is implemented by delaying MBRDY# until the
previous write cycle is complete. MBRDY# is combined
with other burst ready inputs using combinatorial
logic.


WIP# (write in progress) indicates to the MRDY PLD
that a write is taking place, and MBRDY# is not
generated unless this signal is inactive. WIP# tracks the
state of the CAS# state machines.


The WE PLD generates WIP# and other signals
associated with the write function. The MUXEN# signals
control the address multiplexors and activate the write
address path during write cycles. The WE# signals are
used to create the DRAM W inputs and to implement
byte steering. They are combined with latched CPU
byte enables using combinatorial logic. In this way,
DRAM W inputs are not active for unselected bytes.
Data bus contention on unselected bytes is prevented
by controlling the write data register output enables.
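
The combination of the write enable with the latched byte
enables can be sketched as follows. Signals are modeled as
pin levels (True = high); because all of the signals
involved are active low, a byte write enable is low only
when both the bank write enable and the byte enable are
low. The function names are hypothetical, and the sketch
covers only the logic function, not its timing.

```python
# Behavioral sketch of byte steering in the write enable path.
# Arguments hold pin levels: True = high (inactive), False = low (active).


def byte_write_enable(we_n, be_n):
    """Active-low byte write enable for one byte lane."""
    # Low (active) only when WE# and the latched byte enable are both low.
    return we_n or be_n


def bank_write_enables(we_n, byte_enables_n):
    """Return the four byte write enable levels for one bank."""
    return [byte_write_enable(we_n, be_n) for be_n in byte_enables_n]


if __name__ == "__main__":
    # Write to bytes 0 and 3 only (BE0# and BE3# low).
    print(bank_write_enables(False, [False, True, True, False]))
    # No write in progress: every byte write enable stays high.
    print(bank_write_enables(True, [False, False, False, False]))
```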

By implementing byte steering in this way the CAS#
logic is simplified. The CAS# timing path is critical
during burst read cycles, and by placing the byte
steering logic in the write enable path, CAS# timing
restrictions are eased.


The MRDY# signal terminates all write cycles. The 
logic used to generate this signal is unusual because it 
uses the ADS# input and is therefore at the first level. 
This configuration is needed to implement zero wait 
state write cycles. 


MRDY# must be active by the end of the first T2 to
terminate a write cycle and maintain zero wait-state
performance. To meet this restriction, it must be
activated during any write cycle, before decode is
available. Because the CPU RDY# signal must not be
activated during non-memory write cycles, MRDY# is
inhibited by the decode output, MEMCS#, in
combinatorial logic.


5.0 MEMORY SUBSYSTEM FUNCTION 


In this section we will explore the function of the mem- 
ory subsystem in detail. Each of the signals will be de- 
scribed, and bus cycles will be illustrated to show the 
memory logic function. 


The bus cycle description in this section is specific to
this example. Signals such as KEN# and RDY#, for
example, are shown as they are driven by this particular
control logic. The signals are not restricted to the
timing shown here.


A list of the memory control signals follows. 


Memory Interface Signals 


5.1 CPU Interface Signals 


KEN#                KEN# is an input to the processor,
                    indicating whether the next bus cycle is
                    cacheable or not. This signal is a logical
                    AND of the SKEN# and CKEN# signals.

PBRDY#              PBRDY# is the burst ready input to the
                    processor. This is a logical AND of the
                    BRDY# signal from the system and the
                    BRDYO# from the second level cache.

5.2 Data Path Control


DATASEL             DATASEL reflects the value of A2 during
                    burst accesses. It is used to control the
                    data multiplexor for bank 0 and bank 1
                    data paths.

MRDY#               MRDY# enables the write data registers
                    that are used to support write posting and
                    terminates memory write cycles.

MBRDY#              MBRDY# is used for read cycles and
                    enables the output of the data path
                    multiplexor.

WE0#/WE1#           The WE0# and WE1# signals enable the
                    outputs of the data write registers used
                    for write posting. Both signals are active
                    during a write, and CAS# determines the
                    bank to which the data is written.

WBE00#-WBE03#       WBE00#-WBE03# are a combination of write
                    enable and byte enable signals. They
                    control which byte is written into bank 0
                    during a write cycle.

WBE10#-WBE13#       WBE10#-WBE13# control which byte is
                    written into bank 1 during write cycles.


5.3 Address Path Control


ALD                 ALD disables the clock input to the
                    registers that hold the row and column
                    addresses corresponding to the current
                    bus cycle.

MUXEN0#, MUXEN1#    The MUXEN0# and MUXEN1# control signals
                    are inputs to the address multiplexors and
                    are used in selecting the read or write
                    paths to the respective banks.

RALE#               RALE# enables the row address latch,
                    allowing a new row address to be latched
                    for successive bus cycles.

DALE#               DALE# activates the latch inputs of the
                    decode logic in the first T2 of a bus cycle
                    and holds the decode during the bus cycle.

B00MA0/B01MA0       B00MA0 and B01MA0 are the burst address
                    signals for bank 0. They correspond to the
                    value of A3 during burst read cycles.

B10MA0/B11MA0       B10MA0 and B11MA0 are the burst address
                    signals for bank 1. They correspond to the
                    value of A3 during burst read cycles.


5.4 DRAM Interface


HIT#                HIT# is active if the row address for the
                    current memory cycle is the same as that
                    of the previous memory cycle.

WIP#                WIP# indicates that a write cycle is in
                    progress and that a read to the DRAM must
                    be delayed until WIP# becomes inactive.

CIP#                CIP# indicates that a memory cycle is in
                    progress. If the current cycle is not to
                    DRAM, CIP# is deactivated; otherwise it
                    remains active until the end of the bus
                    cycle.

RAS0#-RAS3#         RAS0#-RAS3# go active for a valid row
                    address. They remain active between
                    accesses to the same row and are
                    deactivated only for page miss and
                    refresh cycles.

DRAS#               DRAS# is the delayed RAS# signal, used to
                    accommodate the row address hold time
                    requirement.

RFRQ                RFRQ indicates that a refresh of the DRAM
                    is required. This signal is activated
                    every 15.6 us.

RFACK               RFACK is asserted in response to RFRQ and
                    indicates that the DRAM controller is
                    ready to perform the refresh cycle. It is
                    active during idle cycles or after the
                    current cycle is complete.

PCHG                PCHG determines the timing of refresh
                    cycles and the RAS# precharge count.

CAS0#/CAS1#         The CAS0# and CAS1# signals are active
                    when a valid column address is present on
                    the bus, and control the bank to which
                    data is written.

MEMCS#              MEMCS# is active when a read or a write is
                    performed to the DRAM. It is the
                    synchronized output of the address
                    decoder.

5.5 Controller Signals


CT                  CT indicates that a new cycle started
                    while a cycle was in progress or while a
                    refresh cycle was taking place. It is
                    deactivated when the pending cycle is
                    recognized.

SKEN#               SKEN# indicates if any of the caches is
                    enabled. It is an input to the second
                    level cache and is similar to the KEN#
                    signal input to the processor.

CKEN#               CKEN# is the output of the second level
                    cache. It is activated twice for a valid
                    line fill: first to enable a 485Turbocache
                    cache line fill and a second time to
                    validate it.

LA2, LA313          LA2 and LA313 are latched versions of
                    address lines A2 and A3/A13. LA313 is the
                    lowest order DRAM address line. The
                    multiplexor output reflects A3 when RAS#
                    is low and A13 when RAS# is high.

M#                  M# indicates the occurrence of a write
                    miss.

BRDYO#              BRDYO# is a burst ready signal driven by
                    the second level cache. It is activated
                    when a read hit occurs in this cache.


5.6 Read Cycles 


Figure 16 shows a burst read cycle. At the
start of the bus cycle, RAS# is inactive. This case is a
rare occurrence because RAS# is normally active. Unless
a cycle is the first bus cycle after a reset or refresh
cycle, RAS# will be active in T1.


It is useful to examine this case because it demonstrates 
a complete DRAM cycle. The basic function of most of 
the control logic is illustrated. 


The cycle begins with the activation of ADS#. The
controller samples this signal and activates both ALD
and CIP#. The CPU address registers are disabled by
ALD. Therefore, the previously latched address is held
throughout the bus cycle. The latched address is valid
in the first T2 of the bus cycle.

Figure 16. Burst Read Cycle

The row address comparison is made with this address.
As a result, the HIT# signal is not valid until the rising
edge of the second T2. At this rising clock edge, the
CIP#, MEMCS# and HIT# signals are sampled. If
MEMCS# is sampled active, the RAS# signal is
activated.


The delay line holds the DRAS# signal high for 20 ns
after RAS# is activated. In this way the row address is
maintained to meet tRAH, the row address hold time.
When DRAS# is activated, the address multiplexers
switch to the column address path. The MUXEN#
signals are not active, and the read path is selected.


In the third T2 of the bus cycle CAS# is asserted. This
cycle begins with A2 low and the first access is to bank
0. Due to the access time of the DRAM, two clocks are
required to retrieve data from memory. MBRDY# is
asserted in the fourth T2 of the bus cycle, and this
action completes the first access of the burst read. The
access is completed in five clocks. The minimum time
for this access is two clocks, indicating that three
wait states were added to the first cycle.


The timing diagram reveals two important points about
burst cycle implementation. First, DRAM access
requires two clocks. Second, the burst address from the
CPU is not available until the clock after MBRDY# is
sampled active. These circumstances make implementing
zero-wait-state burst cycles difficult. The DRAM
bank interleaving alleviates this difficulty.
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The first advantage of interleaving is revealed in the 
second and third T2 states. Access to both the first and 
second memory doublewords can be made simulta- 
neously. This function requires that the burst address 
be predicted. As mentioned above, the burst address 
from the CPU is not available until several clocks later. 
The burst address for both the first and second accesses 
is generated in the second T2. Therefore, CAS# for 
both banks can be asserted in the next T2 state. 


The second advantage of interleaving is seen in the fifth
T2 of the burst cycle, in which DATASEL switches the
data multiplexer. The second doubleword is driven on
the CPU data bus. In this CLK, the burst address for
the third access of the cycle is generated. CAS00# and
CAS01# are also deasserted to begin the third access.
Note that this access is started before the second access
is completed. The cycle overlap shown allows new data
to be driven on the CPU data bus every clock. This way
zero-wait-state access is achieved.
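To illustrate why the burst address can be predicted and why interleaving on A2 helps, here is a minimal Python sketch. It assumes the standard i486 burst order (taken from the i486 data sheet, not from this section) and the bank selection by A2 described above. The function names are illustrative only.

```python
def burst_sequence(start_address: int) -> list[int]:
    """Return the four doubleword addresses of a 16-byte i486 burst, in order."""
    line_base = start_address & ~0xF      # 16-byte aligned line address
    first_offset = start_address & 0xC    # A3:A2 of the first access
    # Relative to the first access, the burst toggles A2, then A3, then both.
    return [line_base | (first_offset ^ step) for step in (0x0, 0x4, 0x8, 0xC)]


def bank_of(address: int) -> int:
    """Bank select: A2 low addresses bank 0, A2 high addresses bank 1."""
    return (address >> 2) & 1


# A burst that starts with A2 low, as in Figure 16.
sequence = burst_sequence(0x100)
banks = [bank_of(address) for address in sequence]
```

Because A2 toggles on every access in the sequence, consecutive accesses always fall in opposite banks, which is what allows one bank to start its next access while the other is still driving data.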


Timing is even more critical during page hit cycles. Fig. 
17 shows the timing of this cycle. Because of the func- 
tion of RAS#, this cycle is more common than the 
cycle discussed above. The row address is the same as 
in the previous cycle. Therefore, the RAS# signal is 
left active. 
Figure 17. Burst Read DRAM Page Hit
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When a burst read starts with RAS# active, fewer 
clocks are required to complete the first access. This
reduction improves performance. As a result, however, 
some timings become more critical. One of these is the 
time allowed to generate the burst address. 


The CAS# signals are asserted in the second T2 of the
bus cycle. MBRDY# is also asserted at this time. To
meet the address access time of the DRAMs, the burst
address must be generated in the second T2. The rest of
the read column address must also be available at this
time. Two logic functions are needed to meet this
timing requirement. First, read and write address paths
must be separate to allow the read address to be
available in the first T2. Second, the burst address path
logic must latch the CPU A3 signal directly. In this way,
the logic can generate the necessary address in time. The
burst address state machine must track the state of A3
at the beginning of every cycle. The state machine
function is described in Section 5.11.


The timing of KEN# must also be considered in this
example. KEN# must be valid at the beginning of the
second T2 of the cycle. If it is not, the cycle will not be
cached, and a 16-byte access cannot be generated. If
KEN# is active, a 16-byte burst access will be
generated, and the cycle will be cached as long as KEN#
is active in the second-to-last T2.
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At first glance this timing may not appear critical. 
KEN# is a decode function, and decode is valid at the
clock edge called for. The KEN# input to the CPU 
must be synchronized to clock, however. Since decode 
is not synchronous, a two-clock synchronizer delay is 
required, and this delay is the reason that KEN# is 
normally active in this example. 


From the time CAS# is activated, this cycle is exactly
the same as in the previously described burst cycle. It is
terminated when BLAST# is asserted, and MBRDY#
is deasserted when BLAST# is sampled active.


5.7 Write Cycles 


As described in Section 4.1, a posted or delayed write
function is employed in this example to reduce write
cycle latency. Latency is reduced since write cycles are
overlapped with other cycles, including other write
cycles or reads from the second-level cache. Write cycles
normally make up 70 percent of all cycles, and
overlapping can increase performance accordingly.


Figure 18 illustrates the posted write implementation. 
In this example cycles begin when RAS# is inactive. 
As with read cycles, this case is rare in practice. 
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Figure 18. Basic Write Cycle 
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The cycle begins like a read. The CPU drives ADS# 
active, and the decode is sampled. RAS# is activated if
the cycle is in DRAM space. At the rising edge of the
second T2, however, the latched version of W/R#
(LW/R#) is sampled active. In response, the control
logic begins several write cycle functions at this clock
edge.


The CAS# state machine for the appropriate bank
enters the write sequence. The MUXEN# and WE#
signals are asserted. MRDY# is also asserted, terminating
the cycle at the CPU. The MUXEN# signals activate
the write address path. This address is not present at
the multiplexer outputs, however, until the next clock,
at which the write pipeline register latches the write
address.


The write data is latched at the same clock edge. The
write data registers are enabled by MRDY#, which
simultaneously terminates the CPU cycle. Note that
data is latched in both the bank 0 and bank 1 registers.


The WE0# and WE1# signals are also both active.
The CAS# signals determine which bank is written to.
These signals are asserted within two clocks after
MRDY#. This action completes the write cycle. Note
that, while five clocks are required to complete the
cycle, the CPU cycle is terminated in three CLKs. The
wait state is only required if RAS# is inactive at the
start of the cycle.
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In Figure 18 the next bus cycle starts immediately after 
RDY# is sampled. In this case, CAS# is activated 
during the second clock of the next bus cycle. This 
overlap of cycles is similar to the pipelining feature
used by many processors except that the i486 processor 
bus is not involved in the posting function. All logic for 
this function is implemented in the memory controller. 


Figure 19 is a more typical i486 processor bus sequence
which clearly illustrates the advantages of the posting
technique. Four write cycles have occurred together
without idle bus clocks occurring between cycles. Since
all writes access the same DRAM row, RAS# is active
throughout the sequence.


Without the extra clock to activate RAS#, MRDY#
can be asserted in the clock after ADS# is asserted.
These cycles, therefore, have no wait-states. As before,
the write cycle is not complete when MRDY# is
asserted to terminate the CPU bus cycle, but instead
when CAS# is asserted two clocks after MRDY#.


At zero wait-states, each write cycle still requires four
clock cycles. The last two clocks of each write cycle
overlap with the next cycle. The net effect on the CPU
bus is the same as a string of two-clock write cycles, as
illustrated in Figure 19.
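The benefit of the overlap can be put in numbers. The sketch below is a simple model built only from the figures in the text (four clocks per write, two of which overlap the following cycle). The constant and function names are illustrative.

```python
WRITE_CLOCKS = 4     # clocks for one complete zero-wait-state posted write
OVERLAP_CLOCKS = 2   # trailing clocks that overlap the following cycle


def back_to_back_write_clocks(count: int) -> int:
    """Total clocks for `count` consecutive page-hit writes with posting."""
    if count <= 0:
        return 0
    cpu_clocks = count * (WRITE_CLOCKS - OVERLAP_CLOCKS)
    return cpu_clocks + OVERLAP_CLOCKS   # the last write still finishes in DRAM


def unposted_write_clocks(count: int) -> int:
    """The same sequence if every write had to finish before the next started."""
    return count * WRITE_CLOCKS


# The four back-to-back writes of Figure 19.
posted = back_to_back_write_clocks(4)
unposted = unposted_write_clocks(4)
```

Under this model the CPU bus is occupied for only two clocks per write, and only the final write exposes its two trailing DRAM clocks.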


The first write in this figure is to bank 0. The falling
edge of CAS0# clocks the data into the bank 0
DRAM. This edge is denoted by W1 in the diagram.
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Figure 19. Back to Back Write Cycles 
CAS0# is asserted in the same clock that MRDY#
terminates the second write (W2), which accesses bank
1. CAS1# is activated in the same clock as MRDY#
for the third write (W3).


The second and third writes happen to be to the same 
DRAM bank. As we see, no timing modification is re- 
quired in this case. Write cycles can be completed with 
zero wait states in either case. This is important since 
writes often occur in sequence on the i486 bus, but not 
necessarily to sequential addresses. Write posting sup- 
ports zero wait-state write cycles to sequential and non- 
sequential addresses. 


This fact is also important if the design is to be
modified. For example, while interleaved DRAMs may not
be required in systems with a permanent second-level
cache, the write posting technique may still be used in
the system. The benefits of this technique still apply
since write cycles may still be overlapped as described.
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5.8 Consecutive Bus Cycles 


The DRAM control logic is optimized for write cycles,
as warranted by the i486 processor's bus
characteristics. Over 70 percent of all cycles are writes. By
employing the posted write technique, system performance
is increased.


The posted write technique poses some special prob- 
lems, however. Page miss, refresh and consecutive 
write-read cycles require special consideration. We will 
begin by discussing the consecutive write-read case. 
Page miss and refresh cycles will be discussed in sec- 
tions 5.9 and 5.10. 


When a read cycle immediately follows a write, the
read cycle must be delayed as illustrated in Figure 20.
The read cycle is delayed to allow the write to
complete. Only read cycles to DRAM (i.e., cache misses)
need be delayed. Cache hits and write cycles overlap
easily because the cache is on the CPU side of the
DRAM controller.
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Figure 20. Consecutive Write-Read Cycle 
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Write cycles cannot overlap DRAM read cycles, how- 
ever, primarily because of data bus contention. The 
DRAMs used here have common data I/O pins. In this 
case read and write data paths cannot be active at the 
same time.


To prevent data bus contention, the first data access of 
the read is delayed. In Figure 20 the first read access is 
to the same bank as the write. In addition, the read 
cycle accesses the same DRAM row. Two functions are 
required to ensure that the write is completed. First, the 
write address must be held until CAS# is asserted.
Second, the data mux outputs must not be enabled until
the CPU tristates the bus.


The first function is accomplished by the MUXEN#
signals. The MUXEN# state machine tracks the
CAS# function for the appropriate bank. When the
write for that bank is complete, MUXEN# is
deactivated. In this way, the read address path is not
enabled until the CLK after CAS# becomes active.
Normally, the read address would be valid in the first T2
of the read cycle; however, it must be delayed one clock
to allow the write to complete. Note that if one or more
idle CLKs intervene between these cycles, no delay
occurs.


The second function is accomplished with the WIP#
signal, which is active until all write cycles are
complete. A read cycle to either bank will be delayed if it
immediately follows a write. The first access of the read
is delayed by MBRDY#, which is not asserted until
the WIP# signal is deasserted.


WIP# is deasserted once all pending writes are
complete. In Figure 20 the read cycle is delayed 3 CLKs by
this signal; in other words, three additional wait-states
are added. If a read does not occur immediately after a
write, the number of wait-states added will decrease by
the number of idle CLKs between cycles. For example,
if ADS# for the read is asserted three clocks after
MRDY# for the write, MBRDY# will not be delayed.
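The relationship between idle clocks and the added delay can be sketched as follows. This is a simplified model based only on the three-clock figure quoted for Figure 20. The names are illustrative, and the function counts idle CLKs between the two cycles rather than clocks relative to MRDY#.

```python
MAX_WIP_DELAY = 3   # wait-states added when the read follows the write immediately


def read_after_write_wait_states(idle_clocks: int) -> int:
    """Extra wait-states on a DRAM read that follows a posted write."""
    if idle_clocks < 0:
        raise ValueError("idle_clocks cannot be negative")
    # Each idle clock hides one clock of the pending write.
    return max(0, MAX_WIP_DELAY - idle_clocks)


# Delay for zero through four idle clocks between the write and the read.
delays = [read_after_write_wait_states(n) for n in range(5)]
```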


5.9 Page Miss Cycles 


As described previously, page miss cycles occur when
the CPU generates a cycle which changes the DRAM
row address. The RAS# signal must be deasserted to
change the row address in the DRAMs. Any time
RAS# is deasserted, it must remain high for the
precharge time (tRP). A delay is added to every page miss
cycle to satisfy this requirement.


For read cycles this function simply requires extra wait
states, as illustrated in Figure 21.
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Figure 21. DRAM Page Miss-Read Cycle 
The bus cycle starts with RAS# low or active. The row
address generated by the CPU differs from that of the
previous cycle, and the row address comparator
deasserts HIT#. This signal is valid in the first T2. HIT#
is sampled at the RAS# PLD at the rising edge of the
second T2. In response, RAS# is immediately
deasserted and held inactive for two clocks. This time
satisfies the RAS# precharge requirement.


Four wait states are added to process the miss cycle.
These clocks are added to every read cycle which
accesses a new DRAM row. The delay is accomplished,
again, with the MBRDY# signal. MBRDY# will not
be asserted when RAS# is inactive. Once RAS# is
sampled active, MBRDY# is asserted. From here, the
cycle proceeds as described in section 5.7.


Write miss cycles are more complex than read miss
cycles, due mainly to the write posting technique. The
added complexity results in lower latency than in a
non-posted memory system, however. Figure 22
illustrates how this improvement is achieved.
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The write cycle in Figure 22 also begins with RAS# 
active. The HIT# signal is deasserted in the first T2 at
the same time that MRDY# is asserted. MRDY#
could be inhibited at this point to prevent write cycle
termination. The wait-states added to meet RAS#
precharge time would then be added to this cycle. Five
wait states are required to meet the precharge time.


The average number of write cycle clocks can be
reduced, however, if another method is used. MRDY#
can be allowed to terminate the cycle. In this case, any
necessary wait-states will be added to the next cycle.


This method improves the average in two ways. First,
some write miss cycles will not require wait-states.
This is the case when the next cycle occurs four or
more clocks after a write miss. In addition, wait states
will be reduced when the next cycle occurs in two or
three clocks. Second, only three wait-states (rather than
five) are required to complete the next cycle when it
follows immediately, as illustrated in Figure 22.
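The trade-off described above can be modelled with a short Python sketch. The text gives only three data points (three wait-states when the next cycle follows immediately, fewer after two or three clocks, none after four or more). The linear interpolation between them, the convention that "immediately" means one clock after the miss, and all names are assumptions made for illustration.

```python
INHIBITED_WAIT_STATES = 5   # cost if MRDY# were held off until precharge is done
IMMEDIATE_WAIT_STATES = 3   # cost when the next cycle follows the miss immediately


def next_cycle_wait_states(clocks_after_miss: int) -> int:
    """Wait-states charged to the cycle that follows a posted write miss.

    clocks_after_miss = 1 means the next cycle starts immediately (assumed
    convention); each further clock hides one more clock of precharge.
    """
    if clocks_after_miss < 1:
        raise ValueError("the next cycle cannot start before the miss ends")
    return max(0, IMMEDIATE_WAIT_STATES - (clocks_after_miss - 1))


# Penalty seen by the next cycle when it starts 1 to 5 clocks after the miss.
penalties = [next_cycle_wait_states(n) for n in range(1, 6)]
```

In every case the penalty stays below the five wait-states that would be needed if the write miss itself were stalled.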
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Figure 22. DRAM Page Miss-Write Cycle 
The first cycle in this figure is a page miss. It is
terminated at the CPU without wait-states. Because HIT# is
not active in the first T2, RAS# is deasserted. At this
point, additional clocks are added to perform the miss
function. Part of the time required for RAS#
precharge is overlapped with the next cycle. The two-clock
overlap reduces the number of wait-states required in
the next cycle. Therefore, the average write cycle
latency is reduced.


5.10 Refresh Cycles 


The CAS#-before-RAS# refresh function is used in
this example. This function uses internal counters in
the DRAM devices to generate the refresh address.
When the CAS# input is activated prior to RAS#, the
internal counter is incremented. The output of the
counter is then used as the address of the row to be
refreshed.
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Each refresh cycle refreshes one row of the DRAM 
array. The refresh cycles are distributed such that one
occurs every 15.6 µs, with every row being refreshed in
8 ms. Refresh cycles are initiated by the RFRQ signal.
This signal is activated every 15.6 µs by a counter.
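The two refresh figures are related by the number of DRAM rows. The short Python check below assumes 512 rows, a value that is not stated in the text but is implied by dividing 8 ms by 15.6 µs. The names are illustrative.

```python
REFRESH_PERIOD_S = 8e-3   # every row must be refreshed within 8 ms (from the text)
ROWS = 512                # assumed row count; not stated in the text


def distributed_refresh_interval(period_s: float, rows: int) -> float:
    """Interval between single-row refresh cycles for distributed refresh."""
    return period_s / rows


# Interval in microseconds; this comes out at roughly 15.6 us.
interval_us = distributed_refresh_interval(REFRESH_PERIOD_S, ROWS) * 1e6
```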


RFACK is asserted in response to RFRQ. This signal
indicates that the DRAM controller is ready to
perform the refresh cycle. It also signals the counter circuit
that RFRQ can be deasserted.


The function of RFRQ and RFACK is very similar to 
that of the CPU’s HOLD and HLDA signals. RFRQ is 
sampled at the end of each cycle and during idle cycles. 
RFACK is activated in the clock after RFRQ is sam- 
pled, except immediately after write cycles. 


Again, the posted write function must complete before
the refresh cycle begins. If WIP# is active when RFRQ
is sampled, RFACK will not be immediately asserted.
RFACK will be asserted after WIP# is deactivated, as
illustrated in Figure 23.
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Figure 23. Refresh Timing Concurrent with Write 
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Another cycle can start between RFRQ and RFACK. 
The cycle start PLD tracks this case. CIP# will not
be asserted for any cycle that starts during this interval.
Once the refresh cycle is complete, this cycle can be
started.


6.0 CONTROLLER IMPLEMENTATION 


The functions described in the previous section are
generated by the control logic. The controller, as outlined
in Section 4.0, is made up of several PLDs. These
devices generate the control signals described in Section 5.0.
The function of the logic is determined by the state
machine definition. These state machines are
distributed in the different PLDs of the controller.


In this section, we will explore the implementation of
the control logic. The discussion will focus on the state
machine definition. Certain conventions are followed
throughout the discussion. These conventions are based
on the state machine compiler used to generate the
PLD equations. This compiler uses the exclamation
point (!) to indicate the low or "0" condition of a signal.
It uses the number symbol (#) to indicate that the
signal is active low. For example, !ADS# indicates that
the ADS signal is both low and active. The symbol
!ALD means that the ALD signal is low and therefore
not active. These symbols are used to indicate state
transitions as shown in Figure 24. The state transition
in Figure 24 depends on three signals: ADS#, ALD,
and RAS#. The equation indicates that if both ADS#
and ALD are active, or if RAS# is not active, at the
next clock edge, the transition from S0 to S1 takes
place. In the transition between S0 and S1, the Y#
signal is activated. The definition of states indicates
which outputs are changed in the transition. These
conventions are used to describe the control state
machines in the next section.
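The notation can be made concrete with a small Python sketch that evaluates the Figure 24 condition on pin levels. The equation form !ADS#*ALD+RAS# is an assumption reconstructed from the prose description (with * read as AND and + as OR). The function name is illustrative.

```python
def s0_to_s1(ads_n: bool, ald: bool, ras_n: bool) -> bool:
    """Evaluate !ADS# * ALD + RAS# on pin levels (True = high, False = low).

    ADS# and RAS# are active low, so a low level means "active".
    ALD is active high, so a high level means "active".
    """
    return (not ads_n and ald) or ras_n


# ADS# low (active) and ALD high (active), RAS# low (active): transition taken.
taken = s0_to_s1(ads_n=False, ald=True, ras_n=False)
# ADS# high (inactive), RAS# low (active): the machine stays in S0.
held = s0_to_s1(ads_n=True, ald=True, ras_n=False)
```

Note that the "!" prefix always refers to the electrical level, while the "#" suffix is only part of the signal name.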
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Figure 24. State Transition Example 
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6.1 Cycle Tracking Logic 


The cycle tracking logic is contained in one PLD. The 
five state machines implemented in this PLD start and 
end DRAM cycles, control refresh timing and control 
the address registers. These state machines, along with 
the MRDY# state machine, comprise the first level of
control logic. All other control state machines depend 
on this first level to generate signals at the proper time. 


The signals generated by this PLD are the following: 


CIP# - Cycle in Progress
ALD - Address Latch Disable
CT - Cycle Track
RFACK - Refresh Acknowledge
PCHG - RAS Precharge Count

[Figure 25 transition labels: !ADS#+CT; MEMCS#+!BRDY#+!BLAST#+!MRDY#+!BOFF#]


The primary cycle tracking state machine is shown in
Figure 25. This state machine generates the CIP# and
M# signals. CIP# indicates that the CPU has started a
cycle. When it is active, the rest of the logic samples the
CPU control and MEMCS# signals. If the current
cycle is not to DRAM, it will be ignored and CIP# will
be deactivated.


[Figure 25 transition labels: !BOFF#*!PCHG+!CT; !RAS#*HIT#*!MRDY#]
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Figure 25. Cycle in Progress State Diagram 
This function is defined by the S0 and S1 states in
Figure 25. As shown, CIP# is activated when either
ADS# or CT is sampled active. If the cycle is not to a
DRAM address, the MEMCS# signal will not be
active in the next clock. In this case, CIP# is deactivated
to wait for the next ADS#. If the cycle is to DRAM,
CIP# stays active until the end of the bus cycle. The
bus cycle is terminated by one of three circumstances.
All write cycles are terminated with the MRDY#
signal. Read cycles are terminated by BRDY# and by
BLAST#. The cycle can be aborted by BOFF#. Any
of these three events causes CIP# to be deactivated (S1
to S0).


Two special cases are also handled by this state
machine. When AHOLD is active in the same clock as
ADS#, MEMCS# is not valid. In this case, the CIP#
signal is not activated until AHOLD is deasserted. The
state machine remains in S0 when AHOLD is active.


The second case is a write miss cycle. During a write
miss, CIP# must be active for the cycle to complete.
CIP# is active in this case after MRDY# is returned
to the CPU. Cycles that start during the time CIP# is
active must be tracked by the CT state machine. The
M# signal indicates to the CT state machine that the
cycles must be tracked.


The state in which M# and CIP# are both active is S2.
This state is entered when MRDY# and RAS# are
active and HIT# is inactive. By using MRDY# to
qualify this transition, S2 is entered only during write
cycles. Therefore, M# is only activated during write
miss cycles. Note that any cycle will be recognized by
the CT state machine when M# is active.
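The CIP# state machine described above can be summarized in a behavioral Python sketch. This is not the PLD source. It follows the prose description, treats every input as a boolean meaning "asserted" regardless of pin polarity, and uses a hypothetical miss_done input in place of the S2 exit condition, which the text does not spell out.

```python
from dataclasses import dataclass
from enum import Enum


class State(Enum):
    S0 = 0   # idle: CIP# and M# inactive
    S1 = 1   # cycle in progress: CIP# active
    S2 = 2   # write miss: CIP# and M# both active


@dataclass
class Inputs:
    """Sampled inputs; True means the signal is asserted, whatever its polarity."""
    ads: bool = False
    ct: bool = False
    ahold: bool = False
    memcs: bool = False
    mrdy: bool = False
    brdy: bool = False
    blast: bool = False
    boff: bool = False
    ras: bool = False
    hit: bool = False
    miss_done: bool = False   # hypothetical stand-in for the S2 exit term


def next_state(state: State, i: Inputs) -> State:
    if state is State.S0:
        # Start on ADS# or a tracked cycle, but wait while AHOLD is active.
        return State.S1 if (i.ads or i.ct) and not i.ahold else State.S0
    if state is State.S1:
        if not i.memcs:
            return State.S0                  # not a DRAM cycle: ignore it
        if i.mrdy and i.ras and not i.hit:
            return State.S2                  # posted write miss
        if i.mrdy or (i.brdy and i.blast) or i.boff:
            return State.S0                  # cycle terminated or aborted
        return State.S1
    # State.S2: hold CIP# and M# until the miss sequence finishes.
    return State.S0 if i.miss_done else State.S2


def outputs(state: State) -> tuple[bool, bool]:
    """Return (CIP# asserted, M# asserted) for a state."""
    return state is not State.S0, state is State.S2
```

Checking the write-miss condition before the ordinary MRDY# termination is a modelling choice that keeps a posted write miss from being mistaken for a completed write hit.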


[Figure 26 transition label: !CIP#*M#]
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Figure 26. Cycle Tracking State Machine 
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The CT state machine is shown in Figure 26. This state 
machine tracks cycles that start while the CIP# state 
machine is busy. It tracks CPU cycles that start during 
refresh cycles as well as in the two cases mentioned
above.


This state machine tracks one cycle. Any cycle that
starts while CIP# is busy is not terminated
immediately. The MRDY# and MBRDY# signals are delayed
until the previous cycle is finished. Therefore, anytime
CT is active, there is only one cycle pending.


CT is deactivated when the pending cycle is recognized
by the CIP# state machine. This event is indicated by
CIP# active and M# inactive. When this event occurs,
the CT state machine transitions to S0, deactivating CT.
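A minimal Python sketch of the CT behavior follows. The clear condition (CIP# active and M# inactive) comes from the text. The set condition shown here, a cycle starting while M# or RFACK is asserted, is an assumption drawn from the surrounding description, and the AHOLD case is left out for brevity. All names are illustrative.

```python
def next_ct(ct: bool, ads: bool, cip: bool, m: bool, rfack: bool) -> bool:
    """Next value of CT; every argument is True when the signal is asserted."""
    if ct:
        # Clear once the CIP# machine has picked up the pending cycle.
        return not (cip and not m)
    # Set when a cycle starts while the CIP# machine cannot accept it.
    return ads and (m or rfack)


# A cycle starts during a write miss (M# asserted): CT is set.
pending = next_ct(ct=False, ads=True, cip=True, m=True, rfack=False)
# Later, CIP# is asserted with M# deasserted: the cycle is recognized.
cleared = next_ct(ct=True, ads=False, cip=True, m=False, rfack=False)
```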


The ALD signal is also active only during DRAM
cycles. Therefore, its state machine is very similar to that
of CIP#. As with CIP#, ALD is asserted when
ADS# is sampled active. If the cycle is not to a
DRAM address, ALD is deasserted. When a DRAM
cycle is terminated, ALD is also deasserted. The
S0-to-S1 transition is quite similar to that of CIP#.


The difference between the two state machines is
revealed during write miss cycles. The S1-to-S2 transition
is made if a write miss occurs. ALD must be held active
during a write miss until RAS# is active. In this way
the row address is held even if another cycle occurs.
The combination of CIP# being active while PCHG is
inactive indicates that RAS# will be active in this
clock. ALD must be deactivated in this clock to allow
the next address to be latched. ALD is re-activated if
another cycle has started during the write miss process.
CIP# and MEMCS# are sampled during S0 for this
purpose.

Figure 27. Precharge State Machine


The PCHG state machine provides two functions. It
determines the time RAS# is inactive during a miss or
refresh cycle, and it determines the timing of refresh
cycles. Figure 27 shows the state transitions of the
PCHG state machine. Because the timing of this signal
is not obvious, Figure 28 has been included. It shows a
refresh cycle which occurs following a write cycle.


After RAS# is active, the PCHG signal is activated.
State S1 is then maintained until RAS# is deactivated.
RAS# is only deactivated during a miss or refresh
cycle or, of course, if RESET is asserted. During a miss
cycle the transition to S0 is made, deactivating PCHG.
RAS# is then activated, resulting in two CLKs
of RAS# precharge time.


States S2 and S3 define the timing of refresh cycles. The
transition to this sequence is made when RAS# is
sampled inactive while EP is active. EP indicates that the
RAS# state machine has entered the refresh sequence.


RFACK# initiates the refresh sequence. It indicates
that the control logic is ready to accept a refresh
request. The RFRQ signal is sampled at the end of a
DRAM cycle or during idle clocks. Note that RFRQ
cannot be recognized during a write miss.


RFACK# is deactivated after RAS# is deactivated at
the beginning of the refresh sequence (see Figure 27
and Figure 28).
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Figure 28. Refresh State-Timing Example 
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6.2 RAS# Logic 


The RAS# logic for both memory banks occupies one
PLD. Four RAS# signals are generated: RAS0#-RAS3#.
These signals are generated to divide loading.
Their timing is identical. The state machine for RAS# is
relatively simple and is shown in Figure 29.


States S0 and S1 are used to implement the RAS#
function for normal cycles. After RESET, the state machine
waits for the first bus cycle. The first bus cycle is
signaled by the CIP# signal. When CIP#, MEMCS#
and PCHG are sampled active, RAS# is asserted.
RAS# stays active until a miss or refresh cycle occurs.


A miss cycle is indicated when the HIT# signal is
driven inactive. It is qualified by CIP# and MEMCS#
being active. In this way, RAS# is only deactivated
during DRAM cycles.
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Once RAS# is deasserted during a miss cycle, it stays 
high until PCHG is sampled active. This function
implements the RAS# precharge time. CIP# and
MEMCS# will still be active during read miss cycles.
Therefore, RAS# will be asserted in the next clock.
For write miss cycles the WIP# signal must be used to
restart RAS#. With a write miss, a non-DRAM cycle
can occur before RAS# is asserted. WIP# is the only
valid indication that a DRAM cycle has occurred in
this case. WIP# is combined with MEMCS# to create
the CSWIP# term, which indicates a valid RAS#
cycle.


When a refresh cycle occurs, the RAS# state machine
transitions to S2. S2 and S3 are devoted to the refresh
function. When RFACK is sampled active, the
transition occurs. The refresh sequence shown in Figure 28
illustrates the function of these two states. Note that
after a refresh cycle, RAS# is left inactive. The
transition from S0 to S4 allows for refresh cycles that start
when RAS# is inactive.
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Figure 29. RAS State Machine 
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6.3 CAS# Logic 


Two separate PLDs implement the CAS# function.
These PLDs generate the CAS# signals for bank 0 and
bank 1, respectively. The state machines which
generate these signals are separate and independent. Each
generates two CAS# signals: CAS00# and CAS01#
for bank 0, and CAS10# and CAS11# for bank 1.
These signals drive separate DRAM modules due to
drive requirements.
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Figure 30 shows the state diagram for the bank 0 
CAS# function. The states on the left side of the
diagram implement the write function. The states on the
right implement the read function. As with RAS#, the
state machine waits until CIP# indicates that a cycle
has started. When CIP# is active, the state of the
latched version of W/R# determines which sequence is
started.
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Figure 30. CAS State Machine 
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If the cycle is a read, S4 is entered. If the cycle is a 
write, LA2 is sampled to determine if the cycle is to 
bank 0. If LA2 is low, S1 is entered. Note that this 
function is the same for the bank 1 state machine. The 
only difference is the state of LA2, which starts the 
write sequence. 


During a write cycle, CAS# is held inactive until the 
clock after RDY# is asserted. The state machine also 
waits in S1 during a write miss cycle. CAS# is asserted
during S2. In this state, several events can occur. First, 
the CPU may not start another bus cycle. Second, it 
may start a bus cycle other than a DRAM cycle. Third, 
it may initiate a read cycle, and fourth, it may begin a 
write cycle to bank 1. If any of these events occur, S1 is 
entered. If another write cycle starts to the same bank, 
however, S3 is entered. 


The case of sequential writes to the same bank involves
S2 and S3 only. An unlimited number of write cycles
can occur in the same bank. If the DRAM row is the
same, they will occur without wait-states. If a write miss
occurs, RAS# will be deasserted, and the transition from
S3 to S1 takes place.


[Figure 30 transition labels: AQ0#+!AQ0#*(MEMCS#+!LW/R)+!AQ0#*LW/R*LA2; !AQ0#*!MEMCS#*LA2*LW/R*!RFACK; !RAS#*RDY#; !AQ0#*LA2*LW/R]
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During read cycles, the CAS# signals for bank 0 and 
bank 1 are activated at the same time. Therefore, the 
state machines enter S4 at the same clock. At this 
point, however, the state of LA2 determines which 
state machine enters S5. In S5, CAS# is deasserted to 
prepare that bank for the next access. If S6 is entered, 
the data from that bank has not yet been accessed. 
CAS# must be held active, in this case, until the data is 
sampled by the CPU. From S6, the next transition will 
be to S5 to continue the cycle, or S0 to terminate the
cycle. If this bank was accessed first, the cycle will ter- 
minate from this state. 


The read sequence is much simpler if static column 
mode DRAMs are used. The state sequence for static 
column mode is shown in Figure 31. The write se- 
quence in this diagram is exactly the same as for the 
page mode CAS# control logic. The read function, 
however, requires only two states. From S0, the
transition is made to S5 any time that a DRAM read cycle
starts. Note that LA2 is not used to qualify this
transition. Therefore, the CAS# signals for bank 0 and
bank 1 are active at the same time.


[Figure 31 transition labels: RESET; !AQ0#*!MEMCS#*!HIT#*LW/R*!RAS#*!RFACK+RFACK*RAS#; !BRDY#*!BLAST#*!RFACK#+RFACK#*RAS#]
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Figure 31. Static Column CAS State Machine 
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6.4 Write Control Logic 


The posted write implementation requires logic support
for a few key functions. These functions are required
mainly to support posting with interleaved memory.
Three types of signals are generated to implement these
functions:


Multiplexer Select - These signals control the address
multiplexers when RAS# is active. During write
cycles, they must be active to select the write address
path. These signals stay active during read cycles which
are immediately preceded by a write. They are
deactivated when the write cycle is complete. Once they are
deactivated, the read cycle may proceed, as the read path
is selected.


Write Enable - These signals are combined with the
byte enable CPU outputs (BE0#-BE3#) to create the
WBE# signals. The WBE00#-WBE03# signals
control which byte is written in bank 0 during a write
cycle. The WBE10#-WBE13# signals perform the same
function for bank 1.
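The gating of the write enables can be sketched in a few lines of Python. The text says only that the write enable is "combined with" the byte enables. The sketch assumes this means a byte is written only when both the bank's WE# and the CPU's BEx# are asserted, which for active-low signals is an OR of the pin levels. The function name is illustrative.

```python
def write_byte_enables(we_n: bool, be_n: list[bool]) -> list[bool]:
    """Return the WBEx# pin levels for one bank (True = high = inactive).

    we_n is the bank's WE# level; be_n holds the BE0#..BE3# levels.
    """
    # Active-low AND: WBEx# is low only when WE# and BEx# are both low.
    return [we_n or be for be in be_n]


# Bank write enabled, CPU writing bytes 0 and 3 only.
wbe = write_byte_enables(False, [False, True, True, False])
```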


Write In Progress - This signal is active when a write
cycle has been started in either DRAM bank. It is
active when either C01# or C11# is active. C01# and
C11# are state outputs from the CAS# state machines
which indicate that a write cycle is being performed.
C01# is generated for bank 0 and C11# for bank 1.
WIP# is only required for interleaved memory
systems. The C01# (or C11#) output would be sufficient
for a non-interleaved (single bank) system.


The state machines which generate these signals are
shown in Figure 32. The state diagram for the MEN0#
signal is shown. This signal enables the address
multiplexer for bank 0. MEN0# is activated whenever a
write cycle occurs to an address with A2 low (0). The
MEN1# function is the same except that it is activated
when A2 is high (1). The AQ0#, MEMCS# and LW/R#
signals are used to indicate a valid write cycle.


The MEN# signals are deactivated when the write cycle
is complete. The cycle is complete when CAS# for
that bank is sampled active. For bank 0, C01# is used
to indicate that a write is in progress. MEN0# is held
active when C01# is active. When CAS00# is sampled
active, CIP# is checked to determine if another valid
write to the same bank has occurred. If so, MEN0#
stays active until CAS00# is sampled active. This function
keeps the write address path open during consecutive
writes to the same bank.


The WE# state machine is very similar to that of the
MEN# state machine. When a write cycle starts,
WE0# is activated in the same manner as MEN0#.
The write enable signals, however, must stay active one
clock longer than the MEN# signals. Therefore, the
WE# signal is not deactivated until C01# is sampled
inactive.


WIP# is generated in part by combinatorial logic so
that it can be active in the same clock as the C01# and
C11# signals. WIP# must be active in this clock to
ensure that a write miss is completed before a refresh
cycle takes place. WIP# must also be held active one
clock after C01# and C11# are sampled inactive. This
timing ensures the proper sequence for subsequent read
cycles. The logic equation and state machine for WIP#
are shown in Figure 32.
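
The WIP# timing described above can be sketched as a small clocked model. This is an illustrative sketch only, not the design's implementation: it assumes the equation !WIP# = !LWIP# + !C01 + !C11 from Figure 32 and the LWIP~ state diagram in the PLD 11 listing, and treats 0 as the active level. The function names are hypothetical.

```python
def next_lwip_n(lwip_n: int, c01: int, c11: int, reset: int = 0) -> int:
    """Next value of the registered LWIP# term (0 = active).

    Follows the LWIP~ state diagram: RESET is only examined in state 0.
    """
    write_active = (c01 == 0) or (c11 == 0)
    if lwip_n == 0 and reset:
        return 1
    return 0 if write_active else 1


def wip_n(lwip_n: int, c01: int, c11: int) -> int:
    """Combinatorial WIP#: !WIP# = !LWIP# + !C01 + !C11 (0 = active)."""
    active = (lwip_n == 0) or (c01 == 0) or (c11 == 0)
    return 0 if active else 1


def simulate_wip(c01_trace, c11_trace):
    """Return WIP# for each clock, given per-clock C01 and C11 values."""
    lwip = 1  # assumed idle (inactive) starting state
    out = []
    for c01, c11 in zip(c01_trace, c11_trace):
        out.append(wip_n(lwip, c01, c11))    # combinatorial output this clock
        lwip = next_lwip_n(lwip, c01, c11)   # registered term updates on the clock edge
    return out


# Bank 0 write indication active for two clocks (clocks 1 and 2).
trace = simulate_wip([1, 0, 0, 1, 1], [1, 1, 1, 1, 1])
```

In this trace WIP# goes active in the same clock as C01 and stays active for one clock after C01 goes inactive, which is the behavior the text asks for.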


6.5 Burst Address Logic 


The burst address logic generates the B1MA0 and
B0MA0 signals. These signals are connected directly to
the low order address inputs of the DRAMs. Because
of the direct connection, these signals must perform
several different functions. They must multiplex the
low order row and column addresses, multiplex the
write and read addresses and generate the burst address
during read cycles.


These functions are performed separately for each bank
by two PLDs. Each PLD generates two identical signals
to reduce the drive requirements. These signals are
connected directly to two bytes of the DRAM array.
The signals are generated partly by combinatorial logic
and partly by the state machine.


The logic equations and state diagram for this function
are shown in Figure 33. The state machine generates
the burst address for read cycles. The logic equations
handle the multiplexing functions.


The burst address is generated after a burst read cycle
has started. Note that the i486 CPU cache need not be
enabled for burst cycles to occur. Cycles such as 64-bit
floating-point operand reads will burst if BRDY# is
returned to the processor. S0 and S3 track the state of the
A3 CPU address output. When a burst read cycle
starts, S1 or S2 is entered. The B0MA0 address output
will then change its state when MBRDY# and DATASEL
are both low. This function is the burst address for
bank 0. The B1MA0 address output changes its state
when MBRDY# is low and DATASEL is high. This
function is the burst address for bank 1. The only
difference in the two PLDs is the value of DATASEL used
to determine the time at which the burst address changes
its state.
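
The toggle condition for the two burst address outputs can be sketched as follows. This is a simplified model under stated assumptions: it ignores the S0-S3 state encoding and register timing details of Figure 33, and only shows that bank 0 toggles when MBRDY# is low with DATASEL low while bank 1 toggles when MBRDY# is low with DATASEL high. The function name and trace format are hypothetical.

```python
def burst_address_trace(a3: int, mbrdy_n, datasel, bank: int):
    """Per-clock value of the low-order burst address bit for one bank.

    a3       -- starting value sampled from the CPU A3 output
    mbrdy_n  -- per-clock MBRDY# values (0 = active)
    datasel  -- per-clock DATASEL values
    bank     -- 0 or 1; the bank toggles when DATASEL equals its number
    """
    addr = a3
    trace = []
    for rdy_n, sel in zip(mbrdy_n, datasel):
        trace.append(addr)
        if rdy_n == 0 and sel == bank:
            addr ^= 1  # address changes state for the next clock
    return trace


mbrdy_n = [1, 0, 0, 0, 0]
datasel = [0, 0, 1, 0, 1]
bank0 = burst_address_trace(0, mbrdy_n, datasel, bank=0)
bank1 = burst_address_trace(0, mbrdy_n, datasel, bank=1)
```

With these inputs the two banks change state on alternate ready clocks, which is what allows the interleaved banks to take turns supplying data.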




intel AP-447 


[Figure 32 state diagrams. Labels recoverable from the figure:
RESET;
!CIP# * LW/R * !MEMCS# * !LA2;
!C01 * CAS01#;
!CIP# * LW/R * !MEMCS# * LA2 * !CAS01#;
!C01 + !C11;
!C01]

!WIP# = !LWIP# + !C01 + !C11

Figure 32. State Machines for MEN0#, WIP#, and WE0#


[Figure 33 state diagram. Transition labels:
!ALD * !A3 * AQ0#; !ALD * !A3 * AQ0#;
!AQ0 * !LW/R# * !MEMCS# * HIT#;
!BRDY# * BLAST#; !BRDY# * BLAST#]

!B0MA0 = !WE0# * !LA313 + WE0# * RAS# * !LA313 + WE0# * !RAS# * !B0A

RAS Precharge and Refresh Counter


Figure 33. Burst Address Generation
The S0 and S3 states are required only to ensure that
the burst address outputs are valid during the T2 of any
read cycle. Figure 17 shows the timing of a burst read
hit cycle. In the first access of this cycle, the burst
address must be valid in the first T2 to satisfy the address
access time requirements of the DRAM. The value of
A3 is sampled with ALD to satisfy this requirement.
In this way, the burst address state machine always
starts from the correct value of A3. If another wait
state is added to this access, this function is not
required.


The logic equations which provide the multiplexer
function are very simple. The first term of the equations
shown in Figure 33 enables the write path. The write
enable signals are used to enable this path. When WE0#
is active, for example, the value of the multiplexer
output is passed through to the DRAM. The second term
allows the row address A13 to be passed to the DRAM
during a read page miss. This term is also qualified by
the write enable signals. In this way, the write address
is not disabled early during a read miss. The third term
enables the burst address output from the state machine
onto the address pins.
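
The three terms described above can be sketched as an active-low sum of products. This is a hedged reading, not the published equation: it assumes the Figure 33 equation has the form !B0MA0 = !WE0# * !MUX + WE0# * RAS# * !MUX + WE0# * !RAS# * !B0A, where MUX stands for the address multiplexer output and B0A for the state machine's burst address. The argument names are hypothetical.

```python
def b0ma0(we0_n: int, ras_n: int, mux_out: int, burst_addr: int) -> int:
    """Low-order DRAM address bit for bank 0 under the assumed equation.

    The output is driven low when any product term is true, so each term
    passes its source signal through when that term's qualifiers hold.
    """
    write_path = (we0_n == 0) and (mux_out == 0)                    # term 1: write address path
    row_path = (we0_n == 1) and (ras_n == 1) and (mux_out == 0)     # term 2: row address on a read miss
    burst_path = (we0_n == 1) and (ras_n == 0) and (burst_addr == 0)  # term 3: burst address
    return 0 if (write_path or row_path or burst_path) else 1
```

Under this reading the output follows the multiplexer while a write is enabled or while RAS# is inactive, and follows the burst address once RAS# is active during a read.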


7.0 SUMMARY 


We have discussed an example memory subsystem for
the i486™ CPU. The material has been presented as a
design guide for systems under development or as an
optimization for existing systems. We have discussed
several key functions which will be summarized in this
section. We will also discuss some important timing
restrictions. The key functions discussed include an
external or second level cache, posted write cycles, and
interleaved DRAM banks.


The interleaving technique is used to support the burst
bus feature of the i486 CPU. The use of this technique
allows the DRAM to supply a DWORD every clock
during burst cycles. Interleaving proves to be very useful
in i486 CPU memory designs. Without its use,
DRAM timings such as tPC (Page Mode Cycle time)
and tCP (CAS Precharge time) would prevent zero
wait state access at 33 MHz.


Data registers are also used to improve average write
cycle latency. These registers hold write data during
posted write cycles. Write posting can improve average
write latency to under 3 clocks for many applications.
This improvement is important in i486 CPU based systems
because 65% to 70% of all bus cycles are writes.
Without using a latency improvement technique such
as write posting, average write latency will be above 5
clocks.


The write posting technique also improves memory
performance in other ways. Write cycles, particularly
DRAM page misses, can be overlapped with read hit
cycles in the second level cache. This fact greatly reduces
the delay caused by read cycles which immediately
follow write cycles.


Analysis of this memory subsystem design has shown
that use of these features has resulted in a low latency
response to the CPU. Over several important applications
the following characteristics have been recorded.
The average number of clock cycles required to complete
the first read is 3.5 clocks. Subsequent cycles of a burst are
always processed in one clock. Write cycles average 2.5
clocks. These average counts result from the following
DRAM access rates. Read accesses from the cache
always occur in zero wait states.
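
As a worked example of the averages quoted above, the total clock count of a burst read can be computed from the first-access latency plus one clock per subsequent transfer. This is a back-of-envelope sketch; the four-transfer burst (a 16-byte i486 cache line fill) is an assumption added for illustration and the function name is hypothetical.

```python
def burst_read_clocks(first_access_clocks: float, transfers: int,
                      subsequent_clocks: float = 1.0) -> float:
    """Total clocks for a burst: first access plus each later transfer."""
    return first_access_clocks + (transfers - 1) * subsequent_clocks


# Average first read of 3.5 clocks, then one clock per remaining DWORD.
line_fill = burst_read_clocks(3.5, transfers=4)
```

With the reported 3.5 clock average for the first read, a four-DWORD burst averages 6.5 clocks in this model.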


Table 3. DRAM Function Latencies

[Table columns: DRAM Function; Subsequent Burst Accesses.
Surviving entry: Page Hit, 3]

NOTE:
*Write miss latencies occur only during cycles subsequent
to a write miss cycle.


7.1 Timing Restrictions


A few DRAM timing restrictions must be mentioned.
These timings become critical at 33 MHz. These timings
are critical due primarily to the latency of the first
cycle of a read page hit. Since three clocks are used, the
following timing restrictions exist.


tRAC = Data access time from RAS# active
tCAA = Data access time from column address valid
tCAC = Data access time from CAS# active
tRP  = RAS# precharge time


At 33 MHz
tRAC = 71.5 ns
tCAA = 37.5 ns
tCAC = 34 ns
tRP = 60.6 ns
At 25 MHz
tRAC = 101.5 ns
tCAA = 51 ns
tCAC = 61.5 ns
tRP = 80 ns
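
The tRP and tRAC limits above can be related to the clock period with simple arithmetic. This sketch is an observation, not a statement from the design: it assumes the nominal 33 MHz and 25 MHz clocks, and interprets the difference between three clock periods and tRAC as the time left for buffers, setup, and similar delays. The function names are hypothetical.

```python
def clock_period_ns(freq_mhz: float) -> float:
    """Clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / freq_mhz


def clocks_spanned(time_ns: float, freq_mhz: float) -> float:
    """How many clock periods a given time interval covers."""
    return time_ns / clock_period_ns(freq_mhz)


def margin_ns(clocks: int, freq_mhz: float, time_ns: float) -> float:
    """Time left over when time_ns must fit inside a number of clocks."""
    return clocks * clock_period_ns(freq_mhz) - time_ns


trp_clocks_33 = round(clocks_spanned(60.6, 33), 2)   # tRP at 33 MHz
trp_clocks_25 = round(clocks_spanned(80.0, 25), 2)   # tRP at 25 MHz
trac_margin_33 = round(margin_ns(3, 33, 71.5), 1)    # three-clock read page hit
trac_margin_25 = round(margin_ns(3, 25, 101.5), 1)
```

In this model both tRP values correspond to two clock periods, and the three-clock first access leaves roughly 19 ns beyond tRAC at either frequency.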
APPENDIX A
PLD CODES AND SCHEMATICS


A.1 PLD DEVICES 


Many design examples in this manual use PLDs
(Programmable Logic Devices) which can be programmed
by the user to implement random logic. A PLD device
can be used as a state machine or a signal decoder, for
example. The advantages of PLDs include the following:

1. PLD pinout is determined by the designer, which
   can simplify board layout by moving signals as
   required.

2. PLDs are inexpensive as compared to dedicated bus
   controllers.


Intel EPLDs (Erasable Programmable Logic Devices)
have the following additional advantages:

1. Programmability/erasability allows EPLD functions
   to be changed easily, simplifying prototype
   development.

2. Since EPLDs are implemented in CMOS technology,
   they can consume an order of magnitude less
   power than bipolar PLDs. Power-conscious
   applications can benefit greatly from using EPLDs.

3. Since the EPROM cell size is an order of magnitude
   smaller than an equivalent bipolar fuse, EPLDs can
   implement more functions in the same package.
   This higher integration can result in a lower overall
   component count for a design. The added flexibility
   can also mean that an extremely low number of
   "raw" (unprogrammed) devices need to be stocked
   versus bipolar PLDs.

4. Once an EPLD design has been tested, plastic OTP
   (One-Time Programmable) versions of the device
   can be used in a production environment.


PLDs have the following tradeoffs:

1. Most PLDs do not have buried (not connected to
   outputs) registers. For some state machine
   applications, this means using an otherwise available
   output pin to store the current state.

2. The drive capability of CMOS EPLDs may be
   insufficient for some applications. While the trend is
   towards use of CMOS throughout a system, in cases
   where high current levels are required, some
   additional buffering may be required with EPLDs.


A PLD consists logically of a programmable AND array
whose output terms feed a fixed OR array. Any
sum-of-products equation, within the limits of the
number of PLD inputs, outputs, and equation terms,
can be realized by specifying the correct AND array
connections. Figure A-1 shows an example of two PLD
equations and the corresponding logic array. Note that
every horizontal line in the AND array represents a
multi-input AND gate; every vertical line represents a
possible input to the AND gate. An X at the intersection
of a horizontal line and a vertical line represents a
connection from the input to the AND gate.
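
The AND/OR array described above can be modeled in a few lines. This is an illustrative sketch: each product term is a dictionary naming the inputs connected to that AND gate (the Xs in the array) and the level each must have, and the fixed OR array is the `any` over all terms. The example equation D = A*S*/B + /A*/S*B has the same form as the equation in this appendix's logic array figure; the data layout and function name are hypothetical.

```python
def eval_sum_of_products(product_terms, inputs) -> int:
    """Evaluate a sum-of-products equation.

    product_terms -- list of dicts mapping input name to required level
                     (1 = true input connected, 0 = complement connected)
    inputs        -- dict mapping input name to its current level
    """
    return int(any(
        all(inputs[name] == level for name, level in term.items())
        for term in product_terms
    ))


# D = A*S*/B + /A*/S*B
D_TERMS = [
    {"A": 1, "S": 1, "B": 0},
    {"A": 0, "S": 0, "B": 1},
]

d_first_term = eval_sum_of_products(D_TERMS, {"A": 1, "S": 1, "B": 0})
d_second_term = eval_sum_of_products(D_TERMS, {"A": 0, "S": 0, "B": 1})
d_no_term = eval_sum_of_products(D_TERMS, {"A": 1, "S": 1, "B": 1})
```

An input left out of a term's dictionary corresponds to an intersection with no X, so that input does not affect the AND gate.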


The sum-of-products is then routed to a configurable
macrocell. The macrocell in Figure A-2 can be configured
as a combinational output or registered output.
The output can be active high or active low. A separate
AND term controls the output buffer.
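
The macrocell options listed above can be sketched as a small class. This is a behavioral illustration under assumptions, not a description of the 85C220/85C224 internals: the register is assumed to power up at 0, inversion is assumed to follow the register, and a disabled output buffer is represented by `None`. The class and method names are hypothetical.

```python
from typing import Optional


class Macrocell:
    """Behavioral model of a configurable output macrocell."""

    def __init__(self, registered: bool, active_low: bool) -> None:
        self.registered = registered
        self.active_low = active_low
        self.q = 0  # assumed register value before the first clock

    def clock(self, sum_of_products: int) -> None:
        """Clock edge: a registered macrocell captures the sum-of-products."""
        if self.registered:
            self.q = sum_of_products

    def output(self, sum_of_products: int, output_enable: bool) -> Optional[int]:
        """Pin value, or None when the output buffer is disabled."""
        if not output_enable:
            return None
        value = self.q if self.registered else sum_of_products
        return value ^ 1 if self.active_low else value


combinational = Macrocell(registered=False, active_low=False)
registered = Macrocell(registered=True, active_low=False)
before_clock = registered.output(1, True)
registered.clock(1)
after_clock = registered.output(0, True)
```

The registered configuration is the one used for the state machines in the listings that follow, where the output pins hold the current state.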


Designing with PLDs consists of determining where Xs
must be placed in the AND array and how to configure
the macrocell. This task is simplified by logic compilers,
such as iPLS II (Intel's Programmable Logic Software II)
or ABEL. Logic compilers accept input in the
form of sum-of-product equations and translate the
input into a JEDEC programming file that can be used
by programming hardware/software.


Intel PLDs are described in the Programmable Logic
Handbook. Three Intel PLDs have been used in this
manual to implement state machine and decode
functions. These PLDs include:


• 85C220: fast 20-pin superset of 16 x 8 type bipolar
  and CMOS PLDs.


• 85C224: fast 24-pin superset of 20 x 8 type bipolar
  and CMOS PLDs.


• 85C508: fast address decode PLD with integral
  transparent latches.


The 85C220 and 85C224 PLDs are both available at
clock speeds to support fast state machines in i486
systems. The 85C508 provides a fast Enable-to-Output
time with a minimal system setup time.


BOOLEAN EQUATION:
D = A*S*/B
  + /A*/S*B

EPLD IMPLEMENTATION:
[Logic array diagram. Labels: CLOCK; inputs /A, /S, /B, A, S, B]


[Macrocell diagram. Labels: PRODUCT TERMS; MACROCELL REGISTER;
COMBINATIONAL OR REGISTERED (SELECTABLE); INVERT CONTROL;
FEEDBACK; I/O TYPE SELECT]

Figure A-2. 85C220/85C224 EPLD Macrocell Architecture
module SC_MODE_DRAM_CTRL_4 flag '-r4'

title 'STATIC COLUMN MODE DRAM CONTROLLER - PLD 4, INTEL CORPORATION'
" This PLD generates MRDY and MBRDY
" Implemented with Intel 85C224 EPLD.

SCk     device 'E224';

x = .X.;            " ABEL 'don't care' symbol
c = .C.;            " ABEL 'clocking input' symbol

" Inputs
CLK     pin 1;      " P4 input CLK
M~      pin 2;      " Miss Indicator
CIP~    pin 3;      " Cycle OK
MEMCS~  pin 4;      " Memory Chip Select
HIT~    pin 5;      " DRAM Page Hit Signal
RFACK   pin 6;      " Refresh acknowledge
ADS~    pin 7;      " CPU ADS~
W_R     pin 8;      " CPU W/R
RESET   pin 9;      " System Reset
dum1    pin 10;
BOFF~   pin 11;     " CPU Backoff input
WIP~    pin 14;     " Write in progress
CAS~    pin 15;     " Any CAS# signal
BLAST~  pin 22;     " CPU Burst Last output
RAS~    pin 23;     " Row Address strobe

" Output
dum0    pin 16;
MT      pin 17;     " BRDY state miss tracking
MRDY~   pin 18;     " Memory RDY (modified with other RDYs)
DALE~   pin 19;     " Decode Latch enable
LWR     pin 20;     " Internally latched W/R# for rdy
BRDY~   pin 21;     " Processor BRDY~

state_diagram [MRDY~]

state [1]: if (!RFACK & !ADS~ & W_R & !RAS~ & M~) # (!CIP~ & LWR &
              !MEMCS~ & !RFACK & M~) then [0] else [1];
state [0]: goto [1];

state_diagram [BRDY~, MT]

state [1, 1]: if !CIP~ & !HIT~ & !MEMCS~ & !LWR & !RFACK & WIP~ & !RAS~
              then [0, 1] else if !CIP~ & !MEMCS~ & HIT~ & !LWR #
              !CIP~ & !MEMCS~ & RAS~ & !LWR then [1, 0];

state [1, 0]: if RESET then [1, 1] else
              if WIP~ & !RFACK & !CAS~ then [0, 1];

state [0, 1]: if RESET # !BOFF~ # !BLAST~ then [1, 1] else [0, 1];

state_diagram [DALE~]

state [0]: if RESET then [0] else
           if ADS~ then [1] else [0];

state [1]: if RESET # !BOFF~ then [0] else
           if !CIP~ then [0] else [1];

state_diagram [LWR]

state [0]: if RESET then [0] else
           if !ADS~ & W_R then [1] else [0];

state [1]: if RESET # !BOFF~ then [0] else
           if !ADS~ & !W_R then [0] else [1];

test_vectors

([CLK,M~,CIP~,MEMCS~,HIT~,RFACK,ADS~,W_R,RESET,WIP~,BOFF~,BLAST~,RAS~]
 -> [MRDY~,DALE~,LWR,BRDY~])

[c, x, x, x, x, x, 1, x, 1, x, x, x, x] -> [x, x, x, x];
[c, x, 1, 1, x, x, 1, x, 1, x, x, x, x] -> [1, 0, 0, 1];
[c, 1, 1, 1, x, 0, 1, x, 0, 0, 1, 1, 1] -> [1, 0, 0, 1];
[c, 1, 1, 1, x, 0, 1, x, 0, 0, 1, 1, 1] -> [1, 0, 0, 1];
[c, 1, 1, 1, x, 0, 0, 1, 0, 0, 1, 1, 1] -> [1, 1, 1, 1];
[c, 1, 0, 0, 0, 0, 1, x, 0, 0, 1, 1, 1] -> [0, 0, x, 1];
[c, 1, 0, 0, 0, 0, 1, x, 0, 1, 1, 1, 0] -> [1, 0, 1, 1];
[c, 1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 1, 0] -> [0, 1, 1, 1];
[c, 1, 0, 0, 0, 0, 1, x, 0, 1, 1, 1, 0] -> [1, 0, x, 1];
[c, 1, 1, x, 0, 0, 0, 1, 0, 1, 1, 1, 0] -> [0, 1, 1, 1];
[c, 1, 0, 0, 0, 0, 1, x, 0, 1, 1, 1, 0] -> [1, 0, x, 1];
[c, 1, 1, x, 0, 0, 1, x, 0, 1, 1, 1, 0] -> [1, 0, x, 1];
[c, 1, 1, 1, x, 0, 1, x, 0, 1, 1, 1, 0] -> [1, 0, x, 1];

end SC_MODE_DRAM_CTRL_4;
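
As a reading aid for the ABEL state diagrams, the LWR machine from the PLD 4 listing can be transcribed into an ordinary next-state function. This is an illustrative sketch in Python rather than ABEL; signal levels are passed as 0 or 1 exactly as they appear on the pins, and the function name is hypothetical.

```python
def next_lwr(lwr: int, ads_n: int, w_r: int, reset: int, boff_n: int) -> int:
    """Next state of LWR (latched W/R#), following the state diagram.

    state [0]: RESET holds 0; a new cycle (ADS~ low) with W_R high sets 1.
    state [1]: RESET or BOFF~ low clears; a new read cycle clears; else hold.
    """
    if lwr == 0:
        if reset:
            return 0
        return 1 if (ads_n == 0 and w_r == 1) else 0
    if reset or boff_n == 0:
        return 0
    return 0 if (ads_n == 0 and w_r == 0) else 1


latched_on_write = next_lwr(0, ads_n=0, w_r=1, reset=0, boff_n=1)
held_without_ads = next_lwr(1, ads_n=1, w_r=0, reset=0, boff_n=1)
cleared_on_read = next_lwr(1, ads_n=0, w_r=0, reset=0, boff_n=1)
```

Each `if ... then ... else` chain in a state block maps to one branch of the function, with the first matching condition taking priority.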




module SC_MODE_DRAM_CTRL_3 flag '-r4'

title 'STATIC COLUMN MODE DRAM CONTROLLER - PLD 3, INTEL CORPORATION'
" This PLD generates RAS
" Implemented with the Intel 85C220 EPLD.

SC3     device 'E0320';

x = .X.;            " ABEL 'don't care' symbol
c = .C.;            " ABEL 'clocking input' symbol

" Inputs
CLK     pin 1;      " P4 input CLK
M~      pin 2;      " Miss Indicator
CIP~    pin 3;      " Cycle OK
MEMCS~  pin 4;      " Memory Chip Select
HIT~    pin 5;      " DRAM Page Hit Signal
RFACK   pin 6;      " Refresh Acknowledge
PCHG    pin 7;      " RAS precharge count
WIP~    pin 8;      " Write in Progress
RESET   pin 9;      " System Reset
Q1      pin 12;     " RAS refresh count

" Output
RAS2~   pin 13;
RAS1~   pin 14;     " RAS byte 0,2
EP      pin 15;     " state variable
EP1     pin 16;     " state variable
RAS0~   pin 17;     " RAS byte 1,3
RAS3~   pin 18;
CSWIP~  pin 19;

state_diagram [RAS0~, RAS1~, EP]

state [1, 1, 0]: if RESET then [1, 1, 0] else
                 if !CIP~ & !CSWIP~ & !PCHG then [0, 0, 0] else
                 if RFACK & WIP~ then [1, 1, 1] else
                 [1, 1, 0];

state [0, 0, 0]: if RESET then [1, 1, 0] else
                 if RFACK then [0, 0, 1] else
                 if !CIP~ & HIT~ & !MEMCS~ then [1, 1, 0]
                 else [0, 0, 0];

state [0, 0, 1]: if RESET then [1, 1, 0] else
                 if !RFACK & !PCHG then [1, 1, 0] else
                 if RFACK & !WIP~ # !RFACK & PCHG then
                 [0, 0, 1] else if RFACK & WIP~ & !Q1 then [1, 1, 1];

state [1, 1, 1]: if RESET then [1, 1, 0] else
                 if !PCHG then [0, 0, 1] else [1, 1, 1];

state [0, 1, 0]: goto [1, 1, 0];
state [0, 1, 1]: goto [1, 1, 0];
state [1, 0, 0]: goto [1, 1, 0];
state [1, 0, 1]: goto [1, 1, 0];

state_diagram [RAS2~, RAS3~, EP1]

state [1, 1, 0]: if RESET then [1, 1, 0] else
                 if !CIP~ & !CSWIP~ & !PCHG then [0, 0, 0] else
                 if RFACK & WIP~ then [1, 1, 1] else
                 [1, 1, 0];

state [0, 0, 0]: if RESET then [1, 1, 0] else
                 if RFACK then [0, 0, 1] else
                 if !CIP~ & HIT~ & !MEMCS~ then [1, 1, 0]
                 else [0, 0, 0];

state [0, 0, 1]: if RESET then [1, 1, 0] else
                 if !RFACK & !PCHG then [1, 1, 0] else
                 if RFACK & !WIP~ # !RFACK & PCHG then
                 [0, 0, 1] else if RFACK & WIP~ & !Q1 then [1, 1, 1];

state [1, 1, 1]: if RESET then [1, 1, 0] else
                 if !PCHG then [0, 0, 1] else [1, 1, 1];

state [0, 1, 0]: goto [1, 1, 0];
state [0, 1, 1]: goto [1, 1, 0];
state [1, 0, 0]: goto [1, 1, 0];
state [1, 0, 1]: goto [1, 1, 0];

equations
!CSWIP~ = (!MEMCS~ # !WIP~) & !RESET;

test_vectors

([CLK,M~,CIP~,MEMCS~,HIT~,RFACK,PCHG,WIP~,Q1,RESET] ->
 [RAS0~,RAS1~,EP,RAS2~,RAS3~,EP1])

[c, x, x, x, x, x, 1, x, x, 1] -> [x, x, x, x, x, x];
[c, x, x, x, x, x, 1, x, x, 1] -> [1, 1, 0, 1, 1, 0];
[c, x, x, x, x, x, 1, x, x, 1] -> [1, 1, 0, 1, 1, 0];
[c, x, x, x, x, x, 1, x, x, 1] -> [1, 1, 0, 1, 1, 0];

end SC_MODE_DRAM_CTRL_3;
module SC_MODE_DRAM_CTRL_1 flag '-r4'

title 'STATIC COLUMN MODE DRAM CONTROLLER - PLD 1, INTEL CORPORATION'
" Cycle Tracking Logic
" Implemented with Intel 85C224 EPLD.

SCy     device 'E224';

x = .X.;            " ABEL 'don't care' symbol
c = .C.;            " ABEL 'clocking input' symbol

" Inputs
CLK     pin 1;      " P4 input CLK
BLAST~  pin 2;      " P4 BLAST output
MEMCS~  pin 3;      " Memory Chip Select
AHOLD   pin 4;      " Address HOLD input to P4
HIT~    pin 5;      " DRAM Page Hit Signal
BOFF~   pin 6;      " Backoff input to P4
ADS~    pin 7;      " Address Status output of P4
RFRQ    pin 8;      " Refresh Request Signal
RESET   pin 9;      " System Reset
BRDY~   pin 10;     " Processor burst ready pin
MRDY~   pin 11;     " Memory ready
RAS~    pin 14;     " Row Address Strobe
EP      pin 23;     " Refresh indicator - count on RAS~ low

" Output
RFACK~  pin 15;     " Refresh acknowledge
CIP~    pin 16;     " ADS~ active indicator
M~      pin 17;     " AQ0~ Miss state indicator
CT      pin 18;     " AHOLD with ADS~ indicator
PCHG    pin 19;     " Precharge state indicator
Q1      pin 20;     " Precharge state indicator
ALD     pin 21;     " Address Latch Disable
adlst~  pin 22;     " ADL state variable

state_diagram [CIP~, M~]

state [1, 1]: if RESET then [1, 1] else
              if AHOLD # !RFACK~ # EP then [1, 1] else
              if !ADS~ # CT then [0, 1] else [1, 1];

state [0, 1]: if RESET # !BOFF~ # MEMCS~ then [1, 1] else
              if HIT~ & !RAS~ & !MRDY~ then [0, 0] else
              if (!MRDY~ # (!BRDY~ & !BLAST~)) then [1, 1]
              else [0, 1];

state [0, 0]: if RESET # !BOFF~ then [1, 1] else
              if !PCHG & (CT # !ADS~) then [0, 1] else
              if !PCHG & !CT then [1, 1] else
              [0, 0];

state [1, 0]: goto [1, 1];

state_diagram [PCHG, Q1]

state [0, 0]: if RESET then [0, 0] else
              if RAS~ then [1, 0] else
              if RAS~ & !RFACK~ then [0, 1] else [0, 0];

state [1, 0]: if RESET then [0, 0] else
              if RAS~ & !EP then [0, 0] else
              if RFACK~ & EP & !RAS~ then [1, 1] else
              if RAS~ & EP then [0, 1] else [1, 0];

state [0, 1]: goto [1, 0];
state [1, 1]: goto [0, 0];

state_diagram [CT]

state [0]: if RESET then [0] else
           if !ADS~ & (AHOLD # !RFACK~ # !M~ # EP) then [1] else [0];

state [1]: if RESET # !BOFF~ then [0] else
           if !CIP~ & M~ then [0] else [1];

state_diagram [RFACK~]

state [1]: if RESET then [1] else
           if !CIP~ & RFRQ & !MRDY~ & !HIT~ then [0] else
           if !CIP~ & RFRQ & (!BRDY~ & !BLAST~) #
              RFRQ & CIP~ & ADS~ then [0] else [1];

state [0]: if RESET # !BOFF~ then [1] else
           if RAS~ then [1] else [0];

state_diagram [ALD, adlst~]

state [0, 1]: if RESET then [0, 1] else
              if !ADS~ # !CIP~ & !MEMCS~ then [1, 0] else [0, 1];

state [1, 0]: if RESET then [0, 1] else
              if !CIP~ & MEMCS~ then [0, 1] else
              if HIT~ & !MRDY~ then [1, 1] else
              if !HIT~ & !MRDY~ then [0, 1] else
              if !BRDY~ & !BLAST~ then [0, 1] else [1, 0];

state [1, 1]: if RESET then [0, 1] else
              if !CIP~ & (!PCHG # MEMCS~) then [0, 1] else [1, 1];

state [0, 0]: goto [0, 1];

test_vectors

end SC_MODE_DRAM_CTRL_1;




module SC_MODE_DRAM_CTRL_7 flag '-r4'

title 'STATIC COLUMN MODE DRAM CONTROLLER - PLD 7, INTEL CORPORATION'
" This PLD generates DATASEL and WE
" Implemented with the Intel 85C220 EPLD.

SC7     device 'E0320';

x = .X.;            " ABEL 'don't care' symbol
c = .C.;            " ABEL 'clocking input' symbol

" Inputs
CLK     pin 1;      " P4 input CLK
BRDY~   pin 2;      " Burst Ready
CIP~    pin 3;      " Cycle OK
MEMCS~  pin 4;      " Memory select
LA2     pin 5;      " Latched A2
CAS00~  pin 6;      " CAS output Bank0
CAS10~  pin 7;      " CAS output Bank1
LW_R    pin 8;      " CPU W/R latched
RESET   pin 9;      " System Reset
BLAST~  pin 12;     " CPU BLAST~ output
BOFF~   pin 13;     " CPU Backoff input
HIT~    pin 19;

" Output
DATASEL pin 14;     " Bank select for reads
RS~     pin 15;     " state variable
RALE~   pin 16;     " state variable
WE~     pin 17;     " Write Enable posted writes
BSEL    pin 18;     " Selects read or write data path

state_diagram [DATASEL, RS~]

state [1, 1]: if RESET then [1, 1] else
              if !CIP~ & !LA2 & !LW_R & !MEMCS~ then [0, 0] else
              if !CIP~ & LA2 & !LW_R & !MEMCS~ then [1, 0] else [1, 1];

state [1, 0]: if RESET # !BOFF~ # (!BRDY~ & !BLAST~) then [1, 1] else
              if !BRDY~ & BLAST~ then [0, 0] else [1, 0];

state [0, 0]: if RESET # !BOFF~ # (!BRDY~ & !BLAST~) then [1, 1] else
              if !BRDY~ & BLAST~ then [1, 0] else [0, 0];

state [0, 1]: goto [1, 1];

state_diagram [WE~]

state [1]: if RESET then [1] else
           if LW_R & !CIP~ & !MEMCS~ then [0] else [1];

state [0]: if RESET # !BOFF~ then [1] else
           if LW_R & !CIP~ & !MEMCS~ then [0] else
           if CAS00~ + CAS10~ then [1];

state_diagram [RALE~]

state [0]: if RESET then [0] else
           if !CIP~ & HIT~ & !MEMCS~ then [1] else [0];

state [1]: if RESET # !BOFF~ then [0] else
           if !HIT~ then [0] else [1];

end SC_MODE_DRAM_CTRL_7;
module SC_MODE_DRAM_CTRL_11 flag '-r4'

title 'STATIC COLUMN MODE DRAM CONTROLLER - PLD 11, INTEL CORPORATION'
" This PLD generates the mux enables, write enables and WIP#
" Implemented with the Intel 85C220 EPLD.

SCw     device 'E0320';

x = .X.;            " ABEL 'don't care' symbol
c = .C.;            " ABEL 'clocking input' symbol

" Inputs
CLK     pin 1;      " P4 input CLK
LA2     pin 2;      " Latched A2
CIP~    pin 3;      " Cycle OK
MEMCS~  pin 4;      " Memory Chip select
RESET   pin 5;      " System Reset
LW_R    pin 6;      " Latched CPU W/R#
C01     pin 7;      " Write indication Bank0
CAS01~  pin 8;
C11     pin 9;      " Write indication Bank1
CAS11~  pin 19;

" Output
WIP~    pin 12;     " New WIP signal comb
MEN0~   pin 13;     " Mux enable Bank0
WE0~    pin 14;
LWIP~   pin 15;     " Latched WIP~
dum     pin 16;
WE1~    pin 17;
MEN1~   pin 18;     " Mux enable Bank1

state_diagram [WE0~]

state [1]: if RESET then [1] else
           if !CIP~ & LW_R & !MEMCS~ & !LA2 then [0];

state [0]: if RESET then [1] else
           if !C01 then [0] else
           if C01 then [1];

state_diagram [WE1~]

state [1]: if RESET then [1] else
           if !CIP~ & LW_R & !MEMCS~ & LA2 then [0];

state [0]: if RESET then [1] else
           if !C11 then [0] else
           if C11 then [1];

state_diagram [LWIP~]

state [1]: if !C01 # !C11 then [0] else [1];

state [0]: if RESET then [1] else
           if !C01 # !C11 then [0] else [1];

state_diagram [MEN0~]

state [1]: if RESET then [1] else
           if !CIP~ & LW_R & !MEMCS~ & !LA2 then [0];

state [0]: if RESET then [1] else
           if !C01 & CAS01~ then [0] else
           if !CIP~ & LW_R & !MEMCS~ & !LA2 & !CAS01~ then [0] else
           if !CAS01~ then [1];

state_diagram [MEN1~]

state [1]: if RESET then [1] else
           if !CIP~ & LW_R & !MEMCS~ & LA2 then [0];

state [0]: if RESET then [1] else
           if !C11 & CAS11~ then [0] else
           if !CIP~ & LW_R & !MEMCS~ & !LA2 & !CAS11~ then [0] else
           if !CAS11~ then [1];

equations
!WIP~ = !LWIP~ # !C01 # !C11;

" test_vectors
" ([CLK,M_IO~,CIP~,MEMCS~,HIT~,RFACK,ADS~,W_R,RESET,CAS0~,BOFF~,BLAST~,RAS~]
"  -> [MRDY~,DALE~,LWR,BRDY~])
" [c, x, x, x, x, x, 1, x, 1, x, x, x, x] -> [x, x, x, x];
" [c, x, 1, 1, x, x, 1, x, 1, x, x, x, x] -> [1, 0, 0, 1];
" [c, 1, 1, 1, x, 1, 1, x, 0, 1, 0, 1, 1] -> [1, 0, 0, 1];
" [c, 1, 1, 1, x, 1, 1, x, 0, 1, 0, 1, 1] -> [1, 0, 0, 1];
" [c, 1, 1, 1, x, 1, 0, 1, 0, 1, 0, 1, 1] -> [1, 1, 1, 1];
" [c, 1, 0, 0, 0, 1, 1, x, 0, 1, 0, 1, 1] -> [0, 0, x, 1];
" [c, 1, 0, 0, 0, 1, 1, x, 0, 1, 0, 1, 0] -> [1, 0, x, 1];
" [c, 1, 1, 0, 0, 1, 0, 1, 0, 1, 0, 1, 0] -> [0, 1, 1, 1];
" [c, 1, 0, 0, 0, 1, 1, x, 0, 0, 0, 1, 0] -> [1, 0, x, 1];

end SC_MODE_DRAM_CTRL_11;
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module SC.MODE DRAM CTRL.8 flag ‘-r4 


title ‘STATIC COLUMN MODE DRAM CONTROLLER - PLD 8, INTEL CORPORATION’ 


‘“ This PLD generates CAS1 (CAS for bank 1) 
" Implemented with the Intel 85C220 EPLD.

SC8 device 'E0320';

X = .X.;    " ABEL 'don't care' symbol
C = .C.;    " ABEL 'clocking input' symbol

" Inputs
CLK     pin 1;    "P4 input CLK
RFACK   pin 2;    "Refresh Acknowledge
CIP~    pin 3;    "Cycle OK
LA2     pin 4;    "Latched A2
HIT~    pin 5;    "DRAM Page Hit Signal
BOFF~   pin 6;    "Backoff input to P4
LW_R~   pin 7;
RAS~    pin 8;
RESET   pin 9;    "System Reset
RDY~    pin 12;   "Processor RDY#
MEMCS~  pin 13;   "Memory Chip Select
BRDY~   pin 18;   "Processor BREADY#
BLAST~  pin 19;   "Processor BLAST#

" Output
CAS10~  pin 14;   "CAS1 byte 0,2
C1      pin 15;   "state variable
C2      pin 16;   "state variable
CAS11~  pin 17;   "CAS1 byte 1,3

state_diagram [CAS10~, CAS11~, C1, C2]

state [1, 1, 1, 1]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if !RFACK & !CIP~ & LA2 & LW_R~ & !MEMCS~ then
                    [1, 1, 0, 1] else if !RFACK & !CIP~ & !LW_R~ & !RAS~
                    & !HIT~ & !MEMCS~ # (RFACK & RAS~) then
                    [0, 0, 1, 1] else [1, 1, 1, 1];

state [1, 1, 0, 1]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if RAS~ & RDY~ then [0, 0, 0, 0] else
                    [1, 1, 0, 1];

state [0, 0, 0, 0]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if !CIP~ & LA2 & LW_R~ & !MEMCS~ then [1, 1, 0, 0] else
                    if CIP~ # (!CIP~ & (MEMCS~ # !LW_R~)) # (!CIP~ & LW_R~ &
                    !LA2) then [1, 1, 1, 1] else [0, 0, 0, 0];

state [1, 1, 0, 0]: if RAS~ then [0, 0, 0, 0] else [1, 1, 0, 0];

state [0, 0, 1, 1]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if !BRDY~ & !BLAST~ & !RFACK then
                    [1, 1, 1, 1] else
                    if !BRDY~ & BLAST~ & LA2 then [1, 1, 1, 0] else
                    if !BRDY~ & BLAST~ & !LA2 then [0, 0, 1, 0] else
                    if RFACK then [0, 0, 1, 0] else
                    if BRDY~ & !RFACK then [0, 0, 1, 1];

state [1, 1, 1, 0]: if RESET then [1, 1, 1, 1] else
                    if !BOFF~ then [1, 1, 1, 0] else
                    if !BRDY~ & BLAST~ then [0, 0, 1, 1] else
                    if !BRDY~ & !BLAST~ then [1, 1, 1, 1] else
                    [1, 1, 1, 0];

state [0, 0, 1, 0]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if !BRDY~ & BLAST~ then [1, 1, 1, 0] else
                    if !BRDY~ & !BLAST~ # BRDY~ then [1, 1, 1, 1];
test_vectors

([CLK,RFACK,CIP~,LA2,HIT~,BOFF~,LW_R~,RAS~,RESET,RDY~,MEMCS~,BRDY~,BLAST~]
  -> [CAS10~,C1,C2,CAS11~])

[C, X, X, X, X, X, 1, X, 1, X, X, X, X] -> [X, X, X, X];
[C, X, 1, 1, X, X, 1, X, 1, X, X, X, X] -> [1, 1, 1, X];
[C, X, 1, 1, X, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 1, 1, 1];
[C, 0, 1, 1, X, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 1, 1, 1];
[C, 0, 1, 1, X, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 1, 1, 1];
[C, 0, 0, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 0, 1, 1];
[C, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1] -> [1, 0, 1, 1];
[C, 0, 1, 1, 0, 1, 1, 0, 0, 1, 0, 1, 1] -> [0, 0, 0, 0];
[C, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1] -> [1, 0, 0, 1];
[C, 0, 1, X, 0, 1, 1, 0, 0, 1, 0, 1, 1] -> [0, 0, 0, 0];
[C, 0, 0, 1, 0, 1, 1, 0, 0, 0, 0, 1, 1] -> [1, 0, 0, 1];
[C, 0, 1, X, 0, 1, 1, 0, 0, 1, 0, 1, 1] -> [0, 0, 0, 0];
[C, 0, 1, 0, X, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 1, 1, 1];

end SC_MODE_DRAM_CTRL_8;

module PG_MODE_DRAM_CTRL_2 flag '-r4'


title 'PAGE MODE DRAM CONTROLLER - PLD 2, INTEL CORPORATION'
" This PLD generates CAS0
" Implemented with the Intel 85C220 EPLD.

SC2 device 'E0320';

X = .X.;    " ABEL 'don't care' symbol
C = .C.;    " ABEL 'clocking input' symbol

" Inputs
CLK     pin 1;    "P4 input CLK
RFACK   pin 2;    "Refresh Acknowledge
CIP~    pin 3;    "Cycle OK
LA2     pin 4;    "Latched A2
HIT~    pin 5;    "DRAM Page Hit Signal
BOFF~   pin 6;    "Backoff input to P4
LW_R~   pin 7;
RAS~    pin 8;
RESET   pin 9;    "System Reset
RDY~    pin 12;   "Processor RDY#
MEMCS~  pin 13;   "Memory Chip Select
BRDY~   pin 18;   "Processor BREADY#
BLAST~  pin 19;   "Processor BLAST#

" Output
CAS10~  pin 14;   "CAS1 byte 0,2
C1      pin 15;   "state variable
C2      pin 16;   "state variable
CAS11~  pin 17;   "CAS1 byte 1,3

state_diagram [CAS10~, CAS11~, C1, C2]

state [1, 1, 1, 1]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if !RFACK & !CIP~ & !LA2 & LW_R~ & !MEMCS~ then
                    [1, 1, 0, 1] else if !RFACK & !CIP~ & !LW_R~ & !RAS~
                    & !HIT~ & !MEMCS~ # (RFACK & RAS~) then
                    [0, 0, 1, 1] else [1, 1, 1, 1];

state [1, 1, 0, 1]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if !RAS~ & RDY~ then [0, 0, 0, 0] else
                    [1, 1, 0, 1];

state [0, 0, 0, 0]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if !CIP~ & !LA2 & LW_R~ & !MEMCS~ then [1, 1, 0, 0] else
                    if CIP~ # (!CIP~ & (MEMCS~ # !LW_R~)) # (!CIP~ & LW_R~ &
                    LA2) then [1, 1, 1, 1] else [0, 0, 0, 0];

state [1, 1, 0, 0]: if RAS~ then [0, 0, 0, 0] else [1, 1, 0, 0];

state [0, 0, 1, 1]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if !BRDY~ & !BLAST~ & !RFACK then
                    [1, 1, 1, 1] else
                    if !BRDY~ & BLAST~ & !LA2 then [1, 1, 1, 0] else
                    if !BRDY~ & BLAST~ & LA2 then [0, 0, 1, 0] else
                    if RFACK then [0, 0, 1, 0] else
                    if BRDY~ & !RFACK then [0, 0, 1, 1];

state [1, 1, 1, 0]: if RESET then [1, 1, 1, 1] else
                    if !BOFF~ then [1, 1, 1, 0] else
                    if !BRDY~ & BLAST~ then [0, 0, 1, 1] else
                    if !BRDY~ & !BLAST~ then [1, 1, 1, 1] else
                    [1, 1, 1, 0];

state [0, 0, 1, 0]: if RESET # !BOFF~ then [1, 1, 1, 1] else
                    if !BRDY~ & BLAST~ then [1, 1, 1, 0] else
                    if !BRDY~ & !BLAST~ # BRDY~ then [1, 1, 1, 1];
test_vectors

([CLK,RFACK,CIP~,LA2,HIT~,BOFF~,LW_R~,RAS~,RESET,RDY~,MEMCS~,BRDY~,BLAST~]
  -> [CAS10~,C1,C2,CAS11~])

[C, X, X, X, X, X, 1, X, 1, X, X, X, X] -> [X, X, X, X];
[C, X, 1, 0, X, X, 1, X, 1, X, X, X, X] -> [1, 1, 1, 1];
[C, 0, 1, 0, X, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 1, 1, 1];
[C, 0, 1, 0, X, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 1, 1, 1];
[C, 0, 1, 0, X, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 1, 1, 1];
[C, 0, 0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 0, 1, 1];
[C, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1] -> [1, 0, 1, 1];
[C, 0, 1, 0, 0, 1, 1, 0, 0, 1, 0, 1, 1] -> [0, 0, 0, 0];
[C, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1] -> [1, 0, 0, 1];
[C, 0, 1, X, 0, 1, 1, 0, 0, 1, 0, 1, 1] -> [0, 0, 0, 0];
[C, 0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1] -> [1, 0, 0, 1];
[C, 0, 1, X, 0, 1, 1, 0, 0, 1, 0, 1, 1] -> [0, 0, 0, 0];
[C, 0, 1, 1, X, 1, 1, 0, 0, 1, 0, 1, 1] -> [1, 1, 1, 1];

end PG_MODE_DRAM_CTRL_2;

module SC_MODE_DRAM_CTRL_15 flag '-r4'


title 'STATIC COLUMN MODE DRAM CONTROLLER - PLD 15, INTEL CORPORATION'
" This PLD combines ready signals
" Implemented with the Intel 85C220 EPLD.

SC15K device 'E0320';

X = .X.;    " ABEL 'don't care' symbol
C = .C.;    " ABEL 'clocking input' symbol

" Inputs
MEMCS~  pin 1;
JRDY~   pin 2;
MRDY~   pin 3;
BRDY~   pin 4;
ALD     pin 5;
CKEN~   pin 6;
SKEN~   pin 7;
BRDYO~  pin 8;
M~      pin 9;    "miss indicator for CIP~
CIP~    pin 11;   "Cycle indicator

" Output
WEN~    pin 12;   "Write enable for write latches
RDY~    pin 13;   "to 486
MRDYCS~ pin 14;
MALD~   pin 15;   "Modified ALD for FF's
dum10   pin 16;
PBRDY~  pin 17;
KEN~    pin 18;
DRDY~   pin 19;

equations
!MALD~   = (!MEMCS~ & !ALD);
!RDY~    = (!MRDY~ & M~ & !MEMCS~) # !JRDY~;
!MRDYCS~ = (!MRDY~ & M~ & !MEMCS~);
!WEN~    = !CIP~ & M~;
!DRDY~   = !BRDY~ # !MRDYCS~;
KEN~     = SKEN~ & CKEN~;
PBRDY~   = BRDY~ & BRDYO~;
"test_vectors
"([CLK,RESET] ->
"[RESETO])

"[C, 0] -> [X];
"[C, 0] -> [0];
"[C, 0] -> [0];
"[C, 0] -> [0];
"[C, 0] -> [0];
"[C, 0] -> [0];
"[C, 0] -> [0];
"[C, 1] -> [1];
"[C, 1] -> [1];
"[C, 1] -> [1];
"[C, 1] -> [1];
"[C, 1] -> [1];
"[C, 1] -> [1];
"[C, 1] -> [1];
"[C, 1] -> [1];
"[C, 0] -> [0];
"[C, 0] -> [0];
"[C, 0] -> [0];
end SC_MODE_DRAM_CTRL_15;
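
The PLD 15 equations are purely combinational, so they can be checked in software before a device is programmed. The following Python sketch is not part of the original listing. It assumes that the leading "I" characters in the scanned equations are ABEL NOT operators (!), and the function name pld15 is hypothetical. Active-low signals (written with "~" in the listing) are passed as 0/1 pin levels with an "_n" suffix.

```python
def pld15(memcs_n, jrdy_n, mrdy_n, brdy_n, ald, cken_n, sken_n, brdyo_n, m_n, cip_n):
    """Evaluate the PLD 15 equations; every argument is a 0/1 pin level."""
    def inv(v):
        return 1 - v

    # !MRDYCS~ = !MRDY~ & M~ & !MEMCS~
    mrdycs_n = inv(inv(mrdy_n) & m_n & inv(memcs_n))
    return {
        # !MALD~ = !MEMCS~ & !ALD
        "MALD~": inv(inv(memcs_n) & inv(ald)),
        # !RDY~ = (!MRDY~ & M~ & !MEMCS~) # !JRDY~
        "RDY~": inv((inv(mrdy_n) & m_n & inv(memcs_n)) | inv(jrdy_n)),
        "MRDYCS~": mrdycs_n,
        # !WEN~ = !CIP~ & M~
        "WEN~": inv(inv(cip_n) & m_n),
        # !DRDY~ = !BRDY~ # !MRDYCS~
        "DRDY~": inv(inv(brdy_n) | inv(mrdycs_n)),
        # KEN~ = SKEN~ & CKEN~
        "KEN~": sken_n & cken_n,
        # PBRDY~ = BRDY~ & BRDYO~
        "PBRDY~": brdy_n & brdyo_n,
    }
```

With every active-low input held high and ALD low, all outputs stay high (inactive). Pulling MEMCS~ and MRDY~ low while M~ is high drives RDY~, MRDYCS~ and DRDY~ low.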
module SC_MODE_DRAM_CTRL_17 flag '-r4'

title 'STATIC COLUMN MODE DRAM CONTROLLER - PLD 17, INTEL CORPORATION'
" This PLD generates the A0 signal for bank 1
" Implemented with the Intel 85C224 EPLD.

SC17 device 'E224';

X = .X.;    " ABEL 'don't care' symbol
C = .C.;    " ABEL 'clocking input' symbol

" Inputs
CLK     pin 1;    "P4 input CLK
BRDY~   pin 2;    "Burst Ready
CIP~    pin 3;    "Cycle OK
MEMCS~  pin 4;    "memory select
LA313   pin 5;    "Latched A2
DATASEL pin 6;    "Refresh acknowledge
RAS~    pin 7;    "Row address strobe
LW_R    pin 8;    "CPU W/R latched
RESET   pin 9;    "System Reset
BLAST~  pin 10;   "CPU BLAST~ output
A3      pin 11;   "CPU Backoff input
ALD     pin 14;   "Address Latch disable
dum1    pin 15;
WE1~    pin 22;   "Write enable
dum2    pin 23;   "Address Latch disable

" Output
B10MA0  pin 21;   "Bank 1 A0
B1A     pin 20;   "Burst A3 bank0
CSO~    pin 19;   "state variable
dun     pin 18;   "state variable
dum     pin 17;   "Burst A3 bank1
B11MA0  pin 16;   "Bank 1 A0

state_diagram [B1A, CSO~]

state [1, 1]: if RESET then [1, 1] else
              if CIP~ & !ALD & !A3 then [0, 1] else
              if !CIP~ & !ALD & !A3 then [0, 1] else
              if !CIP~ & !LW_R & !MEMCS~ & WE1~ then [1, 0] else [1, 1];

state [0, 1]: if RESET then [1, 1] else
              if CIP~ & !ALD & A3 then [1, 1] else
              if !CIP~ & !ALD & A3 then [1, 1] else
              if !CIP~ & !LW_R & !MEMCS~ & WE1~ then [0, 0] else [0, 1];

state [1, 0]: if RESET # (!BRDY~ & !BLAST~) then [1, 1] else
              if !BRDY~ & DATASEL then [0, 0] else [1, 0];

state [0, 0]: if RESET # (!BRDY~ & !BLAST~) then [1, 1] else
              if !BRDY~ & DATASEL then [1, 0] else [0, 0];

equations

!B10MA0 = !WE1~ & !LA313 # WE1~ & RAS~ & !LA313 # WE1~ & !RAS~ & !B1A;

!B11MA0 = !WE1~ & !LA313 # WE1~ & RAS~ & !LA313 # WE1~ & !RAS~ & !B1A;
end SC_MODE_DRAM_CTRL_17;
module SC_MODE_DRAM_CTRL_6 flag '-r4'

title 'STATIC COLUMN MODE DRAM CONTROLLER - PLD 6, INTEL CORPORATION'
" This PLD generates A0 for bank 0
" Implemented with the Intel 85C224 EPLD.

SC6 device 'E224';

X = .X.;    " ABEL 'don't care' symbol
C = .C.;    " ABEL 'clocking input' symbol

" Inputs
CLK     pin 1;    "P4 input CLK
BRDY~   pin 2;    "Burst Ready
CIP~    pin 3;    "Cycle OK
MEMCS~  pin 4;    "memory select
LA313   pin 5;    "Latched A2
DATASEL pin 6;    "Refresh acknowledge
RAS~    pin 7;    "Row address strobe
LW_R    pin 8;    "CPU W/R latched
RESET   pin 9;    "System Reset
BLAST~  pin 10;   "CPU BLAST~ output
A3      pin 11;   "CPU Backoff input
ALD     pin 14;   "Address Latch disable
dum1    pin 15;
WE0~    pin 22;   "Write enable
dum2    pin 23;   "Address Latch disable

" Output
B00MA0  pin 21;   "Bank 0 A0
B0A     pin 20;   "Burst A3 bank 0
CSO~    pin 19;   "state variable
dun     pin 18;   "state variable
dum     pin 17;   "Burst A3 bank1
B01MA0  pin 16;   "Bank 0 A0

state_diagram [B0A, CSO~]

state [1, 1]: if RESET then [1, 1] else
              if CIP~ & !ALD & !A3 then [0, 1] else
              if !CIP~ & !ALD & !A3 then [0, 1] else
              if !CIP~ & !LW_R & !MEMCS~ & WE0~ then [1, 0] else [1, 1];

state [0, 1]: if RESET then [1, 1] else
              if CIP~ & !ALD & A3 then [1, 1] else
              if !CIP~ & !ALD & A3 then [1, 1] else
              if !CIP~ & !LW_R & !MEMCS~ & WE0~ then [0, 0] else [0, 1];

state [1, 0]: if RESET # (!BRDY~ & !BLAST~) then [1, 1] else
              if !BRDY~ & !DATASEL then [0, 0] else [1, 0];

state [0, 0]: if RESET # (!BRDY~ & !BLAST~) then [1, 1] else
              if !BRDY~ & !DATASEL then [1, 0] else [0, 0];

equations

!B00MA0 = !WE0~ & !LA313 # WE0~ & RAS~ & !LA313 # WE0~ & !RAS~ & !B0A;

!B01MA0 = !WE0~ & !LA313 # WE0~ & RAS~ & !LA313 # WE0~ & !RAS~ & !B0A;
end SC_MODE_DRAM_CTRL_6;
[Schematic residue: clock distribution listing for nets CLK1.1 through CLK1.7. The destination component and pin numbers are not legible in the source.]
386™ DX MICROPROCESSOR

HIGH PERFORMANCE 32-BIT CHMOS MICROPROCESSOR
WITH INTEGRATED MEMORY MANAGEMENT

■ Flexible 32-Bit Microprocessor
  - 8, 16, 32-Bit Data Types
  - 8 General Purpose 32-Bit Registers
■ Very Large Address Space
  - 4 Gigabyte Physical
  - 64 Terabyte Virtual
  - 4 Gigabyte Maximum Segment Size
■ Integrated Memory Management Unit
  - Virtual Memory Support
  - Optional On-Chip Paging
  - 4 Levels of Protection
  - Fully Compatible with 80286
■ Object Code Compatible with All 8086 Family Microprocessors
■ Virtual 8086 Mode Allows Running of 8086 Software in a Protected and
  Paged System
■ Hardware Debugging Support
■ Optimized for System Performance
  - Pipelined Instruction Execution
  - On-Chip Address Translation Caches
  - 20, 25 and 33 MHz Clock
  - 40, 50 and 66 Megabytes/Sec Bus Bandwidth
■ High Speed Numerics Support via 387™ DX Coprocessor
■ Complete System Development Support
  - Software: C, PL/M, Assembler
    System Generation Tools
  - Debuggers: PSCOPE, ICE™-386
■ High Speed CHMOS III and CHMOS IV Technology
■ 132 Pin Grid Array Package
  (See Packaging Specification, Order #231369)


The 386™ DX Microprocessor is an advanced 32-bit microprocessor designed for applications needing very
high performance and optimized for multitasking operating systems. The 32-bit registers and data paths
support 32-bit addresses and data types. The processor addresses up to four gigabytes of physical memory
and 64 terabytes (2**46) of virtual memory. The integrated memory management and protection architecture
includes address translation registers, advanced multitasking hardware and a protection mechanism to
support operating systems. In addition, the 386 DX allows the simultaneous running of multiple operating
systems. Instruction pipelining, on-chip address translation, and high bus bandwidth ensure short average
instruction execution times and high system throughput.

The 386 DX offers new testability and debugging features. Testability features include a self-test and direct
access to the page translation cache. Four new breakpoint registers provide breakpoint traps on code
execution or data accesses, for powerful debugging of even ROM-based systems.

Object-code compatibility with all 8086 family members (8086, 8088, 80186, 80188, 80286) means the 386 DX
offers immediate access to the world's largest microprocessor software base.
386™ DX Pipelined 32-Bit Microarchitecture

386™ DX and 387™ DX are Trademarks of Intel Corporation.
UNIX™ is a Trademark of AT&T Bell Labs.
MS-DOS is a Trademark of MICROSOFT Corporation.
1. PIN ASSIGNMENT

The 386 DX pinout as viewed from the top side of the component is shown by Figure 1-1. Its pinout as
viewed from the Pin side of the component is Figure 1-2.

VCC and GND connections must be made to multiple VCC and VSS (GND) pins. Each VCC and VSS
must be connected to the appropriate voltage level. The circuit board should include VCC and GND
planes for power distribution and all VCC and VSS pins must be connected to the appropriate plane.

NOTE:
Pins identified as "N.C." should remain completely unconnected.
Figure 1-1. 386™ DX PGA Pinout - View from Top Side
Figure 1-2. 386™ DX PGA Pinout - View from Pin Side

Table 1-1. 386™ DX PGA Pinout Functional Grouping

1.1 PIN DESCRIPTION TABLE

The following table lists a brief description of each pin on the 386 DX. The following definitions are used in
these descriptions:

#     The named signal is active LOW.
I     Input signal.
O     Output signal.
I/O   Input and Output signal.
-     No electrical connection.

For a more complete description refer to Section 5.2 Signal Description.

Symbol      Type  Name and Function
CLK2        I     CLK2 provides the fundamental timing for the 386 DX.
D31-D0      I/O   DATA BUS inputs data during memory, I/O and interrupt acknowledge
                  read cycles and outputs data during memory and I/O write cycles.
A31-A2      O     ADDRESS BUS outputs physical memory or port I/O addresses.
BE0#-BE3#   O     BYTE ENABLES indicate which data bytes of the data bus take part in
                  a bus cycle.
W/R#        O     WRITE/READ is a bus cycle definition pin that distinguishes write
                  cycles from read cycles.
D/C#        O     DATA/CONTROL is a bus cycle definition pin that distinguishes data
                  cycles, either memory or I/O, from control cycles which are: interrupt
                  acknowledge, halt, and instruction fetching.
M/IO#       O     MEMORY I/O is a bus cycle definition pin that distinguishes memory
                  cycles from input/output cycles.
LOCK#       O     BUS LOCK is a bus cycle definition pin that indicates that other
                  system bus masters are denied access to the system bus while it is
                  active.
ADS#        O     ADDRESS STATUS indicates that a valid bus cycle definition and
                  address (W/R#, D/C#, M/IO#, BE0#, BE1#, BE2#, BE3# and
                  A31-A2) are being driven at the 386 DX pins.
NA#         I     NEXT ADDRESS is used to request address pipelining.
READY#      I     BUS READY terminates the bus cycle.
BS16#       I     BUS SIZE 16 input allows direct connection of 32-bit and 16-bit data
                  buses.
HOLD        I     BUS HOLD REQUEST input allows another bus master to request
                  control of the local bus.
HLDA        O     BUS HOLD ACKNOWLEDGE output indicates that the 386 DX has
                  surrendered control of its local bus to another bus master.
BUSY#       I     BUSY signals a busy condition from a processor extension.
ERROR#      I     ERROR signals an error condition from a processor extension.
PEREQ       I     PROCESSOR EXTENSION REQUEST indicates that the processor
                  extension has data to be transferred by the 386 DX.
INTR        I     INTERRUPT REQUEST is a maskable input that signals the 386 DX to
                  suspend execution of the current program and execute an interrupt
                  acknowledge function.
NMI         I     NON-MASKABLE INTERRUPT REQUEST is a non-maskable input
                  that signals the 386 DX to suspend execution of the current program
                  and execute an interrupt acknowledge function.
RESET       I     RESET suspends any operation in progress and places the 386 DX in
                  a known reset state. See Interrupt Signals for additional information.
N/C         -     NO CONNECT should always remain unconnected. Connection of a
                  N/C pin may cause the processor to malfunction or be incompatible
                  with future steppings of the 386 DX.
VCC         I     SYSTEM POWER provides the +5V nominal D.C. supply input.
VSS         I     SYSTEM GROUND provides 0V connection from which all inputs and
                  outputs are measured.
2. BASE ARCHITECTURE

2.1 INTRODUCTION

The 386 DX consists of a central processing unit, a
memory management unit and a bus interface.

The central processing unit consists of the execution
unit and instruction unit. The execution unit contains
the eight 32-bit general purpose registers, which are
used for both address calculation and data operations,
and a 64-bit barrel shifter used to speed shift, rotate,
multiply, and divide operations. The multiply and divide
logic uses a 1-bit per cycle algorithm. The multiply
algorithm stops the iteration when the most significant
bits of the multiplier are all zero. This allows typical
32-bit multiplies to be executed in under one microsecond.
The instruction unit decodes the instruction opcodes and
stores them in the decoded instruction queue for immediate
use by the execution unit.
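
The early-out behaviour can be illustrated with a shift-and-add loop. The Python sketch below is a software analogy only, not Intel's hardware algorithm: it consumes one multiplier bit per iteration and stops as soon as the remaining most significant multiplier bits are all zero. The function name early_out_multiply and the returned iteration count are illustrative assumptions.

```python
def early_out_multiply(multiplicand, multiplier):
    """32-bit unsigned shift-and-add multiply, one multiplier bit per step.

    Returns (product, number_of_iterations).
    """
    mask32 = 0xFFFFFFFF
    a = multiplicand & mask32
    m = multiplier & mask32
    product = 0
    steps = 0
    while m != 0:                   # stop once the remaining high bits are all zero
        if m & 1:
            product += a << steps   # add the multiplicand weighted by this bit
        m >>= 1
        steps += 1
    return product, steps
```

A small multiplier such as 5 finishes in three iterations, while a multiplier with bit 31 set needs all 32.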


The memory management unit (MMU) consists of a
segmentation unit and a paging unit. Segmentation
allows the managing of the logical address space by
providing an extra addressing component, one that
allows easy code and data relocatability, and efficient
sharing. The paging mechanism operates beneath and is
transparent to the segmentation process, to allow
management of the physical address space. Each segment
is divided into one or more 4K byte pages. To implement
a virtual memory system, the 386 DX supports full
restartability for all page and segment faults.

Memory is organized into one or more variable length
segments, each up to four gigabytes in size. A given
region of the linear address space, a segment, can have
attributes associated with it. These attributes include
its location, size, type (i.e. stack, code or data), and
protection characteristics. Each task on a 386 DX can
have a maximum of 16,381 segments of up to four
gigabytes each, thus providing 64 terabytes (trillion
bytes) of virtual memory to each task.
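
The 64 terabyte figure follows from multiplying the segment count by the maximum segment size; the product is just under 2**46 bytes. A quick check in Python (the constant names are illustrative):

```python
SEGMENTS_PER_TASK = 16_381       # figure quoted in the text
MAX_SEGMENT_BYTES = 2 ** 32      # four gigabytes per segment

virtual_bytes = SEGMENTS_PER_TASK * MAX_SEGMENT_BYTES
virtual_terabytes = virtual_bytes / 2 ** 40   # binary terabytes
```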


The segmentation unit provides four levels of protection
for isolating and protecting applications and the
operating system from each other. The hardware-enforced
protection allows the design of systems with a high
degree of integrity.

The 386 DX has two modes of operation: Real Address Mode
(Real Mode), and Protected Virtual Address Mode
(Protected Mode). In Real Mode the 386 DX operates as a
very fast 8086, but with 32-bit extensions if desired.
Real Mode is required primarily to set up the processor
for Protected Mode operation. Protected Mode provides
access to the sophisticated memory management, paging
and privilege capabilities of the processor.

Within Protected Mode, software can perform a task
switch to enter into tasks designated as Virtual 8086
Mode tasks. Each such task behaves with 8086 semantics,
thus allowing 8086 software (an application program, or
an entire operating system) to execute. The Virtual 8086
tasks can be isolated and protected from one another and
the host 386 DX operating system, by the use of paging,
and the I/O Permission Bitmap.

Finally, to facilitate high performance system hardware
designs, the 386 DX bus interface offers address
pipelining, dynamic data bus sizing, and direct Byte
Enable signals for each byte of the data bus. These
hardware features are described fully beginning in
Section 5.


2.2 REGISTER OVERVIEW

The 386 DX has 32 register resources in the following
categories:

• General Purpose Registers
• Segment Registers
• Instruction Pointer and Flags
• Control Registers
• System Address Registers
• Debug Registers
• Test Registers.

The registers are a superset of the 8086, 80186 and
80286 registers, so all 16-bit 8086, 80186 and 80286
registers are contained within the 32-bit 386 DX.

Figure 2-1 shows all of 386 DX base architecture
registers, which include the general address and data
registers, the instruction pointer, and the flags
register. The contents of these registers are
task-specific, so these registers are automatically
loaded with a new context upon a task switch operation.

The base architecture also includes six directly
accessible segments, each up to 4 Gbytes in size. The
segments are indicated by the selector values placed in
386 DX segment registers of Figure 2-1. Various selector
values can be loaded as a program executes, if desired.
[Figure panels: GENERAL DATA AND ADDRESS REGISTERS; INSTRUCTION POINTER AND FLAGS REGISTER]

Figure 2-1. 386™ DX Base Architecture Registers

The selectors are also task-specific, so the segment
registers are automatically loaded with new context
upon a task switch operation.

The other types of registers, Control, System Address,
Debug, and Test, are primarily used by system software.


2.3 REGISTER DESCRIPTIONS

2.3.1 General Purpose Registers

General Purpose Registers: The eight general purpose
registers of 32 bits hold data or address quantities.
The general registers, Figure 2-2, support data operands
of 1, 8, 16, 32 and 64 bits, and bit fields of 1 to 32
bits. They support address operands of 16 and 32 bits.
The 32-bit registers are named EAX, EBX, ECX, EDX, ESI,
EDI, EBP, and ESP.

The least significant 16 bits of the registers can be
accessed separately. This is done by using the 16-bit
names of the registers AX, BX, CX, DX, SI, DI, BP, and
SP. When accessed as a 16-bit operand, the upper 16 bits
of the register are neither used nor changed.

Finally, 8-bit operations can individually access the
lowest byte (bits 0-7) and the higher byte (bits 8-15)
of general purpose registers AX, BX, CX and DX. The
lowest bytes are named AL, BL, CL and DL, respectively.
The higher bytes are named AH, BH, CH and DH,
respectively. The individual byte accessibility offers
additional flexibility for data operations, but is not
used for effective address calculation.
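
The overlapping 32-, 16- and 8-bit views can be modelled with simple masking. The sketch below is a toy Python model, not processor documentation: the class name Reg32 and its attribute names (e, x, l, h) are hypothetical, and the byte views apply only to the four registers that have them (EAX, EBX, ECX, EDX).

```python
class Reg32:
    """Toy model of one general register such as EAX."""

    def __init__(self, value=0):
        self.e = value & 0xFFFFFFFF      # full 32-bit value, e.g. EAX

    @property
    def x(self):                          # 16-bit view, e.g. AX
        return self.e & 0xFFFF

    @x.setter
    def x(self, v):                       # upper 16 bits are neither used nor changed
        self.e = (self.e & 0xFFFF0000) | (v & 0xFFFF)

    @property
    def l(self):                          # lowest byte, bits 0-7, e.g. AL
        return self.e & 0xFF

    @l.setter
    def l(self, v):
        self.e = (self.e & 0xFFFFFF00) | (v & 0xFF)

    @property
    def h(self):                          # higher byte, bits 8-15, e.g. AH
        return (self.e >> 8) & 0xFF

    @h.setter
    def h(self, v):
        self.e = (self.e & 0xFFFF00FF) | ((v & 0xFF) << 8)
```

Writing through the 16-bit or 8-bit view changes only the corresponding bits of the 32-bit value.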


Figure 2-2. General Registers and Instruction Pointer

2.3.2 Instruction Pointer

The instruction pointer, Figure 2-2, is a 32-bit
register named EIP. EIP holds the offset of the next
instruction to be executed. The offset is always
relative to the base of the code segment (CS). The lower
16 bits (bits 0-15) of EIP contain the 16-bit
instruction pointer named IP, which is used by 16-bit
addressing.

2.3.3 Flags Register

The Flags Register is a 32-bit register named EFLAGS.
The defined bits and bit fields within EFLAGS, shown in
Figure 2-3, control certain operations and indicate
status of the 386 DX. The lower 16 bits (bits 0-15) of
EFLAGS contain the 16-bit flag register named FLAGS,
which is most useful when executing 8086 and 80286 code.
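
As an illustration of the register layout, the following Python sketch decodes an EFLAGS value using the bit positions given in the flag descriptions of this section. The names EFLAGS_BITS, decode_eflags and flags16 are hypothetical helpers introduced for this example.

```python
# Bit positions as listed in the flag descriptions of Section 2.3.3.
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8,
    "IF": 9, "DF": 10, "OF": 11, "NT": 14, "RF": 16, "VM": 17,
}


def decode_eflags(value):
    """Return (names of the flags that are set, two-bit IOPL field)."""
    names = [name for name, bit in EFLAGS_BITS.items() if (value >> bit) & 1]
    iopl = (value >> 12) & 0b11          # IOPL occupies bits 12-13
    return names, iopl


def flags16(value):
    """Lower 16 bits of EFLAGS, i.e. the 16-bit FLAGS register."""
    return value & 0xFFFF
```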
[Figure labels, EFLAGS upper bits: VIRTUAL MODE, RESUME FLAG, NESTED TASK FLAG, I/O PRIVILEGE LEVEL, OVERFLOW, DIRECTION FLAG, INTERRUPT ENABLE. FLAGS low bits: TRAP FLAG, SIGN FLAG, ZERO FLAG, AUXILIARY CARRY, PARITY FLAG, CARRY FLAG.]

NOTE: Bits marked as reserved in the figure are Intel reserved: do not define; see Section 2.3.10.

Figure 2-3. Flags Register


VM   (Virtual 8086 Mode, bit 17)

     The VM bit provides Virtual 8086 Mode within
     Protected Mode. If set while the 386 DX is in
     Protected Mode, the 386 DX will switch to Virtual
     8086 operation, handling segment loads as the 8086
     does, but generating exception 13 faults on
     privileged opcodes. The VM bit can be set only in
     Protected Mode, by the IRET instruction (if current
     privilege level = 0) and by task switches at any
     privilege level. The VM bit is unaffected by POPF.
     PUSHF always pushes a 0 in this bit, even if
     executing in Virtual 8086 Mode. The EFLAGS image
     pushed during interrupt processing or saved during
     task switches will contain a 1 in this bit if the
     interrupted code was executing as a Virtual 8086
     Task.

RF   (Resume Flag, bit 16)

     The RF flag is used in conjunction with the debug
     register breakpoints. It is checked at instruction
     boundaries before breakpoint processing. When RF is
     set, it causes any debug fault to be ignored on the
     next instruction. RF is then automatically reset at
     the successful completion of every instruction (no
     faults are signalled) except the IRET instruction,
     the POPF instruction, (and JMP, CALL, and INT
     instructions causing a task switch). These
     instructions set RF to the value specified by the
     memory image. For example, at the end of the
     breakpoint service routine, the IRET instruction
     can pop an EFLAG image having the RF bit set and
     resume the program's execution at the breakpoint
     address without generating another breakpoint fault
     on the same location.


NT   (Nested Task, bit 14)

     This flag applies to Protected Mode. NT is set to
     indicate that the execution of this task is nested
     within another task. If set, it indicates that the
     current nested task's Task State Segment (TSS) has
     a valid back link to the previous task's TSS. This
     bit is set or reset by control transfers to other
     tasks. The value of NT in EFLAGS is tested by the
     IRET instruction to determine whether to do an
     inter-task return or an intra-task return. A POPF
     or an IRET instruction will affect the setting of
     this bit according to the image popped, at any
     privilege level.

IOPL (Input/Output Privilege Level, bits 12-13)

     This two-bit field applies to Protected Mode. IOPL
     indicates the numerically maximum CPL (current
     privilege level) value permitted to execute I/O
     instructions without generating an exception 13
     fault or consulting the I/O Permission Bitmap. It
     also indicates the maximum CPL value allowing
     alteration of the IF (INTR Enable Flag) bit when
     new values are popped into the EFLAG register. POPF
     and IRET instructions can alter the IOPL field when
     executed at CPL = 0. Task switches can always alter
     the IOPL field, when the new flag image is loaded
     from the incoming task's TSS.
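
The privilege comparisons described for IOPL reduce to two numeric tests. The Python sketch below restates them; the function names are hypothetical, and what the processor does when a check fails (an exception 13 fault or an I/O Permission Bitmap lookup) is deliberately not modelled.

```python
def io_allowed_without_bitmap(cpl, iopl):
    """True if CPL is numerically no greater than IOPL, so an I/O
    instruction executes without consulting the I/O Permission Bitmap."""
    return cpl <= iopl


def popped_flags_may_change(cpl, iopl):
    """Which fields a flag-popping instruction may alter, per the text."""
    return {
        "IF": cpl <= iopl,     # IF is alterable only when CPL <= IOPL
        "IOPL": cpl == 0,      # IOPL is alterable only at CPL = 0
    }
```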


OF   (Overflow Flag, bit 11)

     OF is set if the operation resulted in a signed
     overflow. Signed overflow occurs when the operation
     resulted in carry/borrow into the sign bit
     (high-order bit) of the result but did not result
     in a carry/borrow out of the high-order bit, or
     vice-versa. For 8/16/32 bit operations, OF is set
     according to overflow at bit 7/15/31, respectively.

DF   (Direction Flag, bit 10)

     DF defines whether ESI and/or EDI registers
     postdecrement or postincrement during the string
     instructions. Postincrement occurs if DF is reset.
     Postdecrement occurs if DF is set.

IF   (INTR Enable Flag, bit 9)

     The IF flag, when set, allows recognition of
     external interrupts signalled on the INTR pin. When
     IF is reset, external interrupts signalled on the
     INTR pin are not recognized. IOPL indicates the
     maximum CPL value allowing alteration of the IF bit
     when new values are popped into EFLAGS or FLAGS.

TF   (Trap Enable Flag, bit 8)

     TF controls the generation of exception 1 trap when
     single-stepping through code. When TF is set, the
     386 DX generates an exception 1 trap after the next
     instruction is executed. When TF is reset,
     exception 1 traps occur only as a function of the
     breakpoint addresses loaded into debug registers
     DR0-DR3.

SF   (Sign Flag, bit 7)

     SF is set if the high-order bit of the result is
     set; it is reset otherwise. For 8-, 16-, 32-bit
     operations, SF reflects the state of bit 7, 15, 31
     respectively.


[Figure panels: SEGMENT REGISTERS; descriptor register fields Physical Base Address and Segment Limit]

Figure 2-4. 386™ DX Segment Registers, and Associated Descriptor Registers
ZF   (Zero Flag, bit 6)

     ZF is set if all bits of the result are 0.
     Otherwise it is reset.

AF   (Auxiliary Carry Flag, bit 4)

     The Auxiliary Flag is used to simplify the addition
     and subtraction of packed BCD quantities. AF is set
     if the operation resulted in a carry out of bit 3
     (addition) or a borrow into bit 3 (subtraction).
     Otherwise AF is reset. AF is affected by carry out
     of, or borrow into bit 3 only, regardless of
     overall operand length: 8, 16 or 32 bits.

PF   (Parity Flag, bit 2)

     PF is set if the low-order eight bits of the
     operation contain an even number of 1's (even
     parity). PF is reset if the low-order eight bits
     have odd parity. PF is a function of only the
     low-order eight bits, regardless of operand size.

CF   (Carry Flag, bit 0)

     CF is set if the operation resulted in a carry out
     of (addition), or a borrow into (subtraction) the
     high-order bit. Otherwise CF is reset. For 8-, 16-
     or 32-bit operations, CF is set according to
     carry/borrow at bit 7, 15 or 31, respectively.

Note in these descriptions, "set" means "set to 1,"
and "reset" means "reset to 0."
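
The status flag definitions above can be exercised with a small model of an addition. The Python sketch below is illustrative only: the function name add_flags is hypothetical, only addition is covered, and OF is computed with the sign-based test (both operands share a sign that differs from the result's), which is assumed here as an equivalent of the carry-in/carry-out wording in the text.

```python
def add_flags(a, b, bits=32):
    """Add two unsigned operands of 8, 16 or 32 bits; return (result, flags)."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)                 # high-order bit: 7, 15 or 31
    a &= mask
    b &= mask
    full = a + b
    result = full & mask
    flags = {
        "CF": int(full > mask),                              # carry out of the high-order bit
        "PF": int(bin(result & 0xFF).count("1") % 2 == 0),   # even parity of the low 8 bits
        "AF": int((a & 0xF) + (b & 0xF) > 0xF),              # carry out of bit 3
        "ZF": int(result == 0),
        "SF": int(bool(result & sign)),
        "OF": int(bool(~(a ^ b) & (a ^ result) & sign)),     # signed overflow
    }
    return result, flags
```

For example, adding 01H to 7FH as 8-bit operands sets OF, SF and AF but not CF, while adding 01H to FFH sets CF, ZF, AF and PF but not OF.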


2.3.4 Segment Registers

Six 16-bit segment registers hold segment selector
values identifying the currently addressable memory
segments. Segment registers are shown in Figure 2-4. In
Protected Mode, each segment may range in size from one
byte up to the entire linear and physical space of the
machine, 4 Gbytes (2**32 bytes). If a maximum sized
segment is used (limit = FFFFFFFFH) it should be Dword
aligned (i.e., the least two significant bits of the
segment base should be zero). This will avoid a segment
limit violation (exception 13) caused by the wrap
around. In Real Address Mode, the maximum segment size
is fixed at 64 Kbytes (2**16 bytes).

The six segments addressable at any given moment are
defined by the segment registers CS, SS, DS, ES, FS and
GS. The selector in CS indicates the current code
segment; the selector in SS indicates the current stack
segment; the selectors in DS, ES, FS and GS indicate the
current data segments.


2.3.5 Segment Descriptor Registers

The segment descriptor registers are not programmer
visible, yet it is very useful to understand their
content. Inside the 386 DX, a descriptor register
(programmer invisible) is associated with each
programmer-visible segment register, as shown by Figure
2-4. Each descriptor register holds a 32-bit segment
base address, a 32-bit segment limit, and the other
necessary segment attributes.

When a selector value is loaded into a segment register,
the associated descriptor register is automatically
updated with the correct information. In Real Address
Mode, only the base address is updated directly (by
shifting the selector value four bits to the left),
since the segment maximum limit and attributes are fixed
in Real Mode. In Protected Mode, the base address, the
limit, and the attributes are all updated per the
contents of the segment descriptor indexed by the
selector.


Whenever a memory reference occurs, the segment
descriptor register associated with the segment being
used is automatically involved with the memory
reference. The 32-bit segment base address becomes a
component of the linear address calculation, the 32-bit
limit is used for the limit-check operation, and the
attributes are checked against the type of memory
reference requested.
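The descriptor register behavior described above can be sketched in a few lines of Python. This is only an illustrative model, not Intel's implementation: the names `DescriptorRegister`, `load_real_mode` and `linear_address` are hypothetical, attribute checking is omitted, and the limit is treated as the highest valid offset.

```python
from dataclasses import dataclass


@dataclass
class DescriptorRegister:
    """Hypothetical model of a programmer-invisible descriptor register."""
    base: int   # 32-bit segment base address
    limit: int  # segment limit, modeled here as the highest valid offset


def load_real_mode(selector: int) -> DescriptorRegister:
    """Real Address Mode: base = selector shifted left four bits; limit fixed at 64 Kbytes."""
    return DescriptorRegister(base=(selector & 0xFFFF) << 4, limit=0xFFFF)


def linear_address(desc: DescriptorRegister, offset: int, size: int = 1) -> int:
    """Limit-check the access, then add the segment base to form the linear address."""
    if offset + size - 1 > desc.limit:
        raise ValueError("segment limit violation (exception 13)")
    return (desc.base + offset) & 0xFFFFFFFF
```

For example, loading selector 1234H in Real Address Mode gives a base of 12340H, so offset 10H maps to linear address 12350H.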


2.3.6 Control Registers 


The 386 DX has three control registers of 32 bits,
CR0, CR2 and CR3, to hold machine state of a global
nature (not specific to an individual task). These
registers, along with System Address Registers
described in the next section, hold machine state that
affects all tasks in the system. To access the Control
Registers, load and store instructions are defined.


CR0: Machine Control Register (includes 80286
Machine Status Word)


CR0, shown in Figure 2-5, contains 6 defined bits for
control and status purposes. The low-order 16 bits
of CR0 are also known as the Machine Status Word,
MSW, for compatibility with 80286 Protected Mode.
LMSW and SMSW instructions are taken as special
aliases of the load and store CR0 operations, where
only the low-order 16 bits of CR0 are involved. For
compatibility with 80286 operating systems the 386
DX LMSW instruction works in an identical fashion
to the LMSW instruction on the 80286 (i.e., it only
operates on the low-order 16 bits of CR0 and it
ignores the new bits in CR0). New 386 DX operating
systems should use the MOV CR0, Reg instruction.


The defined CR0 bits are described below.

PG (Paging Enable, bit 31)

The PG bit is set to enable the on-chip paging
unit. It is reset to disable the on-chip paging
unit.

R (reserved, bit 4)

This bit is reserved by Intel. When loading CR0
care should be taken to not alter the value of
this bit.


[Figure 2-5. Control Register 0. The low-order 16 bits form the MSW. Fields marked Intel reserved: do not define; see Section 2.3.10.]


TS (Task Switched, bit 3)

TS is automatically set whenever a task switch
operation is performed. If TS is set, a coprocessor
ESCape opcode will cause a Coprocessor Not
Available trap (exception 7). The trap handler
typically saves the 387 DX coprocessor context
belonging to a previous task, loads the 387 DX
coprocessor state belonging to the current task, and
clears the TS bit before returning to the faulting
coprocessor opcode.


EM (Emulate Coprocessor, bit 2)

The EMulate coprocessor bit is set to cause all
coprocessor opcodes to generate a Coprocessor
Not Available fault (exception 7). It is reset
to allow coprocessor opcodes to be executed
on an actual 387 DX coprocessor (this is the
default case after reset). Note that the WAIT
opcode is not affected by the EM bit setting.


MP (Monitor Coprocessor, bit 1)

The MP bit is used in conjunction with the TS
bit to determine if the WAIT opcode will generate
a Coprocessor Not Available fault (exception 7)
when TS = 1. When both MP = 1 and TS = 1, the
WAIT opcode generates a trap. Otherwise, the
WAIT opcode does not generate a trap. Note that
TS is automatically set whenever a task switch
operation is performed.

PE (Protection Enable, bit 0)

The PE bit is set to enable the Protected Mode.
If PE is reset, the processor operates again in
Real Mode. PE may be set by loading MSW or
CR0. PE can be reset only by a load into CR0.
Resetting the PE bit is typically part of a longer
instruction sequence needed for proper transition
from Protected Mode to Real Mode. Note that for
strict 80286 compatibility, PE cannot be reset by
the LMSW instruction.
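As a rough illustration of the CR0 bit descriptions above, the following Python sketch encodes the bit positions and the trap conditions for ESCape and WAIT opcodes. The function names are hypothetical, and `lmsw` is a simplification that touches only the four defined MSW bits (PE, MP, EM, TS), so the reserved bit 4 is never altered.

```python
# Bit positions as given in the CR0 descriptions above.
CR0_PE = 1 << 0    # Protection Enable
CR0_MP = 1 << 1    # Monitor Coprocessor
CR0_EM = 1 << 2    # Emulate Coprocessor
CR0_TS = 1 << 3    # Task Switched
CR0_PG = 1 << 31   # Paging Enable


def esc_traps(cr0: int) -> bool:
    """A coprocessor ESCape opcode raises exception 7 if EM or TS is set."""
    return bool(cr0 & (CR0_EM | CR0_TS))


def wait_traps(cr0: int) -> bool:
    """WAIT raises exception 7 only when both MP = 1 and TS = 1."""
    return bool(cr0 & CR0_MP) and bool(cr0 & CR0_TS)


def lmsw(cr0: int, msw: int) -> int:
    """Simplified LMSW: update the defined MSW bits; PE can be set but not reset."""
    defined = CR0_PE | CR0_MP | CR0_EM | CR0_TS
    new_cr0 = (cr0 & ~defined) | (msw & defined)
    if cr0 & CR0_PE:
        new_cr0 |= CR0_PE  # strict 80286 compatibility: LMSW cannot clear PE
    return new_cr0
```

In this model, clearing PE would require a full CR0 load rather than `lmsw`, which matches the note above on 80286 compatibility.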


CR1: reserved

CR1 is reserved for use in future Intel processors.

CR2: Page Fault Linear Address


CR2, shown in Figure 2-6, holds the 32-bit linear
address that caused the last page fault detected. The
error code pushed onto the page fault handler’s
stack when it is invoked provides additional status
information on this page fault.


CR3: Page Directory Base Address


CR3, shown in Figure 2-6, contains the physical
base address of the page directory table. The 386
DX page directory table is always page-aligned
(4 Kbyte-aligned). Therefore the lowest twelve bits
of CR3 are ignored when written and they store as
undefined.


A task switch through a TSS which changes the 
value in CR3, or an explicit load into CR3 with any 
value, will invalidate all cached page table entries in 
the paging unit cache. Note that if the value in CR3 
does not change during the task switch, the cached 
page table entries are not flushed. 
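The CR3 rules above can be modeled with a small Python sketch. The class name `PagingUnitModel` and its dictionary-based cache are assumptions made for illustration only; comparing CR3 values after masking the low twelve bits is also an assumption, since those bits are undefined.

```python
PAGE_DIRECTORY_MASK = 0xFFFFF000  # low twelve bits of CR3 are ignored


class PagingUnitModel:
    """Hypothetical model of CR3 and the cached page table entries."""

    def __init__(self, cr3: int = 0):
        self.cr3 = cr3 & PAGE_DIRECTORY_MASK
        self.cached_entries = {}  # linear page number -> physical page number

    def load_cr3(self, value: int) -> None:
        """An explicit load into CR3 with any value flushes the cache."""
        self.cr3 = value & PAGE_DIRECTORY_MASK
        self.cached_entries.clear()

    def task_switch(self, new_cr3: int) -> None:
        """A task switch flushes the cache only if the CR3 value changes."""
        new_base = new_cr3 & PAGE_DIRECTORY_MASK
        if new_base != self.cr3:
            self.cr3 = new_base
            self.cached_entries.clear()
```

The two methods differ only in the unchanged-value case: `load_cr3` always flushes, while `task_switch` keeps the cached entries when the page directory base stays the same.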


2.3.7 System Address Registers 


Four special registers are defined to reference the
tables or segments supported by the 80286 CPU
and 386 DX protection model. These tables or
segments are:


GDT (Global Descriptor Table),

IDT (Interrupt Descriptor Table),

LDT (Local Descriptor Table),

TSS (Task State Segment).


The addresses of these tables and segments are
stored in special registers, the System Address and
System Segment Registers illustrated in Figure 2-7.
These registers are named GDTR, IDTR, LDTR and
TR, respectively. Section 4 Protected Mode
Architecture describes the use of these registers.


GDTR and IDTR 


These registers hold the 32-bit linear base address 
and 16-bit limit of the GDT and IDT, respectively. 


The GDT and IDT segments, since they are global to
all tasks in the system, are defined by 32-bit linear
addresses (subject to page translation if paging is
enabled) and 16-bit limit values.
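A minimal sketch of the GDTR and IDTR contents, assuming the layout indicated in Figure 2-7 (32-bit linear base address in bits 47-16, 16-bit limit in bits 15-0). The helper names `pack_dtr` and `unpack_dtr` are hypothetical and simply pack the two fields into one 48-bit integer.

```python
def pack_dtr(base: int, limit: int) -> int:
    """Combine a 32-bit linear base and a 16-bit limit into a 48-bit value."""
    return ((base & 0xFFFFFFFF) << 16) | (limit & 0xFFFF)


def unpack_dtr(value: int) -> tuple:
    """Split a 48-bit value back into (base, limit)."""
    return (value >> 16) & 0xFFFFFFFF, value & 0xFFFF
```

A round trip such as `unpack_dtr(pack_dtr(0x00100000, 0x07FF))` returns the original base and limit.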


[Figure 2-6. Control Registers 2 and 3. CR2: Page Fault Linear Address Register; CR3: Page Directory Base Register. Fields marked Intel reserved: do not define; see Section 2.3.10.]


[Figure 2-7. System Address and System Segment Registers. System Address Registers GDTR and IDTR: 32-bit linear base address in bits 47-16, limit in bits 15-0. System Segment Registers TR and LDTR: 16-bit selector in bits 15-0, each with an automatically loaded descriptor register holding a 32-bit linear base address, a 32-bit segment limit, and attributes.]


LDTR and TR 


These registers hold the 16-bit selector for the LDT 
descriptor and the TSS descriptor, respectively. 


The LDT and TSS segments, since they are task- 
specific segments, are defined by selector values 
stored in the system segment registers. Note that a 
segment descriptor register (programmer-invisible) 
is associated with each system segment register. 


2.3.8 Debug and Test Registers


Debug Registers: The six programmer accessible
debug registers provide on-chip support for
debugging. Debug Registers DR0-3 specify the four linear
breakpoints. The Debug Control Register DR7 is
used to set the breakpoints and the Debug Status
Register DR6 displays the current state of the
breakpoints. The use of the debug registers is
described in section 2.12 Debugging support.


[Figure 2-8. Debug and Test Registers. Debug registers (bits 31-0) include Intel reserved registers (do not define), Breakpoint Status, and Breakpoint Control. Test registers for the page cache (bits 31-0): Test Control (TR6) and Test Status (TR7).]


Test Registers: Two registers are used to control 
the testing of the RAM/CAM (Content Addressable 
Memories) in the Translation Lookaside Buffer por- 
tion of the 386 DX. TR6 is the command test regis- 
ter, and TR7 is the data register which contains the 
data of the Translation Lookaside buffer test. Their 
use is discussed in section 2.11 Testability. 


Figure 2-8 shows the Debug and Test registers. 


2.3.9 Register Accessibility 


There are a few differences regarding the
accessibility of the registers in Real and Protected Mode.
Table 2-1 summarizes these differences. See Section
4 Protected Mode Architecture for further details.


2.3.10 Compatibility 


VERY IMPORTANT NOTE: 
COMPATIBILITY WITH FUTURE PROCESSORS 


In the preceding register descriptions, note
certain 386 DX register bits are Intel reserved.
When reserved bits are called out, treat them as
fully undefined. This is essential for your
software compatibility with future processors!
Follow the guidelines below:


1) Do not depend on the states of any unde- 
fined bits when testing the values of defined 
register bits. Mask them out when testing. 


2) Do not depend on the states of any unde- 
fined bits when storing them to memory or 
another register. 


3) Do not depend on the ability to retain infor- 
mation written into any undefined bits. 


4) When loading registers always load the
undefined bits as zeros.


[Table 2-1. Register Usage. Columns: Register; Use in Real Mode; Use in Protected Mode; Use in Virtual 8086 Mode.]

NOTES:
PL = 0: The registers can be accessed only when the current privilege level is zero.
*IOPL: The PUSHF and POPF instructions are made I/O Privilege Level sensitive in Virtual 8086 Mode.


5) However, registers which have been previ- 
ously stored may be reloaded without mask- 
ing. 


Depending upon the values of undefined register
bits will make your software dependent upon
the unspecified 386 DX handling of these bits.
Depending on undefined values risks making
your software incompatible with future processors
that define usages for the 386 DX-undefined
bits. AVOID ANY SOFTWARE DEPENDENCE
UPON THE STATE OF UNDEFINED 386
DX REGISTER BITS.
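The masking guidelines above might look like the following in practice. This is a sketch only: the register layout in `EXAMPLE_DEFINED_MASK` is a hypothetical example (defined bits 0-3 and 31), and the helper names are not part of any Intel interface.

```python
# Hypothetical 32-bit register whose only defined bits are 0-3 and 31.
EXAMPLE_DEFINED_MASK = 0x8000000F


def is_set(value: int, bit_mask: int) -> bool:
    """Guideline 1: test a defined bit with everything else masked out."""
    return (value & bit_mask) != 0


def value_to_load(fields: int, defined_mask: int) -> int:
    """Guideline 4: when building a new value, load undefined bits as zeros."""
    return fields & defined_mask
```

Under guideline 5, a value that was previously stored from the register can be written back unchanged; `value_to_load` is meant only for freshly constructed values.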


2.4 INSTRUCTION SET 


2.4.1 Instruction Set Overview 


The instruction set is divided into nine categories of
operations:

Data Transfer
Arithmetic
Shift/Rotate
String Manipulation
Bit Manipulation
Control Transfer
High Level Language Support
Operating System Support
Processor Control


These 386 DX instructions are listed in Table 2-2.


All 386 DX instructions operate on either 0, 1, 2, or 3
operands, where an operand resides in a register, in
the instruction itself, or in memory. Most zero operand
instructions (e.g. CLI, STI) take only one byte.
One operand instructions generally are two bytes
long. The average instruction is 3.2 bytes long.
Since the 386 DX has a 16-byte instruction queue,
an average of 5 instructions will be prefetched. The
use of two operands permits the following types of
common instructions:

Register to Register
Memory to Register
Immediate to Register
Register to Memory
Immediate to Memory


The operands can be either 8, 16, or 32 bits long. As
a general rule, when executing code written for the
386 DX (32-bit code), operands are 8 or 32 bits;
when executing existing 80286 or 8086 code (16-bit
code), operands are 8 or 16 bits. Prefixes can be
added to all instructions which override the default
length of the operands (i.e. use 32-bit operands for
16-bit code, or 16-bit operands for 32-bit code).


For a more elaborate description of the instruction
set, refer to the “386 DX Programmer’s Reference
Manual.”


2.4.2 386™ DX Instructions


Table 2-2a. Data Transfer

GENERAL PURPOSE
MOV        Move operand
PUSH       Push operand onto stack
POP        Pop operand off stack
PUSHA      Push all registers on stack
POPA       Pop all registers off stack
XCHG       Exchange Operand, Register
XLAT       Translate

CONVERSION
MOVZX      Move byte or Word, Dword, with zero extension
MOVSX      Move byte or Word, Dword, sign extended
CBW        Convert byte to Word
CWD        Convert Word to DWORD
CWDE       Convert Word to DWORD extended
CDQ        Convert DWORD to QWORD

INPUT/OUTPUT
IN         Input operand from I/O space
OUT        Output operand to I/O space

ADDRESS OBJECT
LEA        Load effective address
LDS        Load pointer into D segment register
LES        Load pointer into E segment register
LFS        Load pointer into F segment register
LGS        Load pointer into G segment register
LSS        Load pointer into S (Stack) segment register

FLAG MANIPULATION
LAHF       Load A register from Flags
SAHF       Store A register in Flags
PUSHF      Push flags onto stack
POPF       Pop flags off stack
PUSHFD     Push EFlags onto stack
POPFD      Pop EFlags off stack
CLC        Clear Carry Flag
CLD        Clear Direction Flag
CMC        Complement Carry Flag
STC        Set Carry Flag
STD        Set Direction Flag


Table 2-2b. Arithmetic Instructions

ADDITION
ADD        Add operands
DAA        Decimal adjust for addition

SUBTRACTION
SUB        Subtract operands
SBB        Subtract with borrow
DEC        Decrement operand by 1
NEG        Negate operand
CMP        Compare operands
DAS        Decimal adjust for subtraction
AAS        ASCII Adjust for subtraction

MULTIPLICATION
MUL        Multiply Double/Single Precision

DIVISION
AAD        ASCII adjust before division


Table 2-2c. String Instructions

MOVS         Move byte or Word, Dword string
INS          Input string from I/O space
OUTS         Output string to I/O space
CMPS         Compare byte or Word, Dword string
SCAS         Scan Byte or Word, Dword string
LODS         Load byte or Word, Dword string
STOS         Store byte or Word, Dword string
REPE/REPZ    Repeat while equal/zero
REPNE/REPNZ  Repeat while not equal/not zero


Table 2-2d. Logical Instructions

LOGICALS
NOT        “NOT” operands
AND        “AND” operands
OR         “Inclusive OR” operands
XOR        “Exclusive OR” operands
TEST       “Test” operands

SHIFTS
SHL/SHR    Shift logical left or right
SAL/SAR    Shift arithmetic left or right
SHLD/SHRD  Double shift left or right

ROTATES
ROL/ROR    Rotate left/right
RCL/RCR    Rotate through carry left/right


Table 2-2e. Bit Manipulation Instructions
Table 2-2f. Program Control Instructions

CONDITIONAL TRANSFERS
SETCC          Set byte equal to condition code
JA/JNBE        Jump if above/not below nor equal
JAE/JNB        Jump if above or equal/not below
JB/JNAE        Jump if below/not above nor equal
JBE/JNA        Jump if below or equal/not above
JC             Jump if carry
JE/JZ          Jump if equal/zero
JG/JNLE        Jump if greater/not less nor equal
JGE/JNL        Jump if greater or equal/not less
JL/JNGE        Jump if less/not greater nor equal
JLE/JNG        Jump if less or equal/not greater
JNE/JNZ        Jump if not equal/not zero
JNO            Jump if not overflow
JNP/JPO        Jump if not parity/parity odd
JNS            Jump if not sign
JO             Jump if overflow
JP/JPE         Jump if parity/parity even
JS             Jump if Sign

UNCONDITIONAL TRANSFERS
CALL           Call procedure/task
RET            Return from procedure
JMP            Jump

ITERATION CONTROLS
LOOPE/LOOPZ    Loop if equal/zero
LOOPNE/LOOPNZ  Loop if not equal/not zero
JCXZ           JUMP if register CX = 0

INTERRUPTS
INT            Interrupt
Table 2-2g. High Level Language Instructions

BOUND      Check Array Bounds
ENTER      Setup Parameter Block for Entering Procedure
LEAVE      Leave Procedure


Table 2-2h. Protection Model

STR        Store Task Register
LAR        Load Access Rights
LSL        Load Segment Limit
VERR/VERW  Verify Segment for Reading or Writing
LMSW       Load Machine Status Word (lower 16 bits of CR0)


Table 2-2i. Processor Control Instructions

HLT        Halt
WAIT       Wait until BUSY# negated
ESC        Escape
LOCK       Lock Bus


2.5 ADDRESSING MODES


2.5.1 Addressing Modes Overview 


The 386 DX provides a total of 11 addressing modes 
for instructions to specify operands. The addressing 
modes are optimized to allow the efficient execution 
of high level languages such as C and FORTRAN, 
and they cover the vast majority of data references 
needed by high-level languages. 


2.5.2 Register and Immediate Modes 


Two of the addressing modes provide for instruc- 
tions that operate on register or immediate oper- 
ands: 


Register Operand Mode: The operand is located 
in one of the 8-, 16- or 32-bit general registers. 


Immediate Operand Mode: The operand is in- 
cluded in the instruction as part of the opcode. 


2.5.3 32-Bit Memory Addressing Modes


The remaining 9 modes provide a mechanism for 
specifying the effective address of an operand. The 
linear address consists of two components: the seg- 
ment base address and an effective address. The 
effective address is calculated by using combina- 
tions of the following four address elements: 


DISPLACEMENT: An 8-, or 32-bit immediate value,
following the instruction.


BASE: The contents of any general purpose register.
The base registers are generally used by compilers
to point to the start of the local variable area.


INDEX: The contents of any general purpose register
except for ESP. The index registers are used to
access the elements of an array, or a string of
characters.


SCALE: The index register’s value can be multiplied 
by a scale factor, either 1, 2, 4 or 8. Scaled index 
mode is especially useful for accessing arrays or 
structures. 


Combinations of these 4 components make up the 9
additional addressing modes. There is no performance
penalty for using any of these addressing
combinations, since the effective address calculation is
pipelined with the execution of other instructions.
The one exception is the simultaneous use of Base
and Index components which requires one additional
clock.


As shown in Figure 2-9, the effective address (EA) of
an operand is calculated according to the following
formula.


EA = Base Reg + (Index Reg * Scaling) + Displacement


Direct Mode: The operand’s offset is contained as
part of the instruction as an 8-, 16- or 32-bit
displacement.
EXAMPLE: INC Word PTR [500]


Register Indirect Mode: A BASE register contains
the address of the operand.
EXAMPLE: MOV [ECX], EDX


Based Mode: A BASE register’s contents are added
to a DISPLACEMENT to form the operand’s offset.
EXAMPLE: MOV ECX, [EAX + 24]


Index Mode: An INDEX register’s contents are added
to a DISPLACEMENT to form the operand’s offset.
EXAMPLE: ADD EAX, TABLE[ESI]


Scaled Index Mode: An INDEX register’s contents are
multiplied by a scaling factor and the result is added
to a DISPLACEMENT to form the operand’s offset.
EXAMPLE: IMUL EBX, TABLE[ESI*4],7


Based Index Mode: The contents of a BASE register
are added to the contents of an INDEX register to
form the effective address of an operand.
EXAMPLE: MOV EAX, [ESI] [EBX]


Based Scaled Index Mode: The contents of an INDEX
register are multiplied by a SCALING factor and
the result is added to the contents of a BASE register
to obtain the operand’s offset.
EXAMPLE: MOV ECX, [EDX*8] [EAX]


Based Index Mode with Displacement: The contents
of an INDEX register, the contents of a BASE register,
and a DISPLACEMENT are all summed together
to form the operand’s offset.
EXAMPLE: ADD EDX, [ESI] [EBP + 00FFFFF0H]


Based Scaled Index Mode with Displacement: The
contents of an INDEX register are multiplied by a
SCALING factor, and the result is added to the contents
of a BASE register and a DISPLACEMENT to form
the operand’s offset.
EXAMPLE: MOV EAX, LOCALTABLE[EDI*4] [EBP + 80]


[Figure 2-9. Addressing Mode Calculations. The base register, the index register multiplied by the scale (1, 2, 4, or 8), and the displacement (in instruction) are summed to form the effective address. The segment base address from the descriptor register of the selected segment is added to give the linear (target) address, which is checked against the segment limit.]
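The effective address formula can be checked with a short Python sketch. The function `effective_address` is a hypothetical helper, register contents are passed in as plain integers, and the result is truncated to 32 bits as an assumption about 32-bit address arithmetic.

```python
def effective_address(base: int = 0, index: int = 0, scale: int = 1,
                      displacement: int = 0) -> int:
    """EA = Base Reg + (Index Reg * Scaling) + Displacement, modulo 2**32."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF
```

For instance, with an assumed base register value of 1000H, an index of 3, a scale of 4 and a displacement of 80, the effective address is 1000H + 12 + 80 = 105CH. Omitted components default to zero, which covers the simpler modes such as Based Mode or Index Mode.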


2.5.4 Differences Between 16 and 32 
Bit Addresses 


In order to provide software compatibility with the 
80286 and the 8086, the 386 DX can execute 16-bit 
instructions in Real and Protected Modes. The proc- 
essor determines the size of the instructions it is ex- 
ecuting by examining the D bit in the CS segment 
Descriptor. If the D bit is 0 then all operand lengths 
and effective addresses are assumed to be 16 bits 
long. If the D bit is 1 then the default length for oper- 
ands and addresses is 32 bits. In Real Mode the 
default size for operands and addresses is 16-bits. 


Regardless of the default precision of the operands 
or addresses, the 386 DX is able to execute either 
16 or 32-bit instructions. This is specified via the use 
of override prefixes. Two prefixes, the Operand Size 
Prefix and the Address Length Prefix, override the 
value of the D bit on an individual instruction basis. 
These prefixes are automatically added by Intel
assemblers.


Example: The processor is executing in Real Mode
and the programmer needs to access the EAX
register. The assembler code for this might be MOV
EAX, 32-bit MEMORYOP. The ASM386 Macro Assembler
automatically determines that an Operand Size
Prefix is needed and generates it.


Example: The D bit is 0, and the programmer wishes
to use Scaled Index addressing mode to access an
array. The Address Length Prefix allows the use of
MOV DX, TABLE[ESI*2]. The assembler uses an
Address Length Prefix since, with D=0, the default
addressing mode is 16-bits.


Example: The D bit is 1, and the program wants to
store a 16-bit quantity. The Operand Length Prefix is
used to specify only a 16-bit value: MOV MEM16,
DX.


Table 2-3. BASE and INDEX Registers for 16- and 32-Bit Addresses

                  16-Bit Addressing   32-Bit Addressing

BASE REGISTER     BX, BP              Any 32-bit GP Register
INDEX REGISTER    SI, DI              Any 32-bit GP Register Except ESP
SCALE FACTOR      none                1, 2, 4, 8
DISPLACEMENT      0, 8, 16 bits       0, 8, 32 bits


The Operand Length and Address Length Prefixes
can be applied separately or in combination to
any instruction. The Address Length Prefix does not
allow addresses over 64K bytes to be accessed in
Real Mode. A memory address which exceeds
FFFFH will result in a General Protection Fault. An
Address Length Prefix only allows the use of the
additional 386 DX addressing modes.
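The interaction between the D bit and the two override prefixes can be sketched as follows. The prefix byte values 66H (operand size) and 67H (address size) are the standard x86 encodings and are not stated in this section; the function name `effective_sizes` is hypothetical, and 8-bit operands, which are selected by the opcode rather than by a prefix, are not modeled.

```python
OPERAND_SIZE_PREFIX = 0x66  # standard x86 encoding, not given in this section
ADDRESS_SIZE_PREFIX = 0x67  # standard x86 encoding, not given in this section


def effective_sizes(d_bit: int, prefixes: bytes = b"") -> tuple:
    """Return (operand_bits, address_bits) for one instruction.

    d_bit is the D bit of the CS descriptor (0 in Real Mode); each prefix
    toggles the corresponding default for this instruction only.
    """
    default = 32 if d_bit else 16
    other = 16 if d_bit else 32
    operand = other if OPERAND_SIZE_PREFIX in prefixes else default
    address = other if ADDRESS_SIZE_PREFIX in prefixes else default
    return operand, address
```

With D = 0, `effective_sizes(0, bytes([0x67]))` gives 16-bit operands with 32-bit addressing, which corresponds to the MOV DX, TABLE[ESI*2] example.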


When executing 32-bit code, the 386 DX uses either 
8-, or 32-bit displacements, and any register can be 
used as base or index registers. When executing 16- 
bit code, the displacements are either 8, or 16 bits, 
and the base and index register conform to the 
80286 model. Table 2-3 illustrates the differences. 


2.6 DATA TYPES 


The 386 DX supports all of the data types commonly
used in high level languages:


Bit: A single bit quantity. 


Bit Field: A group of up to 32 contiguous bits, 
which spans a maximum of four bytes. 


Bit String: A set of contiguous bits; on the 386 DX
bit strings can be up to 4 gigabits long.


Byte: A signed 8-bit quantity.


Unsigned Byte: An unsigned 8-bit quantity.


Integer (Word): A signed 16-bit quantity.


Long Integer (Double Word): A signed 32-bit quantity.
All operations assume a 2’s complement
representation.


Unsigned Integer (Word): An unsigned 16-bit
quantity.


Unsigned Long Integer (Double Word): An unsigned
32-bit quantity.


Signed Quad Word: A signed 64-bit quantity.


Unsigned Quad Word: An unsigned 64-bit quantity.


Offset: A 16- or 32-bit offset only quantity which 
indirectly references another memory location. 


Pointer: A full pointer which consists of a 16-bit 
segment selector and either a 16- or 32-bit offset. 


Char: A byte representation of an ASCII Alphanu- 
meric or control character. 


String: A contiguous sequence of bytes, words or 
dwords. A string may contain between 1 byte and 
4 Gbytes. 


BCD: A byte (unpacked) representation of decimal 
digits 0-9. 


Packed BCD: A byte (packed) representation of 
two decimal digits 0-9 storing one digit in each 
nibble. 
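As an illustration of the packed BCD format, the sketch below converts between a two-digit decimal value and its packed byte. The helper names are hypothetical, and placing the more significant digit in the upper nibble is an assumption based on the usual convention, since the text above only says that each nibble holds one digit.

```python
def to_packed_bcd(value: int) -> int:
    """Pack a decimal value 0-99 into one byte, one digit per nibble."""
    if not 0 <= value <= 99:
        raise ValueError("packed BCD byte holds two decimal digits (0-99)")
    tens, ones = divmod(value, 10)
    return (tens << 4) | ones  # assumed: more significant digit in upper nibble


def from_packed_bcd(byte: int) -> int:
    """Unpack one packed BCD byte into its decimal value."""
    high, low = (byte >> 4) & 0x0F, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError("nibble is not a decimal digit 0-9")
    return high * 10 + low
```

The decimal value 47 packs to the byte 47H. An unpacked BCD byte would instead hold a single digit 0-9.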


When the 386 DX is coupled with a 387 DX Numerics
Coprocessor then the following common Floating
Point types are supported.


Floating Point: A signed 32-, 64-, or 80-bit real
number representation. Floating point numbers
are supported by the 387 DX numerics coprocessor.


Figure 2-10 illustrates the data types supported by
the 386 DX and the 387 DX numerics coprocessor.


[Figure 2-10. 386™ DX Supported Data Types. Panels include signed and unsigned integers, BCD, ASCII and packed BCD digits, bit string, bit field, pointers, and floating point*. *Supported by 80387 numeric data coprocessor.]


2.7 MEMORY ORGANIZATION


2.7.1 Introduction


Memory on the 386 DX is divided up into 8-bit
quantities (bytes), 16-bit quantities (words), and 32-bit
quantities (dwords). Words are stored in two
consecutive bytes in memory with the low-order byte at the
lowest address, the high-order byte at the higher
address. Dwords are stored in four consecutive bytes
in memory with the low-order byte at the lowest
address, the high-order byte at the highest address.
The address of a word or dword is the byte address
of the low-order byte.
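The byte ordering described above can be demonstrated with a small Python sketch that uses a `bytearray` as a stand-in for memory. The helper names `store_dword` and `load_word` are hypothetical.

```python
def store_dword(memory: bytearray, address: int, value: int) -> None:
    """Store a 32-bit value with the low-order byte at the lowest address."""
    memory[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")


def load_word(memory: bytearray, address: int) -> int:
    """Load a 16-bit value whose low-order byte is at the given address."""
    return int.from_bytes(memory[address:address + 2], "little")
```

Storing 12345678H at address 2 places 78H at address 2 and 12H at address 5; reading a word at address 2 then yields 5678H.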


In addition to these basic data types, the 386 DX 
supports two larger units of memory: pages and seg- 
ments. Memory can be divided up into one or more 
variable length segments, which can be swapped to 
disk or shared between programs. Memory can also 
be organized into one or more 4K byte pages. Final- 
ly, both segmentation and paging can be combined, 
gaining the advantages of both systems. The 386 
DX supports both pages and segments in order to 
provide maximum flexibility to the system designer. 
Segmentation and paging are complementary. Seg- 
mentation is useful for organizing memory in logical 
modules, and as such is a tool for the application 
programmer, while pages are useful for the system 
programmer for managing the physical memory of a 
system. 


2.7.2 Address Spaces


The 386 DX has three distinct address spaces:
logical, linear, and physical. A logical address
(also known as a virtual address) consists of a
selector and an offset. A selector is the contents of a
segment register. An offset is formed by summing all
of the addressing components (BASE, INDEX,
DISPLACEMENT) discussed in section 2.5.3 Memory
Addressing Modes into an effective address. Since
each task on 386 DX has a maximum of 16K (2^14
-1) selectors, and offsets can be 4 gigabytes (2^32
bytes), this gives a total of 2^46 bytes or 64 terabytes
of logical address space per task. The programmer
sees this virtual address space.
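The size of the logical address space follows from simple arithmetic, shown here as a worked check in Python. The constant names are illustrative, and the selector count is rounded to 2^14 as in the total quoted above.

```python
SELECTORS_PER_TASK = 2 ** 14   # 16K selectors
BYTES_PER_SEGMENT = 2 ** 32    # offsets span 4 gigabytes
TERABYTE = 2 ** 40

logical_space_bytes = SELECTORS_PER_TASK * BYTES_PER_SEGMENT
logical_space_terabytes = logical_space_bytes // TERABYTE
```

The product is 2^46 bytes, which is 64 terabytes when divided by 2^40.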


The segmentation unit translates the logical ad- 
dress space into a 32-bit linear address space. If the 
paging unit is not enabled then the 32-bit linear ad- 
dress corresponds to the physical address. The 
paging unit translates the linear address space into 
the physical address space. The physical address 
is what appears on the address pins. 


The primary difference between Real Mode and
Protected Mode is how the segmentation unit performs
the translation of the logical address into the linear
address. In Real Mode, the segmentation unit shifts
the selector left four bits and adds the result to the
offset to form the linear address. In Protected
Mode, every selector has a linear base address
associated with it. The linear base address is stored in
one of two operating system tables (i.e. the Local
Descriptor Table or Global Descriptor Table). The
selector’s linear base address is added to the offset
to form the final linear address.


Figure 2-11 shows the relationship between the vari- 
Ous address spaces. 


PHYSICAL 
MEMORY 


PAGING UNIT : 
PHYSICAL | 
ADDRESS 


231630-53 


Figure 2-11. Address Translation 
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2.7.3 Segment Register Usage 


The main data structure used to organize memory is 


the segment. On the 386 DX, segments are variable 
sized blocks of linear addresses which have certain 
attributes associated with them. There are two main 
types of segments: code and data, the segmenis are 
of variable size and can be as small as 1 byte or as 
large as 4 gigabytes (292 bytes). : | 


In order to provide compact instruction encoding, 
and increase processor performance, instructions 
do not need to explicitly specify which segment reg- 
ister is used. A default segment register is automati- 
cally chosen according to the rules of Table 2-4 
‘(Segment Register Selection Rules). In general, data 
references use the selector contained in the DS reg- 
ister; Stack references use the SS register and In- 


struction fetches use the CS register. The contents | 


of the Instruction Pointer provides the offset. Special 
segment override prefixes allow the explicit use of a 
given segment register, and override the implicit 


rules listed in Table 2-4. The override prefixes also | 


allow the use of the ES, FS and GS segment regis- 
ters. | | : : 


Code Fetch me te 


Destination of PUSH, PUSHF, INT, 
CALL, PUSHA Instructions 
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There are no restrictions regarding the overlapping 
of the base addresses of any segments. Thus, all 6 
segments could have the base address set to zero 
and create a system with a four gigabyte linear ad- 
dress space. This creates a system where the virtual 


address space is the same as the linear address 


2.8 1/0 SPACE 


The 386 DX has two distinct physical address 
spaces: Memory and !/O. Generally, peripherals are 
placed in I/O space although the 386 DX also sup- 
ports memory-mapped peripherals. The |/O space 
consists of 64K bytes, it can be divided into 64K 
8-bit ports, 32K 16-bit ports, or 16K 32-bit ports, or 
any combination of ports which add up to less than 
64K bytes. The 64K I/O address space -refers to 
physical memory rather than linear address since |/ 
O instructions do not go through the segmentation 
or paging hardware. The M/IlO# pin acts as an addi- 
tional address line thus allowing the system designer 
to easily determine which address space the proces- 
sor is accessing. 


Source of POP, POPA, POPF, SS 
IRET, RET instructions : 


Destination of STOS, MOVS, REP 
STOS, REP MOVS Instructions 
(DI is Base Register) 


Other Data References, with 
Effective Address Using Base 
Register of: 

[EAX] 

[EBX] 

[ECX] 

[EDX] 

[ESI] 

[EDI] 

[EBP] 

[ESP] 


_DS,CS,SS,ES,FS,GS 
DS,CS,SS,ES,FS,GS 
DS,CS,SS,ES,FS,GS 
DS,CS,SS,ES,FS,GS 
DS,CS,SS,ES,FS,GS 

_ DS,CS,SS,ES,FS,GS 
DS,CS,SS,ES,FS,GS 
DS,CS,SS,ES,FS,GS 
The I/O ports are accessed via the IN and OUT I/O
instructions, with the port address supplied as an
immediate 8-bit constant in the instruction or in the
DX register. All 8- and 16-bit port addresses are zero
extended on the upper address lines. The I/O in-
structions cause the M/IO# pin to be driven low.


I/O port addresses 00F8H through 00FFH are re-
served for use by Intel.
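
The port-addressing rules above can be sketched as a small check. This is only an illustration: the helper name `describe_port` and the sample port numbers in the usage note are hypothetical, not part of the 386 DX documentation.

```python
# Sketch of the I/O port rules described in section 2.8.
# describe_port() is a hypothetical helper used only for illustration.

IO_SPACE_SIZE = 64 * 1024                 # the I/O space consists of 64K bytes
RESERVED_PORTS = range(0x00F8, 0x0100)    # 00F8H through 00FFH, reserved by Intel


def describe_port(port, width_bytes=1):
    """Report how a port of the given width can be addressed."""
    if width_bytes not in (1, 2, 4):
        raise ValueError("ports are 8, 16 or 32 bits wide")
    if port < 0 or port + width_bytes > IO_SPACE_SIZE:
        raise ValueError("port lies outside the 64K I/O space")
    touched = range(port, port + width_bytes)
    return {
        # An immediate port address is an 8-bit constant; anything larger
        # has to be supplied in the DX register.
        "immediate_form_ok": port <= 0xFF,
        # True if any byte of the access falls in the reserved range.
        "reserved": any(p in RESERVED_PORTS for p in touched),
    }
```

For example, `describe_port(0x3F8)` reports that the port address must come from DX, and `describe_port(0xF6, 4)` reports that a 32-bit access starting at 00F6H touches the reserved range.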


2.9 INTERRUPTS 


2.9.1 Interrupts and Exceptions 


Interrupts and exceptions alter the normal program 
flow, in order to handle external events, to report 
errors or exceptional conditions. The difference be- 
tween interrupts and exceptions is that interrupts are 
used to handle asynchronous external events while 
exceptions handle instruction faults. Although a pro- 
gram can generate a software interrupt via an INT N 
instruction, the processor treats software interrupts 
as exceptions. 


Hardware interrupts occur as the result of an exter- 
nal event and are classified into two types: maskable 
or non-maskable. Interrupts are serviced after the 
execution of the current instruction. After the inter- 
rupt handler is finished servicing the interrupt, exe- 
cution proceeds with the instruction immediately 
after the interrupted instruction. Sections 2.9.3 and 
2.9.4 discuss the differences between Maskable and 
Non-Maskable interrupts. 


Exceptions are classified as faults, traps, or aborts
depending on the way they are reported, and wheth-
er or not restart of the instruction causing the excep-
tion is supported. Faults are exceptions that are de-
tected and serviced before the execution of the
faulting instruction. A fault would occur in a virtual 
memory system, when the processor referenced a 
page or a segment which was not present. The oper- 
ating system would fetch the page or segment from 
disk, and then the 386 DX would restart the instruc- 
tion. Traps are exceptions that are reported immedi- 
ately after the execution of the instruction which 
caused the problem. User defined interrupts are ex- 
amples of traps. Aborts are exceptions which do 
not permit the precise location of the instruction 
causing the exception to be determined. Aborts are 
used to report severe errors, such as a hardware 
error, or illegal values in system tables. 


Thus, when an interrupt service routine has been
completed, execution proceeds from the instruction
immediately following the interrupted instruction. On
the other hand, the return address from an excep-
tion fault routine will always point at the instruction
causing the exception and include any leading in-
struction prefixes. Table 2-5 summarizes the possi-
ble interrupts for the 386 DX and shows where the
return address points.


The 386 DX has the ability to handle up to 256 differ- 
ent interrupts/exceptions. In order to service the in- 
terrupts, a table with up to 256 interrupt vectors 
must be defined. The interrupt vectors are simply 
pointers to the appropriate interrupt service routine. 
In Real Mode (see section 3.1), the vectors are 4 
byte quantities, a Code Segment plus a 16-bit offset; 
in Protected Mode, the interrupt vectors are 8 byte 
quantities, which are put in an Interrupt Descriptor 
Table (see section 4.1). Of the 256 possible inter-
rupts, 32 are reserved for use by Intel; the remaining
224 are free to be used by the system designer.
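
As a rough illustration of the vector sizes just described, the sketch below computes where vector n sits in the interrupt table. The function name `vector_address` and the `table_base` parameter are assumptions made for this example; the text above does not specify where the table is located.

```python
# Sketch: byte offset of interrupt vector n in the interrupt table.
# vector_address() and table_base are hypothetical names for illustration.

REAL_MODE_VECTOR_SIZE = 4        # Code Segment plus a 16-bit offset
PROTECTED_MODE_VECTOR_SIZE = 8   # one Interrupt Descriptor Table entry
NUM_VECTORS = 256
NUM_RESERVED = 32                # reserved for use by Intel


def vector_address(n, protected_mode, table_base=0):
    """Return the address of vector n, assuming the table starts at table_base."""
    if not 0 <= n < NUM_VECTORS:
        raise ValueError("vector number must be in the range 0..255")
    size = PROTECTED_MODE_VECTOR_SIZE if protected_mode else REAL_MODE_VECTOR_SIZE
    return table_base + n * size


def table_size(protected_mode):
    """Size in bytes of a fully populated 256-entry table."""
    size = PROTECTED_MODE_VECTOR_SIZE if protected_mode else REAL_MODE_VECTOR_SIZE
    return NUM_VECTORS * size
```

A full Real Mode table therefore occupies 1024 bytes and a full Protected Mode table 2048 bytes, with `NUM_VECTORS - NUM_RESERVED` (224) vectors left for the system designer.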


2.9.2 Interrupt Processing


When an interrupt occurs the following actions hap- 
pen. First, the current program address and the 
Flags are saved on the stack to allow resumption of 
the interrupted program. Next, an 8-bit vector is sup- 
plied to the 386 DX which identifies the appropriate 
entry in the interrupt table. The table contains the 
starting address of the interrupt service routine. 
Then, the user supplied interrupt service routine is 
executed. Finally, when an IRET instruction is exe- 
cuted the old processor state is restored and pro- 
gram execution resumes at the appropriate instruc- 
tion. 


The 8-bit interrupt vector is supplied to the 386 DX in 
several different ways: exceptions supply the inter- 
rupt vector internally; software INT instructions con- 
tain or imply the vector; maskable hardware inter- 
rupts supply the 8-bit vector via the interrupt ac- 
knowledge bus sequence. Non-Maskable hardware 
interrupts are assigned to interrupt vector 2. 


2.9.3 Maskable Interrupt 


Maskable interrupts are the most common way used
by the 386 DX to respond to asynchronous external
hardware events. A hardware interrupt occurs when
the INTR is pulled high and the Interrupt Flag bit (IF)
is enabled. The processor only responds to inter-
rupts between instructions (REPeat String instruc-
tions have an “interrupt window” between memory
moves, which allows interrupts during long
string moves). When an interrupt occurs the proces-
sor reads an 8-bit vector supplied by the hardware
which identifies the source of the interrupt (one of
224 user defined interrupts). The exact nature of the
interrupt sequence is discussed in section 5.


Table 2-5. Interrupt Vector Assignments

Column headings: Function; Interrupt Number; Instruction Which
Can Cause Exception; Return Address Points to Faulting Instruction.

Row fragments: Divide Error; Any Illegal Instruction; ESC, WAIT;
Any Instruction That Can Generate an Exception; 9 (ESC);
Intel Reserved; Two Byte Interrupt.

* Some debug exceptions may report both traps on the previous instruction, and faults on the next instruction.


The IF bit in the EFLAG register is reset when an
interrupt is being serviced. This effectively disables
servicing additional interrupts during an interrupt
service routine. However, the IF may be set explicitly
by the interrupt handler, to allow the nesting of inter-
rupts. When an IRET instruction is executed the
original state of the IF is restored.


2.9.4 Non-Maskable Interrupt 


Non-maskable interrupts provide a method of servic- 
ing very high priority interrupts. A common example 
of the use of a non-maskable interrupt (NMI) would 
be to activate a power failure routine. When the NMI
input is pulled high it causes an interrupt with an
internally supplied vector value of 2. Unlike a normal
hardware interrupt, no interrupt acknowledgment se-
quence is performed for an NMI.


While executing the NMI servicing procedure, the 
386 DX will not service further NMI requests, until an 
interrupt return (IRET) instruction is executed or the 
processor is reset. If NMI occurs while currently 
servicing an NMI, its presence will be saved for serv-
icing after executing the first IRET instruction. The IF
bit is cleared at the beginning of an NMI interrupt to
inhibit further INTR interrupts.


2.9.5 Software Interrupts


A third type of interrupt/exception for the 386 DX is
the software interrupt. An INT n instruction causes
the processor to execute the interrupt service rou-
tine pointed to by the nth vector in the interrupt ta-
ble.

A special case of the two byte software interrupt INT
n is the one byte INT 3, or breakpoint interrupt. By
inserting this one byte instruction in a program, the
user can set breakpoints in his program as a debug-
ging tool.


A final type of software interrupt is the single step
interrupt. It is discussed in section 2.12.


2.9.6 Interrupt and Exception 
Priorities 


Interrupts are externally-generated events. Maska- 
ble Interrupts (on the INTR input) and Non-Maskable 
Interrupts (on the NMI input) are recognized at in- 
struction boundaries. When NMI and maskable 
INTR are both recognized at the same instruction 
boundary, the 386 DX invokes the NMI service rou- 
tine first. If, after the NMI service routine has been 
invoked, maskable interrupts are still enabled, then 
the 386 DX will invoke the appropriate interrupt serv- 
ice routine. 


Table 2-6a. 386™ DX Priority for 
Invoking Service Routines in Case of 
Simultaneous External Interrupts


1. NMI 
2. INTR 


Exceptions are internally-generated events. Excep-
tions are detected by the 386 DX if, in the course of
executing an instruction, the 386 DX detects a prob- 
lematic condition. The 386 DX then immediately in- 
vokes the appropriate exception service routine. The 
state of the 386 DX is such that the instruction caus- 
ing the exception can be restarted. If the exception 
service routine has taken care of the problematic 
condition, the instruction will execute without caus- 
ing the same exception. 


It is possible for a single instruction to generate sev-
eral exceptions (for example, transferring a single
operand could generate two page faults if the oper-
and location spans two “not present” pages). How-
ever, only one exception is generated upon each at-
tempt to execute the instruction. Each exception
service routine should correct its corresponding ex-
ception, and restart the instruction. In this manner,
exceptions are serviced until the instruction exe-
cutes successfully.


As the 386 DX executes instructions, it follows a
consistent cycle in checking for exceptions, as
shown in Table 2-6b. This cycle is repeated
as each instruction is executed, and occurs in paral-
lel with instruction decoding and execution.


Table 2-6b. Sequence of Exception Checking 


Consider the case of the 386 DX having just 
completed an instruction. It then performs the 
following checks before reaching the point where 
the next instruction is completed: 


1. Check for Exception 1 Traps from the instruc-
   tion just completed (single-step via Trap Flag,
   or Data Breakpoints set in the Debug Regis-
   ters).

2. Check for Exception 1 Faults in the next in-
   struction (Instruction Execution Breakpoint set
   in the Debug Registers for the next instruc-
   tion).

3. Check for external NMI and INTR.

4. Check for Segmentation Faults that prevented
   fetching the entire next instruction (exceptions
   11 or 13).

5. Check for Page Faults that prevented fetching
   the entire next instruction (exception 14).

6. Check for Faults decoding the next instruction
   (exception 6 if illegal opcode; exception 6 if in
   Real Mode or in Virtual 8086 Mode and at-
   tempting to execute an instruction for Protect-
   ed Mode only (see 4.6.4); or exception 13 if
   instruction is longer than 15 bytes, or privilege
   violation in Protected Mode (i.e. not at IOPL or
   at CPL=0)).

7. If WAIT opcode, check if TS=1 and MP=1
   (exception 7 if both are 1).

8. If ESCAPE opcode for numeric coprocessor,
   check if EM=1 or TS=1 (exception 7 if either
   are 1).

9. If WAIT opcode or ESCAPE opcode for nu-
   meric coprocessor, check ERROR# input sig-
   nal (exception 16 if ERROR# input is assert-
   ed).

10. Check in the following order for each memo-
    ry reference required by the instruction:

    a. Check for Segmentation Faults that pre-
       vent transferring the entire memory quanti-
       ty (exceptions 11, 12, 13).

    b. Check for Page Faults that prevent trans-
       ferring the entire memory quantity (excep-
       tion 14).

Note that the order stated supports the concept
of the paging mechanism being “underneath”
the segmentation mechanism. Therefore, for any
given code or data reference in memory, seg-
mentation exceptions are generated before pag-
ing exceptions are generated.


2.9.7 Instruction Restart


The 386 DX fully supports restarting all instructions 
after faults. If an exception is detected in the instruc- 
tion to be executed (exception categories 4 through 
10 in Table 2-6b), the 386 DX invokes the appropri-
ate exception service routine. The 386 DX is in a
state that permits restart of the instruction, for all 
cases but those in Table 2-6c. Note that all such 
cases are easily avoided by proper design of the 
operating system. 


Table 2-6c. Conditions Preventing 
Instruction Restart 


A. An instruction causes a task switch to a task
   whose Task State Segment is partially “not
   present”. (An entirely “not present” TSS is re-
   startable.) Partially present TSS’s can be
   avoided either by keeping the TSS’s of such
   tasks present in memory, or by aligning TSS
   segments to reside entirely within a single 4K
   page (for TSS segments of 4K bytes or less).

B. A coprocessor operand wraps around the top
   of a 64K-byte segment or a 4G-byte segment,
   and spans three pages, and the page holding
   the middle portion of the operand is “not pres-
   ent.” This condition can be avoided by starting
   at a page boundary any segments containing
   coprocessor operands if the segments are ap-
   proximately 64K-200 bytes or larger (i.e. large
   enough for wraparound of the coprocessor
   operand to possibly occur).


Note that these conditions are avoided by using 
the operating system designs mentioned in this 
table. 


2.9.8 Double Fault 


A Double Fault (exception 8) results when the proc-
essor attempts to invoke an exception service rou-
tine for the segment exceptions (10, 11, 12 or 13),
but in the process of doing so, detects an exception
other than a Page Fault (exception 14).


A Double Fault (exception 8) will also be generated 
when the processor attempts to invoke the Page 
Fault (exception 14) service routine, and detects an 
exception other than a second Page Fault. In any 
functional system, the entire Page Fault service rou- 
tine must remain “present” in memory. 


Double page faults however do not raise the double
fault exception. If a second page fault occurs while
the processor is attempting to enter the service rou-
tine for the first time, then the processor will invoke
the page fault (exception 14) handler a second time,
rather than the double fault (exception 8) handler. A
subsequent fault, though, will lead to shutdown.


When a Double Fault occurs, the 386 DX invokes 
the exception service routine for exception 8. 
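
The double fault rules in this section can be summarized as a small decision function. This is a sketch of the rules as stated in the text only; the function name `next_handler` is hypothetical, and combinations the text does not describe are deliberately left undecided.

```python
# Sketch: which handler runs when a second exception is detected while the
# processor is trying to invoke the handler for a first exception.
# next_handler() is a hypothetical name; only the cases described in
# section 2.9.8 are modelled.

SEGMENT_EXCEPTIONS = {10, 11, 12, 13}
PAGE_FAULT = 14
DOUBLE_FAULT = 8


def next_handler(first, second):
    """Return the exception number whose handler is invoked, or None."""
    if first in SEGMENT_EXCEPTIONS:
        # Anything other than a Page Fault escalates to a Double Fault.
        # A Page Fault here is assumed to be serviced as a normal Page Fault.
        return PAGE_FAULT if second == PAGE_FAULT else DOUBLE_FAULT
    if first == PAGE_FAULT:
        # A second Page Fault re-enters the Page Fault handler; any other
        # exception escalates to a Double Fault.
        return PAGE_FAULT if second == PAGE_FAULT else DOUBLE_FAULT
    # Other first exceptions are not covered by the text above.
    return None
```

The shutdown that follows a further fault after a double page fault is not modelled here.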


2.10 RESET AND INITIALIZATION 


When the processor is initialized or Reset the regis-
ters have the values shown in Table 2-7. The 386
DX will then start executing instructions near the top
of physical memory, at location FFFFFFF0H. When
the first InterSegment Jump or Call is executed, ad-
dress lines A20-31 will drop low for CS-relative
memory cycles, and the 386 DX will only execute
instructions in the lower one megabyte of physical
memory. This allows the system designer to use a
ROM at the top of physical memory to initialize the
system and take care of Resets.


RESET forces the 386 DX to terminate all execution 
and local bus activity. No instruction execution or 
bus activity will occur as long as Reset is active. 
Between 350 and 450 CLK2 periods after Reset be- 
comes inactive the 386 DX will start executing in- 
structions at the top of physical memory. 
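
The reset address follows from the values in Table 2-7: the Code Segment base after reset (note 3) plus the Instruction Pointer. The sketch below works through that arithmetic. The Real Mode segment-times-16 calculation and the jump target F000:E05B are assumptions chosen for illustration, not values taken from this data sheet.

```python
# Sketch: where the first instruction is fetched after reset, and why an
# intersegment jump lands in the lower megabyte.

CS_BASE_AFTER_RESET = 0xFFFF0000   # Table 2-7, note 3
IP_AFTER_RESET = 0x0000FFF0        # Table 2-7


def physical_address(cs_base, ip):
    """Base plus offset, truncated to the 32-bit address bus."""
    return (cs_base + ip) & 0xFFFFFFFF


def real_mode_address(segment, offset):
    """Assumed Real Mode rule: the base is the 16-bit segment value times 16."""
    return physical_address(segment << 4, offset)


first_fetch = physical_address(CS_BASE_AFTER_RESET, IP_AFTER_RESET)

# Hypothetical far-jump target, used only to show A20-A31 going low.
after_far_jump = real_mode_address(0xF000, 0xE05B)
```

Here `first_fetch` is FFFFFFF0H, while the hypothetical jump target resolves to an address below one megabyte.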


Table 2-7. Register Values after Reset 


Register                      Reset Value                 Notes

Flag Word                     UUUU0002H                   Note 1
Machine Status Word (CR0)     UUUUUUU0H                   Note 2
Instruction Pointer           0000FFF0H
Code Segment                  F000H                       Note 3
Data Segment                  0000H
Stack Segment                 0000H
Extra Segment (ES)            0000H
Extra Segment (FS)            0000H
Extra Segment (GS)            0000H
DX register                   component and stepping ID   Note 5
All other registers           undefined                   Note 4


NOTES:

1. EFLAG Register. The upper 14 bits of the EFLAGS reg-
ister are undefined, VM (Bit 17) and RF (Bit 16) are 0 as
are all other defined flag bits.

2. CR0: (Machine Status Word). All of the defined fields in
the CR0 are 0 (PG Bit 31, TS Bit 3, EM Bit 2, MP Bit 1, and
PE Bit 0).

3. The Code Segment Register (CS) will have its Base Ad-
dress set to FFFF0000H and Limit set to 0FFFFH.

4. All undefined bits are Intel Reserved and should not be
used.

5. DX register always holds component and stepping iden-
tifier (see 5.7). EAX register holds self-test signature if self-
test was requested (see 5.6).


2.11 TESTABILITY


2.11.1 Self-Test 


The 386 DX has the capability to perform a self-test. 
The self-test checks the function of all of the Control 
ROM and most of the non-random logic of the part. 
Approximately one-half of the 386 DX can be tested 
during self-test. 


Self-Test is initiated on the 386 DX when the RESET
pin transitions from HIGH to LOW, and the BUSY#
pin is low. The self-test takes about 2**19 clocks, or
approximately 26 milliseconds with a 20 MHz 386
DX. At the completion of self-test the processor per-
forms reset and begins normal operation. The part
has successfully passed self-test if the contents of
the EAX register are zero (0). If the contents of EAX
are not zero then the self-test has detected a flaw in
the part.
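
The quoted self-test duration can be checked with a short calculation. This sketch assumes that "clocks" means processor clock periods at the rated frequency; the function names are hypothetical.

```python
# Sketch: self-test duration and pass/fail check.
# Assumes the 2**19 clocks are counted at the processor's rated frequency.

SELF_TEST_CLOCKS = 2 ** 19


def self_test_ms(clock_hz):
    """Approximate self-test duration in milliseconds."""
    return SELF_TEST_CLOCKS / clock_hz * 1000.0


def self_test_passed(eax):
    """The part passed self-test if EAX holds zero afterwards."""
    return eax == 0
```

At 20 MHz this gives roughly 26.2 ms, consistent with the approximate figure in the text.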


2.11.2 TLB Testing 


The 386 DX provides a mechanism for testing the 
Translation Lookaside Buffer (TLB) if desired. This 
particular mechanism is unique to the 386 DX and 
may not be continued in the same way in future 
processors. When testing the TLB, paging must be
turned off (PG = 0 in CR0) to enable the TLB test-
ing hardware and avoid interference with the test
data being written to the TLB.


There are two TLB testing operations: 1) write en- 
tries into the TLB, and, 2) perform TLB lookups. Two 
Test Registers, shown in Figure 2-12, are provided 
for the purpose of testing. TR6 is the “test command
register”, and TR7 is the “test data register”. The
fields within these registers are defined below. 


C: This is the command bit. For a write into TR6 to 
cause an immediate write into the TLB entry, write a 
0 to this bit. For a write into TR6 to cause an immedi- 
ate TLB lookup, write a 1 to this bit. 


Linear Address: This is the tag field of the TLB. On 
a TLB write, a TLB entry is allocated to this linear 
address and the rest of that TLB entry is set per the 
value of TR7 and the value just written into TR6. On 
a TLB lookup, the TLB is interrogated per this value 
and if one and only one TLB entry matches, the rest 
of the fields of TR6 and TR7 are set from the match- 
ing TLB entry. 


Physical Address: This is the data field of the TLB.
On a write to the TLB, the TLB entry allocated to the
linear address in TR6 is set to this value. On a TLB
lookup, the data field (physical address) from the
TLB is read out to here.

PL: On a TLB write, PL=1 causes the REP field of
TR7 to select which of four associative blocks of the
TLB is to be written, but PL=0 allows the internal
pointer in the paging unit to select which TLB block
is written. On a TLB lookup, the PL bit indicates
whether the lookup was a hit (PL gets set to 1) or a
miss (PL gets reset to 0).


V: The valid bit for this TLB entry. All valid bits can 
also be cleared by writing to CR3. 


D, D#: The dirty bit for/from the TLB entry.


U, U#: The user bit for/from the TLB entry. 
W, W#: The writable bit for/from the TLB entry. 


For D, U and W, both the attribute and its comple- 
ment are provided as tag bits, to permit the option of 
a “don’t care” on TLB lookups. The meaning of 
these pairs of bits is given in the following table: 


X   X#   Effect During     Value of Bit
         TLB Lookup        X after TLB Write

0   0    Miss All          Bit X Becomes Undefined
0   1    Match if X = 0    Bit X Becomes 0
1   0    Match if X = 1    Bit X Becomes 1
1   1    Match all         Bit X Becomes Undefined


For writing a TLB entry: 


1. Write TR7 for the desired physical address, PL 
and REP values. 

2. Write TR6 with the appropriate linear address,
etc. (be sure to write C = 0 for “write” com-
mand).


For looking up (reading) a TLB entry: 
1. Write TR6 with the appropriate linear address (be
sure to write C = 1 for “lookup” command).

2. Read TR7 and TR6. If the PL bit in TR7 indicates
a hit, then the other values reveal the TLB con-
tents. If PL indicates a miss, then the other values
in TR7 and TR6 are indeterminate.


2.12 DEBUGGING SUPPORT 


The 386 DX provides several features which simplify
the debugging process. The three categories of on-
chip debugging aids are:

1) the code execution breakpoint opcode (0CCH),

2) the single-step capability provided by the TF bit in
the flag register, and

3) the code and data breakpoint capability provided
by the Debug Registers DR0-3, DR6, and DR7.

Figure 2-12. Test Registers


2.12.1 Breakpoint Instruction 


A single-byte-opcode breakpoint instruction is avail-
able for use by software debuggers. The breakpoint
opcode is 0CCh, and generates an exception 3 trap
when executed. In typical use, a debugger program
can “plant” the breakpoint instruction at all desired
code execution breakpoints. The single-byte break-
point opcode is an alias for the two-byte general
software interrupt instruction, INT n, where n=3.
The only difference between INT 3 (0CCh) and INT n
is that INT 3 is never IOPL-sensitive but INT n is
IOPL-sensitive in Protected Mode and Virtual 8086
Mode.


2.12.2 Single-Step Trap 


If the single-step flag (TF, bit 8) in the EFLAG regis-
ter is found to be set at the end of an instruction, a
single-step exception occurs. The single-step ex-
ception is auto vectored to exception number 1. Pre-
cisely, exception 1 occurs as a trap after the instruc-
tion following the instruction which set TF. In typical
practice, a debugger sets the TF bit of a flag register
image on the debugger’s stack. It then typically
transfers control to the user program and loads the
flag image with a single instruction, the IRET instruc-
tion. The single-step trap occurs after executing one
instruction of the user program.


Since the exception 1 occurs as a trap (that is, it
occurs after the instruction has already executed),
the CS:EIP pushed onto the debugger’s stack points
to the next unexecuted instruction of the program
being debugged. An exception 1 handler, merely by
ending with an IRET instruction, can therefore effi-
ciently support single-stepping through a user pro-
gram.


2.12.3 Debug Registers 


The Debug Registers are an advanced debugging
feature of the 386 DX. They allow data access
breakpoints as well as code execution breakpoints.
Since the breakpoints are indicated by on-chip regis-
ters, an instruction execution breakpoint can be
placed in ROM code or in code shared by several
tasks, neither of which can be supported by the INT3
breakpoint opcode.


The 386 DX contains six Debug Registers, providing
the ability to specify up to four distinct breakpoint
addresses, breakpoint control options, and read
breakpoint status. Initially after reset, breakpoints
are in the disabled state. Therefore, no breakpoints
will occur unless the debug registers are pro-
grammed. Breakpoints set up in the Debug Regis-
ters are autovectored to exception number 1.


2.12.3.1 LINEAR ADDRESS BREAKPOINT
REGISTERS (DR0-DR3)


Up to four breakpoint addresses can be specified by
writing into Debug Registers DR0-DR3, shown in
Figure 2-13. The breakpoint addresses specified are
32-bit linear addresses. 386 DX hardware continu-
ously compares the linear breakpoint addresses in
DR0-DR3 with the linear addresses generated by
executing software (a linear address is the result of
computing the effective address and adding the
32-bit segment base address). Note that if paging is
not enabled the linear address equals the physical
address. If paging is enabled, the linear address is
translated to a physical 32-bit address by the on-
chip paging unit. Regardless of whether paging is
enabled or not, however, the breakpoint registers
hold linear addresses.


2.12.3.2 DEBUG CONTROL REGISTER (DR7) 


A Debug Control Register, DR7 shown in Figure
2-13, allows several debug control functions such as
enabling the breakpoints and setting up other con-
trol options for the breakpoints. The fields within the
Debug Control Register, DR7, are as follows:


LENi (breakpoint length specification bits) 


A 2-bit LEN field exists for each of the four break-
points. LEN specifies the length of the associated
breakpoint field. The choices for data breakpoints
are: 1 byte, 2 bytes, and 4 bytes. Instruction execu-
tion breakpoints must have a length of 1 (LENi =
00). Encoding of the LENi field is as follows:


LENi       Breakpoint    Usage of Least Significant Bits in
Encoding   Field Width   Breakpoint Address Register i, (i = 0-3)

00         1 byte        All 32-bits used to specify a single-byte
                         breakpoint field.

01         2 bytes       A1-A31 used to specify a two-byte,
                         word-aligned breakpoint field. A0 in
                         Breakpoint Address Register is not used.

10         Undefined     Do not use this encoding.

11         4 bytes       A2-A31 used to specify a four-byte,
                         dword-aligned breakpoint field. A0 and A1
                         in Breakpoint Address Register are not used.


Figure 2-13. Debug Registers


The LENi field controls the size of breakpoint field i
by controlling whether all low-order linear address
bits in the breakpoint address register are used to
detect the breakpoint event. Therefore, all break-
point fields are aligned; 2-byte breakpoint fields be-
gin on Word boundaries, and 4-byte breakpoint
fields begin on Dword boundaries.


The following is an example of various size break-
point fields. Assume the breakpoint linear address in
DR2 is 00000005H. In that situation, the following
illustration indicates the region of the breakpoint
field for lengths of 1, 2, or 4 bytes.


DR2=00000005H; LEN2 = 00B: bkpt fld2 is the byte at 00000005H.

DR2=00000005H; LEN2 = 01B: bkpt fld2 is the word at
00000004H-00000005H.

DR2=00000005H; LEN2 = 11B: bkpt fld2 is the dword at
00000004H-00000007H.
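
The alignment rule for breakpoint fields can be sketched in a few lines. The function name `breakpoint_field` is hypothetical; the calculation simply ignores the low-order address bits that the LEN encoding says are not used.

```python
# Sketch: the byte range covered by a breakpoint field, given the linear
# address in DRi and the 2-bit LENi encoding.
# breakpoint_field() is a hypothetical helper for illustration.

LEN_WIDTH = {0b00: 1, 0b01: 2, 0b11: 4}   # encoding 10 is undefined


def breakpoint_field(dr_value, len_bits):
    """Return (first_byte, last_byte) of the aligned breakpoint field."""
    if len_bits not in LEN_WIDTH:
        raise ValueError("LEN encoding 10 is undefined")
    width = LEN_WIDTH[len_bits]
    # Clearing the unused low-order bits aligns the field to its width.
    first = dr_value & ~(width - 1) & 0xFFFFFFFF
    return first, first + width - 1


for len_bits in (0b00, 0b01, 0b11):
    first, last = breakpoint_field(0x00000005, len_bits)
    print(f"LEN2={len_bits:02b}B: {first:08X}H-{last:08X}H")
```

Run with the DR2 value from the example, the loop prints one range per legal LEN encoding.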
RWi (memory access qualifier bits)


A 2-bit RW field exists for each of the four break-
points. The 2-bit RW field specifies the type of usage
which must occur in order to activate the associated
breakpoint.


RW          Usage
Encoding    Causing Breakpoint

00          Instruction execution only
01          Data writes only
10          Undefined (do not use this encoding)
11          Data reads and writes only


RW encoding 00 is used to set up an instruction
execution breakpoint. RW encodings 01 or 11 are
used to set up write-only or read/write data break-
points.


Note that instruction execution breakpoints are 
taken as faults (i.e. before the instruction exe- 
cutes), but data breakpoints are taken as traps 
(i.e. after the data transfer takes place). 


Using LENi and RWi to Set Data Breakpoint i 


A data breakpoint can be set up by writing the linear
address into DRi (i = 0-3). For data breakpoints,
RWi can = 01 (write-only) or 11 (read/write). LENi
can = 00, 01, or 11.


If a data access entirely or partly falls within the data
breakpoint field, the data breakpoint condition has
occurred, and if the breakpoint is enabled, an excep-
tion 1 trap will occur.
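
The "entirely or partly falls within" condition amounts to an interval-overlap test combined with the RW qualifier. The sketch below is one way to express it; the function name and parameter names are hypothetical, and whether the breakpoint is enabled (Gi or Li) is left out.

```python
# Sketch: does a data access satisfy data breakpoint i?
# data_breakpoint_hit() is a hypothetical helper; enable bits are not modelled.

LEN_WIDTH = {0b00: 1, 0b01: 2, 0b11: 4}


def data_breakpoint_hit(dr_value, len_bits, rw_bits,
                        access_addr, access_size, is_write):
    """True if the access overlaps the breakpoint field and matches RWi."""
    if rw_bits not in (0b01, 0b11):
        raise ValueError("data breakpoints use RW = 01 or 11")
    width = LEN_WIDTH[len_bits]
    field_first = dr_value & ~(width - 1)
    field_last = field_first + width - 1
    access_last = access_addr + access_size - 1
    overlaps = access_addr <= field_last and access_last >= field_first
    # RW = 01 matches writes only; RW = 11 matches reads and writes.
    usage_ok = is_write or rw_bits == 0b11
    return overlaps and usage_ok
```

For instance, with DRi = 00000005H and LENi = 11B, a dword write at 00000006H only partly overlaps the field but still satisfies the condition.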


Using LENi and RWi to Set Instruction Execution Breakpoint i


An instruction execution breakpoint can be set up by
writing the address of the beginning of the instruction
(including prefixes if any) into DRi (i = 0-3). RWi
must = 00 and LENi must = 00 for instruction exe-
cution breakpoints.


If the instruction beginning at the breakpoint address 
is about to be executed, the instruction execution 
breakpoint condition has occurred, and if the break- 
point is enabled, an exception 1 fault will occur be- 
fore the instruction is executed. 


Note that an instruction execution breakpoint ad-
dress must be equal to the beginning byte address
of an instruction (including prefixes) in order for the
instruction execution breakpoint to occur.


GD (Global Debug Register access detect) 


The Debug Registers can only be accessed in Real
Mode or at privilege level 0 in Protected Mode. The
GD bit, when set, provides extra protection against
any Debug Register access even in Real Mode or at
privilege level 0 in Protected Mode. This additional
protection feature is provided to guarantee that a
software debugger (or ICE™-386) can have full con-
trol over the Debug Register resources when re-
quired. The GD bit, when set, causes an exception 1
fault if an instruction attempts to read or write any
Debug Register. The GD bit is then automatically
cleared when the exception 1 handler is invoked,
allowing the exception 1 handler free access to the
debug registers.


GE and LE (Exact data breakpoint match, global and local)


If either GE or LE is set, any data breakpoint trap will
be reported exactly after completion of the instruc-
tion that caused the operand transfer. Exact report-
ing is provided by forcing the 386 DX execution unit
to wait for completion of data operand transfers be-
fore beginning execution of the next instruction.


If exact data breakpoint match is not selected, data
breakpoints may not be reported until several in-
structions later or may not be reported at all. When
enabling a data breakpoint, it is therefore recom-
mended to enable the exact data breakpoint match.


When the 386 DX performs a task switch, the LE bit
is cleared. Thus, the LE bit supports fast task switch-
ing out of tasks that have enabled the exact data
breakpoint match for their task-local breakpoints.
The LE bit is cleared by the processor during a task
switch, to avoid having exact data breakpoint match
enabled in the new task. Note that exact data break-
point match must be re-enabled under software con-
trol.


The 386 DX GE bit is unaffected during a task 
switch. The GE bit supports exact data breakpoint 
match that is to remain enabled during all tasks exe- 
cuting in the system. 


Note that instruction execution breakpoints are al-
ways reported exactly, whether or not exact data
breakpoint match is selected.


Gi and Li (breakpoint enable, global and local)


If either Gi or Li is set then the associated breakpoint
(as defined by the linear address in DRi, the length 
in LENi and the usage criteria in RWi) is enabled. If 
either Gi or Li is set, and the 386 DX detects the ith 
breakpoint condition, then the exception 1 handler is 
invoked. 


When the 386 DX performs a task switch to a new
Task State Segment (TSS), all Li bits are cleared.
Thus, the Li bits support fast task switching out of
tasks that use some task-local breakpoint
registers. The Li bits are cleared by the processor
during a task switch, to avoid spurious exceptions in
the new task. Note that the breakpoints must be re-
enabled under software control.


All 386 DX Gi bits are unaffected during a task 
switch. The Gi bits support breakpoints that are ac- 
tive in all tasks executing in the system. 


2.12.3.3 DEBUG STATUS REGISTER (DR6)


A Debug Status Register, DR6 shown in Figure 2-13,
allows the exception 1 handler to easily determine
why it was invoked. Note the exception 1 handler
can be invoked as a result of one of several events:


1) DRO Breakpoint fault/trap. 
2) DR1 Breakpoint fault/trap. 
3) DR2 Breakpoint fault/trap. 
4) DR3 Breakpoint fault/trap. 
5) Single-step (TF) trap. 

6) Task switch trap. 


7) Fault due to attempted debug register access 
when GD=1. 


The Debug Status Register contains single-bit flags 
for each of the possible events invoking exception 1. 
Note below that some of these events are faults (ex-
ception taken before the instruction is executed),
while other events are traps (exception taken after 
the debug events occurred). 


The flags in DR6 are set by the hardware but never 
cleared by hardware. Exception 1 handler software 
should clear DR6 before returning to the user pro- 
gram to avoid future confusion in identifying the 
source of exception 1. 


The fields within the Debug Status Register, DR6, 
are as follows: 


Bi (debug fault/trap due to breakpoint 0-3) 


Four breakpoint indicator flags, B0-B3, correspond
one-to-one with the breakpoint registers DR0-DR3.
A flag Bi is set when the condition described
by DRi, LENi, and RWi occurs.


If Gi or Li is set, and if the ith breakpoint is detected,
the processor will invoke the exception 1 handler.
The exception is handled as a fault if an instruction
execution breakpoint occurred, or as a trap if a data
breakpoint occurred.


IMPORTANT NOTE: A flag Bi is set whenever the
hardware detects a match condition on enabled
breakpoint i. Whenever a match is detected on at
least one enabled breakpoint i, the hardware immediately
sets all Bi bits corresponding to breakpoint
conditions matching at that instant, whether enabled
or not. Therefore, the exception 1 handler may see
that multiple Bi bits are set, but only set Bi bits
corresponding to enabled breakpoints (Li or Gi set) are
true indications of why the exception 1 handler was
invoked.


BD (debug fault due to attempted register access 
when GD bit set) 


This bit is set if the exception 1 handler was invoked 
due to an instruction attempting to read or write to 
the debug registers when GD bit was set. If such an 
event occurs, then the GD bit is automatically 
cleared when the exception 1 handler is invoked, 
allowing handler access to the debug registers. 


BS (debug trap due to single-step) 


This bit is set if the exception 1 handler was invoked 
due to the TF bit in the flag register being set (for 
single-stepping). See section 2.12.2. 


BT (debug trap due to task switch) 


This bit is set if the exception 1 handler was invoked 
due to a task switch occurring to a task having a 386 
DX TSS with the T bit set. (See Figure 4-15a). Note 
the task switch into the new task occurs normally, 
but before the first instruction of the task is execut- 
ed, the exception 1 handler is invoked. With respect 
to the task switch operation, the operation is consid- 
ered to be a trap. 
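
The handler logic implied by the DR6 field descriptions above can be sketched as follows. This is only an illustration: the function name is hypothetical, and the bit positions (B0-B3 in DR6 bits 0-3, BD/BS/BT in bits 13-15, Li/Gi in DR7 bits 2i and 2i+1) are assumed from the standard 386 register layout, which this section does not reproduce. Following the IMPORTANT NOTE, a Bi flag is reported only when the matching Li or Gi bit is set.

```python
# Assumed bit positions (standard 386 DR6 layout, not stated in this section).
DR6_BD = 1 << 13   # debug register access while GD=1
DR6_BS = 1 << 14   # single-step trap
DR6_BT = 1 << 15   # task switch trap


def debug_exception_causes(dr6, dr7):
    """Return the reasons the exception 1 handler was invoked (hypothetical helper)."""
    causes = []
    for i in range(4):
        hit = (dr6 >> i) & 1                 # Bi flag
        enabled = (dr7 >> (2 * i)) & 0b11    # Li is bit 2i, Gi is bit 2i+1 (assumed)
        if hit and enabled:                  # ignore Bi bits of disabled breakpoints
            causes.append(f"breakpoint {i}")
    if dr6 & DR6_BD:
        causes.append("debug register access with GD=1")
    if dr6 & DR6_BS:
        causes.append("single-step")
    if dr6 & DR6_BT:
        causes.append("task switch")
    return causes
```

Because the hardware never clears DR6, a real handler would write zero to DR6 after decoding it, as recommended above.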


2.12.3.4 USE OF RESUME FLAG (RF) IN FLAG REGISTER


The Resume Flag (RF) in the flag word can sup- 
press an instruction execution breakpoint when the 
exception 1 handler returns to a user program at a 
user address which is also an instruction execution 
breakpoint. See section 2.3.3. 


3. REAL MODE ARCHITECTURE 


3.1 REAL MODE INTRODUCTION 


When the processor is reset or powered up it is
initialized in Real Mode. Real Mode has the same base
architecture as the 8086, but allows access to the
32-bit register set of the 386 DX. The addressing
mechanism, memory size, and interrupt handling are all
identical to the Real Mode on the 80286.
[Diagram labels: segment base; memory operand; maximum limit fixed at 64K in Real Mode]

Figure 3-1. Real Address Mode Addressing


All of the 386 DX instructions are available in Real
Mode (except those instructions listed in 4.6.4). The
default operand size in Real Mode is 16 bits, just like
the 8086. In order to use the 32-bit registers and
addressing modes, override prefixes must be used.
In addition, the segment size on the 386 DX in Real
Mode is 64K bytes, so 32-bit effective addresses
must not exceed 0000FFFFH. The primary
purpose of Real Mode is to set up the processor for
Protected Mode operation.


The LOCK prefix on the 386 DX, even in Real Mode, 
is more restrictive than on the 80286. This is due to 
the addition of paging on the 386 DX in Protected 
Mode and Virtual 8086 Mode. Paging makes it im- 
possible to guarantee that repeated string instruc- 
tions can be LOCKed. The 386 DX can’t require that 
all pages holding the string be physically present in 
memory. Hence, a Page Fault (exception 14) might 
have to be taken during the repeated string instruc- 
tion. Therefore the LOCK prefix can’t be supported 
during repeated string instructions. 


These are the only instruction forms where the
LOCK prefix is legal on the 386 DX:

Opcode                               Operands (Dest, Source)

Bit Test and SET/RESET/COMPLEMENT    Mem, Reg/immed
XCHG                                 Reg, Mem
XCHG                                 Mem, Reg
ADD, OR, ADC, SBB, AND, SUB, XOR     Mem, Reg/immed
NOT, NEG, INC, DEC                   Mem

An exception 6 will be generated if a LOCK prefix is
placed before any instruction form or opcode not
listed above. The LOCK prefix allows indivisible
read/modify/write operations on memory operands
using the instructions above. For example, even the
ADD Reg, Mem is not LOCKable, because the Mem
operand is not the destination (and therefore no
memory read/modify/write operation is being performed).


Since, on the 386 DX, repeated string instructions
are not LOCKable, it is not possible to LOCK the bus
for a long period of time. Therefore, the LOCK prefix
is not IOPL-sensitive on the 386 DX. The LOCK prefix
can be used at any privilege level, but only on the
instruction forms listed above.
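
As a rough illustration of the rule above, the following sketch checks whether a LOCK prefix would be accepted for a given instruction form. The function name and the operand-kind strings are hypothetical. BTS, BTR and BTC are taken to be the bit test and set/reset/complement opcodes, and XCHG is assumed to be the opcode for the Reg, Mem and Mem, Reg rows of the table.

```python
# Opcode groups for the LOCKable instruction forms (illustrative transcription).
BIT_TEST_AND_MODIFY = {"BTS", "BTR", "BTC"}
TWO_OPERAND_ALU = {"ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR"}
ONE_OPERAND = {"NOT", "NEG", "INC", "DEC"}


def lock_is_legal(mnemonic, dest, src=None):
    """Return True if LOCK is legal for this form; False means exception 6.

    dest and src are operand kinds: "mem", "reg", "imm", or None.
    """
    m = mnemonic.upper()
    if m in BIT_TEST_AND_MODIFY or m in TWO_OPERAND_ALU:
        # The memory operand must be the destination.
        return dest == "mem" and src in ("reg", "imm")
    if m == "XCHG":
        # Either operand order, as long as one side is memory.
        return {dest, src} == {"mem", "reg"}
    if m in ONE_OPERAND:
        return dest == "mem" and src is None
    # Everything else, including repeated string instructions.
    return False
```

For example, `lock_is_legal("ADD", "reg", "mem")` is false because the memory operand is not the destination.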


3.2 MEMORY ADDRESSING 


In Real Mode the maximum memory size is limited to
1 megabyte. Thus, only address lines A2-A19 are
active. (Exception: the high address lines A20-A31
are high during CS-relative memory cycles until an
intersegment jump or call is executed (see section
2.10).)


Since paging is not allowed in Real Mode the linear
addresses are the same as physical addresses.
Physical addresses are formed in Real Mode by
adding the contents of the appropriate segment register,
shifted left by four bits, to an effective
address. This addition results in a physical address
from 00000000H to 0010FFEFH. This is compatible
with 80286 Real Mode. Since segment registers are
shifted left by 4 bits, Real Mode segments
always start on 16 byte boundaries.
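
The address calculation just described is small enough to write out. The sketch below is illustrative only (the function name is hypothetical); it shifts the 16-bit segment value left by four bits and adds the 16-bit effective address.

```python
def real_mode_physical(segment, offset):
    """Form a Real Mode physical address from a segment register value and an offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Segment register shifted left by four bits, plus the effective address.
    return (segment << 4) + offset
```

With both inputs at FFFFH the result is 0010FFEFH, which is the upper bound quoted above.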


All segments in Real Mode are exactly 64K bytes
long, and may be read, written, or executed. The 386
DX will generate an exception 13 if a data operand
or instruction fetch occurs past the end of a segment
(i.e. if an operand has an offset greater than
FFFFH, for example a word with a low byte at
FFFFH and the high byte at 0000H).
Segments may be overlapped in Real Mode. Thus, if
a particular segment does not use all 64K bytes,
another segment can be overlaid on top of the
unused portion of the previous segment. This allows
the programmer to minimize the amount of physical
memory needed for a program.


3.3 RESERVED LOCATIONS 


There are two fixed areas in memory which are
reserved in Real address mode: the system initialization
area and the interrupt table area. Locations 00000H
through 003FFH are reserved for interrupt vectors.
Each one of the 256 possible interrupts has a 4-byte
jump vector reserved for it. Locations FFFFFFF0H
through FFFFFFFFH are reserved for system
initialization.
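
The size of the interrupt table area follows directly from 256 vectors of 4 bytes each. The sketch below shows where a given vector lives and how it could be read from a memory image. The function names are hypothetical, and the entry layout (16-bit offset first, then 16-bit segment, both little-endian) is assumed from the 8086 convention rather than stated in this section.

```python
def vector_entry_address(n):
    """Address of the 4-byte jump vector for interrupt n (0-255)."""
    if not 0 <= n <= 255:
        raise ValueError("interrupt number must be 0-255")
    return 4 * n


def read_vector(memory, n):
    """Return (cs, ip) for interrupt n from a bytes-like memory image.

    Assumes the 8086 layout: offset word first, then segment word.
    """
    base = vector_entry_address(n)
    ip = int.from_bytes(memory[base:base + 2], "little")
    cs = int.from_bytes(memory[base + 2:base + 4], "little")
    return cs, ip
```

The last byte of vector 255 is at 4 * 255 + 3 = 003FFH, the end of the reserved range.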


3.4 INTERRUPTS 


Many of the exceptions shown in Table 2-5 and
discussed in section 2.9 are not applicable to Real
Mode operation. In particular, exceptions 10, 11, and 14
will not happen in Real Mode. Other exceptions
have slightly different meanings in Real Mode; Table
3-1 identifies these exceptions.


3.5 SHUTDOWN AND HALT 


The HLT instruction stops program execution and
prevents the processor from using the local bus until
restarted. Either NMI, INTR with interrupts enabled
(IF = 1), or RESET will force the 386 DX out of halt. If
interrupted, the saved CS:IP will point to the next
instruction after the HLT.


Shutdown will occur when a severe error is detected 
that prevents further processing. In Real Mode, 
shutdown can occur under two conditions: 


An interrupt or an exception occurs (exception 8
or 13) and the interrupt vector is larger than the
Interrupt Descriptor Table (i.e. there is no
interrupt handler for the interrupt).

A CALL, INT or PUSH instruction attempts to wrap
around the stack segment when SP is not even
(e.g. pushing a value on the stack when SP =
0001, resulting in a stack segment greater than
FFFFH).


An NMI input can bring the processor out of shutdown
if the Interrupt Descriptor Table limit is large
enough to contain the NMI interrupt vector (at least
0017H) and the stack has enough room to contain
the vector and flag information (i.e. SP is greater
than 0005H). Otherwise shutdown can only be exited
via the RESET input.


4. PROTECTED MODE 
ARCHITECTURE 


4.1 INTRODUCTION 


The complete capabilities of the 386 DX are unlocked
when the processor operates in Protected
Virtual Address Mode (Protected Mode). Protected
Mode vastly increases the linear address space to
four gigabytes (2^32 bytes) and allows the running of
virtual memory programs of almost unlimited size
(64 terabytes or 2^46 bytes). In addition, Protected
Mode allows the 386 DX to run all of the existing
8086 and 80286 software, while providing a sophisticated
memory management and a hardware-assisted
protection mechanism. Protected Mode allows
the use of additional instructions especially optimized
for supporting multitasking operating systems.
The base architecture of the 386 DX remains the
same: the registers, instructions, and addressing
modes described in the previous sections are retained.
The main difference between Protected
Mode and Real Mode from a programmer’s view is
the increased address space and a different
addressing mechanism.


Table 3-1

Function                          Interrupt   Related Instructions              Return Address
                                  Number                                        Location

Interrupt table limit too small   8           INT vector is not within          Before
                                              table limit                       Instruction

CS, DS, ES, FS, GS                13          Word memory reference beyond      Before
Segment overrun exception                     offset = FFFFH.                   Instruction
                                              An attempt to execute past
                                              the end of CS segment.

SS Segment overrun exception      12          Stack reference beyond            Before
                                              offset = FFFFH                    Instruction
4.2 ADDRESSING MECHANISM


Like Real Mode, Protected Mode uses two components
to form the logical address: a 16-bit selector is
used to determine the linear base address of a segment,
and the base address is added to a 32-bit effective
address to form a 32-bit linear address. The linear
address is then either used as the 32-bit physical
address, or if paging is enabled the paging mechanism
maps the 32-bit linear address into a 32-bit
physical address.


The difference between the two modes lies in calculating
the base address. In Protected Mode the selector
is used to specify an index into an operating
system defined table (see Figure 4-1). The table
contains the 32-bit base address of a given segment.
The physical address is formed by adding the
base address obtained from the table to the offset.

[Figure 4-1 diagram labels: 48/32-bit pointer (selector, offset); segment descriptor (access rights, limit, base address); 386 DX CPU paging mechanism; linear address]


Paging provides an additional memory management
mechanism which operates only in Protected Mode.
Paging provides a means of managing the very large
segments of the 386 DX. As such, paging operates
beneath segmentation. The paging mechanism
translates the protected linear address which comes
from the segmentation unit into a physical address.
Figure 4-2 shows the complete 386 DX addressing
mechanism with paging enabled.


[Diagram labels: segment limit; memory operand; selected segment; page frame address; physical address; physical memory divided into 4K-byte pages]

Figure 4-2. Paging and Segmentation
4.3 SEGMENTATION


4.3.1 Segmentation Introduction 


Segmentation is one method of memory manage- 
ment. Segmentation provides the basis for protec- 
tion. Segments are used to encapsulate regions of 
memory which have common attributes. For exam- 
ple, all of the code of a given program could be con- 
tained in a segment, or an operating system table 
may reside in a segment. All information about a 
segment is stored in an 8 byte data structure called 
a descriptor. All of the descriptors in a system are 
contained in tables recognized by hardware. 


4.3.2 Terminology 


The following terms are used throughout the discus- 
sion of descriptors, privilege levels and protection: 


PL: Privilege Level—One of the four hierarchical 
privilege levels. Level 0 is the most privileged level 
and level 3 is the least privileged. More privileged 
levels are numerically smaller than less privileged 
levels. 


RPL: Requestor Privilege Level—The privilege level 
of the original supplier of the selector. RPL is determined
by the two least significant bits of a selector.


DPL: Descriptor Privilege Level—This is the least 
privileged level at which a task may access that de- 
scriptor (and the segment associated with that de- 
scriptor). Descriptor Privilege Level is determined by 
bits 6:5 in the Access Right Byte of a descriptor. 


CPL: Current Privilege Level—The privilege level at 
which a task is currently executing, which equals the 
privilege level of the code segment being executed. 
CPL can also be determined by examining the lowest
2 bits of the CS register, except for conforming
code segments.


EPL: Effective Privilege Level. The effective privilege
level is the least privileged of the RPL and DPL.
Since smaller privilege level values indicate greater
privilege, EPL is the numerical maximum of RPL and
DPL.


Task: One instance of the execution of a program.
Tasks are also referred to as processes.
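
The RPL and EPL definitions above reduce to two one-line calculations. The sketch below is a minimal illustration with hypothetical function names: RPL is taken from the two least significant bits of a selector, and EPL is the numerical maximum of RPL and DPL.

```python
def rpl(selector):
    """Requestor Privilege Level: the two least significant bits of a selector."""
    return selector & 0b11


def epl(selector, dpl):
    """Effective Privilege Level: the least privileged (numerically largest) of RPL and DPL."""
    if not 0 <= dpl <= 3:
        raise ValueError("DPL must be 0-3")
    return max(rpl(selector), dpl)
```

For a selector of 0013H (RPL = 3) and a descriptor with DPL = 1, the EPL is 3.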
4.3.3 Descriptor Tables


4.3.3.1 DESCRIPTOR TABLES INTRODUCTION 


The descriptor tables define all of the segments
which are used in a 386 DX system. There are
three types of tables on the 386 DX which hold
descriptors: the Global Descriptor Table, Local
Descriptor Table, and the Interrupt Descriptor Table. All
of the tables are variable length memory arrays.
They can range in size between 8 bytes and 64K
bytes. Each table can hold up to 8192 8-byte
descriptors. The upper 13 bits of a selector are used as
an index into the descriptor table. The tables have
registers associated with them which hold the 32-bit
linear base address and the 16-bit limit of each table.


Each of the tables has a register associated with it:
the GDTR, LDTR, and the IDTR (see Figure 4-3).
The LGDT, LLDT, and LIDT instructions load the
base and limit of the Global, Local, and Interrupt
Descriptor Tables, respectively, into the appropriate
register. The SGDT, SLDT, and SIDT instructions
store the base and limit values. These tables are
manipulated by the operating system. Therefore, the
load descriptor table instructions are privileged
instructions.


4.3.3.2 GLOBAL DESCRIPTOR TABLE 


The Global Descriptor Table (GDT) contains
descriptors which are possibly available to all of the
tasks in a system. The GDT can contain any type of
segment descriptor except for descriptors which are
used for servicing interrupts (i.e. interrupt and trap
descriptors). Every 386 DX system contains a
GDT. Generally the GDT contains code and data
segments used by the operating systems and task
state segments, and descriptors for the LDTs in a
system.

[Diagram labels: LDT descriptor; program-invisible register portion]

Figure 4-3. Descriptor Table Registers


The first slot of the Global Descriptor Table
corresponds to the null selector and is not used. The null
selector defines a null pointer value.


4.3.3.3 LOCAL DESCRIPTOR TABLE 


LDTs contain descriptors which are associated with 
a given task. Generally, operating systems are de- 
signed so that each task has a separate LDT. The 
LDT may contain only code, data, stack, task gate, 
and call gate descriptors. LDTs provide a mecha- 
nism for isolating a given task’s code and data seg- 
ments from the rest of the operating system, while 
the GDT contains descriptors for segments which 
are common to all tasks. A segment cannot be accessed
by a task if its segment descriptor does not
exist in either the current LDT or the GDT. This
provides both isolation and protection for a task’s
segments, while still allowing global data to be shared
among tasks.


Unlike the 6 byte GDT or IDT registers which contain
a base address and limit, the visible portion of the
LDT register contains only a 16-bit selector. This
selector refers to a Local Descriptor Table descriptor in
the GDT.


4.3.3.4 INTERRUPT DESCRIPTOR TABLE 


The third table needed for 386 DX systems is the
Interrupt Descriptor Table. (See Figure 4-4.) The IDT
contains the descriptors which point to the location
of up to 256 interrupt service routines. The IDT
may contain only task gates, interrupt gates, and
trap gates. The IDT should be at least 256 bytes in
size in order to hold the descriptors for the 32 Intel
Reserved Interrupts. Every interrupt used by a system
must have an entry in the IDT. The IDT entries
are referenced via INT instructions, external interrupt
vectors, and exceptions. (See 2.9 Interrupts).

Figure 4-4. Interrupt Descriptor Table Register Use


4.3.4 Descriptors

4.3.4.1 DESCRIPTOR ATTRIBUTE BITS


The object to which the segment selector points
is called a descriptor. Descriptors are eight byte
quantities which contain attributes about a given
region of linear address space (i.e. a segment). These
attributes include the 32-bit base linear address of
the segment, the 20-bit length and granularity of the
segment, the protection level, read, write or execute
privileges, the default size of the operands (16-bit or
32-bit), and the type of segment. All of the attribute
information about a segment is contained in 12 bits
in the segment descriptor. Figure 4-5 shows the general
format of a descriptor. All segments on the 386
DX have three attribute fields in common: the P bit,
the DPL field, and the S bit. The Present (P) bit is 1 if
the segment is loaded in physical memory; if P=0
then any attempt to access this segment causes a
not present exception (exception 11). The Descriptor
Privilege Level (DPL) is a two-bit field which specifies
the protection level (0-3) associated with a segment.


[Diagram: 8-byte segment descriptor, byte addresses 0 and +4, with fields SEGMENT BASE 15...0 and SEGMENT LIMIT 15...0]

BASE   Base Address of the segment
LIMIT  The length of the segment
P      Present Bit 1=Present 0=Not Present
DPL    Descriptor Privilege Level 0-3
S      Segment Descriptor 0=System Descriptor 1=Code or Data Segment Descriptor
TYPE   Type of Segment
A      Accessed Bit
G      Granularity Bit 1=Segment length is page granular
       0=Segment length is byte granular
D      Default Operation Size (recognized in code segment descriptors only) 1=32-bit segment 0=16-bit segment
0      Bit must be zero (0) for compatibility with future processors
AVL    Available field for user or OS

NOTE:
In a maximum-size segment (i.e. a segment with G=1 and segment limit 19...0=FFFFFH), the lowest 12 bits of the
segment base should be zero (i.e. segment base 11...000 = 000H).


Figure 4-5. Segment Descriptors


The 386 DX has two main categories of segments:
system segments and non-system segments (for
code and data). The segment S bit in the segment
descriptor determines if a given segment is a system
segment or a code or data segment. If the S bit is 1
then the segment is either a code or data segment; if
it is 0 then the segment is a system segment.
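
To make the field descriptions concrete, the sketch below unpacks an 8-byte descriptor into the attributes named above. It is an illustration, not a reference decoder: the function name is hypothetical, and the byte layout (limit 15...0 in bytes 0-1, base 23...0 in bytes 2-4, access rights in byte 5, limit 19...16 with AVL, D/B and G in byte 6, base 31...24 in byte 7) is assumed from the standard 386 descriptor format, since the figure itself did not survive in this copy.

```python
def decode_segment_descriptor(raw):
    """Unpack an 8-byte 386 segment descriptor (assumed standard layout)."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    # 20-bit limit: low 16 bits in bytes 0-1, high 4 bits in byte 6.
    limit = int.from_bytes(raw[0:2], "little") | ((raw[6] & 0x0F) << 16)
    # 32-bit base: low 24 bits in bytes 2-4, high 8 bits in byte 7.
    base = int.from_bytes(raw[2:5], "little") | (raw[7] << 24)
    access = raw[5]
    return {
        "base": base,
        "limit": limit,
        "present": bool(access & 0x80),   # P bit
        "dpl": (access >> 5) & 0b11,      # Descriptor Privilege Level
        "s": (access >> 4) & 1,           # 1 = code or data, 0 = system
        "type": access & 0x0F,            # type field, including the A bit
        "avl": (raw[6] >> 4) & 1,
        "d_b": (raw[6] >> 6) & 1,
        "g": (raw[6] >> 7) & 1,
    }
```

As a usage example, the bytes FF FF 00 00 00 9A CF 00 decode to a present, DPL 0, page-granular code segment with base 0 and limit FFFFFH.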


4.3.4.2 386™ DX CODE, DATA DESCRIPTORS 
(S=1) 


Figure 4-6 shows the general format of a code and 
data descriptor and Table 4-1 illustrates how the bits 
in the Access Rights Byte are interpreted. 


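[Diagram: code and data segment descriptor, showing BASE 31...24 and the access rights byte]

D/B    1=Default Instruction Attributes are 32 bits
       0=Default Instruction Attributes are 16 bits
AVL    Available field for user or OS
G      Granularity Bit 1=Segment length is page granular
       0=Segment length is byte granular
0      Bit must be zero (0) for compatibility with future processors

NOTE:
In a maximum-size segment (i.e. a segment with G=1 and segment limit 19...0=FFFFFH), the lowest 12 bits of the
segment base should be zero (i.e. segment base 11...000 = 000H).


Figure 4-6. Segment Descriptors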


Table 4-1. Access Rights Byte Definition for Code and Data Descriptors

Bit       Name                        Function
Position

7         Present (P)                 P = 1   Segment is mapped into physical memory.
                                      P = 0   No mapping to physical memory exists, base and limit are
                                              not used.
6-5       Descriptor Privilege        Segment privilege attribute used in privilege tests.
          Level (DPL)
4         Segment Descriptor (S)      S = 1   Code or Data (includes stacks) segment descriptor
                                      S = 0   System Segment Descriptor or Gate Descriptor

If Data Segment (S = 1, E = 0):

3         Executable (E)              E = 0   Descriptor type is data segment.
2         Expansion Direction (ED)    ED = 0  Expand up segment, offsets must be <= limit.
                                      ED = 1  Expand down segment, offsets must be > limit.
1         Writeable (W)               W = 0   Data segment may not be written into.
                                      W = 1   Data segment may be written into.

If Code Segment (S = 1, E = 1):

3         Executable (E)              E = 1   Descriptor type is code segment.
2         Conforming (C)              C = 1   Code segment may only be executed when CPL >= DPL and CPL
                                              remains unchanged.
1         Readable (R)                R = 0   Code segment may not be read.
                                      R = 1   Code segment may be read.

0         Accessed (A)                A = 0   Segment has not been accessed.
                                      A = 1   Segment selector has been loaded into segment register
                                              or used by selector test instructions.
Code and data segments have several descriptor
fields in common. The accessed A bit is set whenever
the processor accesses a descriptor. The A bit is
used by operating systems to keep usage statistics
on a given segment. The G bit, or granularity bit,
specifies if a segment length is byte-granular or
page-granular. 386 DX segments can be one megabyte
long with byte granularity (G=0) or four gigabytes
with page granularity (G=1), (i.e., 2^20 pages,
each page is 4K bytes in length). The granularity is
totally unrelated to paging. A 386 DX system can
consist of segments with byte granularity, and page
granularity, whether or not paging is enabled.
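
The two maximum sizes quoted above follow from how the 20-bit limit is scaled. The sketch below is a hedged illustration for expand-up segments; the function names are hypothetical, and the scaling rule for G=1 (the limit counts 4K-byte pages, with the low 12 bits of the last valid offset taken as ones) is assumed from standard 386 behavior rather than stated in this section.

```python
def last_valid_offset(limit20, g):
    """Highest valid offset of an expand-up segment for a 20-bit limit field."""
    if not 0 <= limit20 <= 0xFFFFF:
        raise ValueError("limit field is 20 bits")
    if g:
        # Page granular: limit counts 4K-byte units (assumed scaling rule).
        return (limit20 << 12) | 0xFFF
    return limit20


def segment_size_bytes(limit20, g):
    """Segment length in bytes implied by the limit and granularity."""
    return last_valid_offset(limit20, g) + 1
```

With the limit field at FFFFFH this gives 2^20 bytes (one megabyte) for G=0 and 2^32 bytes (four gigabytes) for G=1.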


The executable E bit tells if a segment is a code or 
data segment. A code segment (E= 1, S= 1) may be 
execute-only or execute/read as determined by the 
Read R bit. Code segments are execute only if 
R=0, and execute/read if R=1. Code segments 
may never be written into. 


NOTE:

Code segments may be modified via aliases. Aliases
are writeable data segments which occupy the
same range of linear address space as the code
segment.


The D bit indicates the default length for operands
and effective addresses. If D=1 then 32-bit operands
and 32-bit addressing modes are assumed. If
D=0 then 16-bit operands and 16-bit addressing
modes are assumed. Therefore all existing 80286
code segments will execute on the 386 DX assuming
the D bit is set to 0.


Another attribute of code segments is determined by
the conforming C bit. Conforming segments, C=1,
can be executed and shared by programs at different
privilege levels. (See section 4.4 Protection.)
Segments identified as data segments (E=0, S=1)
are used for two types of 386 DX segments: stack
and data segments. The expansion direction (ED) bit
specifies if a segment expands downward (stack) or
upward (data). If a segment is a stack segment all
offsets must be greater than the segment limit. On a
data segment all offsets must be less than or equal
to the limit. In other words, stack segments start at
the base linear address plus the maximum segment
limit and grow down to the base linear address plus
the limit. On the other hand, data segments start at
the base linear address and expand to the base linear
address plus limit.


The write W bit controls the ability to write into a 
segment. Data segments are read-only if W=0. The 
stack segment must have W=1. 


The B bit controls the size of the stack pointer register.
If B=1, then PUSHes, POPs, and CALLs all use
the 32-bit ESP register for stack references and assume
an upper limit of FFFFFFFFH. If B=0, stack
instructions all use the 16-bit SP register and assume
an upper limit of FFFFH.
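
The offset rules for expand-up and expand-down segments, together with the upper limit selected by the B bit, can be summarized in a short check. This is a simplified sketch with a hypothetical function name: it tests a single byte offset against a limit already expressed in bytes, and it does not model multi-byte operands that straddle the limit.

```python
def offset_in_segment(offset, limit, expand_down=False, b=0):
    """Return True if a byte offset is valid for the segment.

    limit is the segment limit in bytes. For expand-down segments the
    upper bound is FFFFFFFFH when B=1 and FFFFH when B=0.
    """
    if expand_down:
        upper = 0xFFFFFFFF if b else 0xFFFF
        # Stack (expand-down): offsets must be greater than the limit.
        return limit < offset <= upper
    # Data (expand-up): offsets must be less than or equal to the limit.
    return 0 <= offset <= limit
```

The same limit value therefore describes complementary ranges: with a limit of 0FFFH, offset 0FFFH is valid in an expand-up segment and invalid in an expand-down one.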


4.3.4.3 SYSTEM DESCRIPTOR FORMATS 


System segments describe information about operating
system tables, tasks, and gates. Figure 4-7
shows the general format of system segment
descriptors, and the various types of system segments.
386 DX system descriptors contain a 32-bit base linear
address and a 20-bit segment limit. 80286 system
descriptors have a 24-bit base address and a
16-bit segment limit. 80286 system descriptors are
identified by the upper 16 bits being all zero.


[Diagram: system segment descriptor, showing SEGMENT BASE 15...0 and SEGMENT LIMIT 15...0]

Type   Defines                                  Type   Defines

0      Invalid                                  8      Invalid
1      Available 80286 TSS                      9      Available 386™ DX TSS
2      LDT                                      A      Undefined (Intel Reserved)
3      Busy 80286 TSS                           B      Busy 386™ DX TSS
4      80286 Call Gate                          C      386™ DX Call Gate
5      Task Gate (for 80286 or 386™ DX Task)    D      Undefined (Intel Reserved)
6      80286 Interrupt Gate                     E      386™ DX Interrupt Gate
7      80286 Trap Gate                          F      386™ DX Trap Gate

NOTE:
In a maximum-size segment (i.e. a segment with G=1 and segment limit 19...0=FFFFFH), the lowest 12 bits of the
segment base should be zero (i.e. segment base 11...000=000H).


Figure 4-7. System Segments Descriptors
4.3.4.4 LDT DESCRIPTORS (S=0, TYPE = 2)


LDT descriptors (S=0 TYPE=2) contain informa- 
tion about Local Descriptor Tables. LDTs contain a 
table of segment descriptors, unique to a particular 
task. Since the instruction to load the LDTR is only 
available at privilege level 0, the DPL field is ignored. 
LDT descriptors are only allowed in the Global De- 
scriptor Table (GDT). 


4.3.4.5 TSS DESCRIPTORS (S=0, 
TYPE = 1, 3, 9, B) 


A Task State Segment (TSS) descriptor contains in- 
formation about the location, size, and privilege level 
of a Task State Segment (TSS). A TSS in turn is a 
special fixed format segment which contains all the 
state information for a task and a linkage field to 
permit nesting tasks. The TYPE field is used to indi- 
cate whether the task is currently BUSY (i.e. on a 
chain of active tasks) or the TSS is available. The 
TYPE field also indicates if the segment contains an
80286 or a 386 DX TSS. The Task Register (TR)
contains the selector which points to the current 
Task State Segment. 


4.3.4.6 GATE DESCRIPTORS (S=0, 
TYPE = 4-7, C, F) 


Gates are used to control access to entry points
within the target code segment. The various types of
gate descriptors are call gates, task gates,
interrupt gates, and trap gates. Gates provide a
level of indirection between the source and destination
of the control transfer. This indirection allows
the processor to automatically perform protection
checks. It also allows system designers to control
entry points to the operating system. Call gates are
used to change privilege levels (see section 4.4
Protection), task gates are used to perform a task
switch, and interrupt and trap gates are used to
specify interrupt service routines.


Figure 4-8 shows the format of the four types of gate
descriptors. Call gates are primarily used to transfer
program control to a more privileged level. The call
gate descriptor consists of three fields: the access
byte, a long pointer (selector and offset) which
points to the start of a routine, and a word count
which specifies how many parameters are to be copied
from the caller’s stack to the stack of the called
routine. The word count field is only used by call
gates when there is a change in the privilege level;
other types of gates ignore the word count field.


Interrupt and trap gates use the destination selector 
and destination offset fields of the gate descriptor as 
a pointer to the start of the interrupt or trap handler 
routines. The difference between interrupt gates and 
trap gates is that the interrupt gate disables inter- 
rupts (resets the IF bit) while the trap gate does not. 


[Diagram: gate descriptor, showing OFFSET 15...0, SELECTOR, WORD COUNT, TYPE, and OFFSET 31...16 at byte addresses 0 and +4]

Gate Descriptor Fields

Name          Value    Description

TYPE          4        80286 call gate
              5        Task gate (for 80286 or 386™ DX task)
              6        80286 interrupt gate
              7        80286 trap gate
              C        386™ DX call gate
              E        386™ DX interrupt gate
              F        386™ DX trap gate

P             0        Descriptor contents are not valid
              1        Descriptor contents are valid

DPL: least privileged level at which a task may access the gate. WORD COUNT (0-31): the number of parameters to copy from caller’s stack
to the called procedure’s stack. The parameters are 32-bit quantities for 386™ DX gates, and 16-bit quantities for 80286 gates.

DESTINATION   16-bit            Selector to the target code segment
SELECTOR      selector          or
                                Selector to the target task state segment for task gate

DESTINATION   offset            Entry point within the target code segment
OFFSET        16-bit 80286
              32-bit 386™ DX


Figure 4-8. Gate Descriptor Formats
Task gates are used to switch tasks. Task gates
may only refer to a task state segment (see section
4.4.6 Task Switching); therefore only the destination
selector portion of a task gate descriptor is used,
and the destination offset is ignored.


Exception 13 is generated when a destination selector
does not refer to a correct descriptor type, i.e., a
code segment for an interrupt, trap or call gate, a
TSS for a task gate.


The access byte format is the same for all gate
descriptors. P=1 indicates that the gate contents are
valid. P=0 indicates the contents are not valid and
causes exception 11 if referenced. DPL is the
descriptor privilege level and specifies when this
descriptor may be used by a task (see section 4.4
Protection). The S field, bit 4 of the access rights
byte, must be 0 to indicate a system control descriptor.
The type field specifies the descriptor type as
indicated in Figure 4-8.
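
The gate fields described in this section can be pulled out of the 8-byte descriptor as sketched below. The function name is hypothetical, and the byte layout (offset 15...0 in bytes 0-1, selector in bytes 2-3, word count in the low 5 bits of byte 4, access byte in byte 5, offset 31...16 in bytes 6-7) is assumed from the standard 386 gate format. For 80286 gates the upper word is zero, so the same code yields a 16-bit offset.

```python
GATE_TYPES = {
    0x4: "80286 call gate",
    0x5: "task gate",
    0x6: "80286 interrupt gate",
    0x7: "80286 trap gate",
    0xC: "386 DX call gate",
    0xE: "386 DX interrupt gate",
    0xF: "386 DX trap gate",
}


def decode_gate_descriptor(raw):
    """Unpack an 8-byte gate descriptor (assumed standard layout)."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    access = raw[5]
    if access & 0x10:
        raise ValueError("S bit must be 0 for a gate descriptor")
    gate_type = access & 0x0F
    if gate_type not in GATE_TYPES:
        raise ValueError("type field does not describe a gate")
    offset_low = int.from_bytes(raw[0:2], "little")
    offset_high = int.from_bytes(raw[6:8], "little")
    return {
        "kind": GATE_TYPES[gate_type],
        "selector": int.from_bytes(raw[2:4], "little"),
        "offset": offset_low | (offset_high << 16),  # ignored for task gates
        "word_count": raw[4] & 0x1F,                 # used by call gates only
        "present": bool(access & 0x80),
        "dpl": (access >> 5) & 0b11,
    }
```

A descriptor with access byte ECH, for instance, decodes as a present 386 DX call gate with DPL 3.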


4.3.4.7 DIFFERENCES BETWEEN 386™ DX AND 
80286 DESCRIPTORS 


In order to provide operating system compatibility
between the 80286 and 386 DX, the 386 DX supports
all of the 80286 segment descriptors. Figure
4-9 shows the general format of an 80286 system
segment descriptor. The only differences between
80286 and 386 DX descriptor formats are the
values of the type fields and the expanded limit and
base address fields of the 386 DX.
The 80286 system segment descriptors contained a
24-bit base address and 16-bit limit, while the 386
DX system segment descriptors have a 32-bit base
address, a 20-bit limit field, and a granularity bit.


By supporting 80286 system segments the 386 DX
is able to execute 80286 application programs on a
386 DX operating system. This is possible because
the processor automatically understands which
descriptors are 80286-style descriptors and which
descriptors are 386 DX-style descriptors. In particular,
if the upper word of a descriptor is zero, then that
descriptor is an 80286-style descriptor.

[Figure 4-9 diagram: 80286 system segment descriptor, showing SEGMENT LIMIT 15...0]

BASE   Base Address of the segment
LIMIT  The length of the segment
P      Present Bit 1=Present 0=Not Present


The only other differences between 80286-style
descriptors and 386 DX descriptors are the
interpretation of the word count field of call gates
and the B bit. The word count field specifies the
number of 16-bit quantities to copy for 80286 call
gates and 32-bit quantities for 386 DX call gates. The
B bit controls the size of PUSHes when using a call
gate; if B=0 PUSHes are 16 bits, if B=1 PUSHes are
32 bits.
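
The descriptor differences described above can be
illustrated with a short sketch. It is written in
Python purely for illustration and assumes the usual
8-byte in-memory layout of a code or data segment
descriptor (limit 15..0, base 23..0, access rights
byte, then the 386 DX extensions in the upper word).
The function name and the dictionary keys are
hypothetical.

```python
def decode_segment_descriptor(raw: bytes) -> dict:
    """Decode one 8-byte code/data segment descriptor as stored in memory."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes long")
    limit = raw[0] | (raw[1] << 8)                   # limit bits 15..0
    base = raw[2] | (raw[3] << 8) | (raw[4] << 16)   # base bits 23..0
    access = raw[5]                                  # access rights byte
    upper_word = raw[6] | (raw[7] << 8)
    is_286_style = upper_word == 0                   # rule quoted in the text
    limit |= (raw[6] & 0x0F) << 16                   # limit bits 19..16 (386 DX)
    base |= raw[7] << 24                             # base bits 31..24 (386 DX)
    return {
        "style": "80286" if is_286_style else "386 DX",
        "base": base,
        "limit": limit,
        "granularity": (raw[6] >> 7) & 1,            # G bit (assumed position)
        "present": (access >> 7) & 1,                # P
        "dpl": (access >> 5) & 3,                    # DPL
        "s": (access >> 4) & 1,                      # S: 0=system, 1=user
        "type": access & 0x0F,                       # TYPE
    }
```

An all-zero upper word leaves the 24-bit base and
16-bit limit untouched, which is why an 80286
descriptor decodes correctly through the same path.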


4.3.4.8 SELECTOR FIELDS 


A selector in Protected Mode has three fields: Local
or Global Descriptor Table Indicator (TI), Descriptor
Entry Index (Index), and Requestor (the selector’s)
Privilege Level (RPL) as shown in Figure 4-10. The
TI bit selects one of two memory-based tables of
descriptors (the Global Descriptor Table or the Local
Descriptor Table). The Index selects one of 8K
descriptors in the appropriate descriptor table. The
RPL bits allow high speed testing of the selector’s
privilege attributes.
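
As a small illustration of the three selector fields,
the sketch below splits a 16-bit selector into Index,
TI and RPL. It is Python written only for this
example; the function name and returned keys are
hypothetical, and TI=0 is assumed to mean the Global
Descriptor Table.

```python
def decode_selector(selector: int) -> dict:
    """Split a 16-bit Protected Mode selector into its three fields."""
    selector &= 0xFFFF
    index = selector >> 3          # 13 bits: one of 8K descriptors
    ti = (selector >> 2) & 1       # table indicator (assumed 0=GDT, 1=LDT)
    rpl = selector & 3             # requestor privilege level
    return {
        "index": index,
        "table": "LDT" if ti else "GDT",
        "rpl": rpl,
        "byte_offset": index * 8,  # descriptors are 8 bytes each
    }
```

Because the three low bits are not part of the index,
masking them off gives the byte offset of the
descriptor inside its table directly.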


4.3.4.9 SEGMENT DESCRIPTOR CACHE


In addition to the selector value, every segment
register has a segment descriptor cache register
associated with it. Whenever a segment register’s
contents are changed, the 8-byte descriptor associated
with that selector is automatically loaded (cached)
on the chip. Once loaded, all references to that
segment use the cached descriptor information instead
of reaccessing the descriptor. The contents of the
descriptor cache are not visible to the programmer.


Since descriptor caches only change when a seg- 
ment register is changed, programs which modify 
the descriptor tables must reload the appropriate 
segment registers after changing a descriptor’s 
value. 


DPL    Descriptor Privilege Level 0-3
S      System Descriptor 0=System 1=User
TYPE   Type of Segment


Figure 4-9. 80286 Code and Data Segment Descriptors


[Figure: a selector held in a segment register; the
table indicator picks the descriptor table and the
index picks the descriptor number within it.]

Figure 4-10. Example Descriptor Selection


4.3.4.10 SEGMENT DESCRIPTOR REGISTER
SETTINGS


The contents of the segment descriptor cache vary
depending on the mode the 386 DX is operating in.
When operating in Real Address Mode, the segment
base, limit, and other attributes within the segment
cache registers are defined as shown in Figure 4-11.

For compatibility with the 8086 architecture, the base
is set to sixteen times the current selector value, the
limit is fixed at 0000FFFFH, and the attributes are
fixed so as to indicate the segment is present and
fully usable. In Real Address Mode, the internal
“privilege level” is always fixed to the highest level,
level 0, so I/O and other privileged opcodes may be
executed.


SEGMENT DESCRIPTOR CACHE REGISTER CONTENTS

          32-BIT BASE                   32-BIT LIMIT
          (UPDATED DURING SELECTOR      (FIXED)
SEGMENT   LOAD INTO SEGMENT REGISTER)
CS        16X CURRENT CS SELECTOR*      0000FFFFH
SS        16X CURRENT SS SELECTOR       0000FFFFH
DS        16X CURRENT DS SELECTOR       0000FFFFH
ES        16X CURRENT ES SELECTOR       0000FFFFH
FS        16X CURRENT FS SELECTOR       0000FFFFH
GS        16X CURRENT GS SELECTOR       0000FFFFH

OTHER ATTRIBUTES (FIXED): CONFORMING PRIVILEGE,
STACK SIZE, EXECUTABLE

*Except the 32-bit CS base is initialized to FFFFF000H
after reset until first intersegment control transfer
(e.g. intersegment CALL, or intersegment JMP, or INT).
(See Figure 4-13 Example.)

Key: Y = yes; N = no. Other entries give the privilege
level (0, 1, 2 or 3), expand up or expand down, byte or
page granularity, push/pop of 16-bit words or 32-bit
dwords, or mark an attribute that does not apply to
that segment cache register.


Figure 4-11. Segment Descriptor Caches for Real Address Mode
(Segment Limit and Attributes are Fixed)
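
The fixed Real Address Mode settings can be sketched
as follows. This is an illustrative Python model
only, with a hypothetical function name; it ignores
the CS-after-reset exception noted in the footnote and
simply reports an out-of-limit offset as an error.

```python
REAL_MODE_LIMIT = 0x0000FFFF   # fixed limit from Figure 4-11

def real_mode_linear_address(selector: int, offset: int) -> int:
    """Form a linear address from a Real Address Mode selector:offset pair."""
    base = (selector & 0xFFFF) * 16        # base = 16 x current selector
    if not 0 <= offset <= REAL_MODE_LIMIT:
        raise ValueError("offset exceeds the fixed 0000FFFFH segment limit")
    return base + offset
```

For example, selector 1234H with offset 0010H yields
linear address 12350H.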
When operating in Protected Mode, the segment
base, limit, and other attributes within the segment
cache registers are defined as shown in Figure 4-12.
In Protected Mode, each of these fields is defined
according to the contents of the segment descriptor
indexed by the selector value loaded into the
segment register.


SEGMENT DESCRIPTOR CACHE REGISTER CONTENTS

32-BIT BASE, 32-BIT LIMIT and OTHER ATTRIBUTES are
each UPDATED DURING SELECTOR LOAD INTO SEGMENT
REGISTER. For every segment register: BASE PER SEG
DESCR, LIMIT PER SEG DESCR.

Attribute columns: CONFORMING PRIVILEGE, WRITEABLE,
READABLE, EXPANSION DIRECTION, GRANULARITY, ACCESSED,
PRIVILEGE LEVEL

Key:
Y = fixed yes
N = fixed no
d = per segment descriptor
p = per segment descriptor; descriptor must indicate
    “present” to avoid exception 11 (exception 12 in
    case of SS)
r = per segment descriptor, but descriptor must
    indicate “readable” to avoid exception 13
    (special case for SS)
w = per segment descriptor, but descriptor must
    indicate “writable” to avoid exception 13
    (special case for SS)
- = does not apply to that segment cache register


Figure 4-12. Segment Descriptor Caches for Protected Mode (Loaded per Descriptor)
When operating in a Virtual 8086 Mode within the
Protected Mode, the segment base, limit, and other
attributes within the segment cache registers are
defined as shown in Figure 4-13. For compatibility with
the 8086 architecture, the base is set to sixteen
times the current selector value, the limit is fixed at
0000FFFFH, and the attributes are fixed so as to
indicate the segment is present and fully usable. The
virtual program executes at lowest privilege level,
level 3, to allow trapping of all IOPL-sensitive
instructions and level-0-only instructions.


SEGMENT DESCRIPTOR CACHE REGISTER CONTENTS

32-BIT BASE (UPDATED DURING SELECTOR LOAD INTO
SEGMENT REGISTER), 32-BIT LIMIT (FIXED), OTHER
ATTRIBUTES (FIXED)

Attribute columns: CONFORMING PRIVILEGE, STACK SIZE,
EXECUTABLE, WRITEABLE, READABLE, EXPANSION DIRECTION,
GRANULARITY, ACCESSED, PRIVILEGE LEVEL

Key entries: privilege level 0, 1, 2 or 3; expand up;
byte granularity; page granularity; push/pop 16-bit
words; push/pop 32-bit dwords; does not apply to that
segment cache register.


Figure 4-13. Segment Descriptor Caches for Virtual 8086 Mode within Protected Mode
(Segment Limit and Attributes are Fixed)
4.4 PROTECTION

4.4.1 Protection Concepts


[Figure: four nested privilege levels. Labels:
APPLICATIONS; OS EXTENSIONS; CPU ENFORCED SOFTWARE
INTERFACES; HIGH SPEED OPERATING SYSTEM INTERFACE.]

Figure 4-14. Four-Level Hierarchical Protection


The 386 DX has four levels of protection which are
optimized to support the needs of a multi-tasking
operating system to isolate and protect user programs
from each other and the operating system. The
privilege levels control the use of privileged
instructions, I/O instructions, and access to segments
and segment descriptors. Unlike traditional
microprocessor-based systems, where this protection is
achieved only through the use of complex external
hardware and software, the 386 DX provides the
protection as part of its integrated Memory
Management Unit. The 386 DX offers an additional type
of protection on a page basis when paging is enabled
(see section 4.5.3 Page Level Protection).


The four-level hierarchical privilege system is illus- 
trated in Figure 4-14. It is an extension of the user/ 
supervisor privilege mode commonly used by mini- 
computers and, in fact, the user/supervisor mode is 
fully supported by the 386 DX paging mechanism. 
The privilege levels (PL) are numbered 0 through 3. 
Level 0 is the most privileged or trusted level. 


4.4.2 Rules of Privilege 


The 386 DX controls access to both data and
procedures between levels of a task, according to the
following rules.


• Data stored in a segment with privilege level p can
  be accessed only by code executing at a privilege
  level at least as privileged as p.


• A code segment/procedure with privilege level p
  can only be called by a task executing at the same
  or a lesser privilege level than p.
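
Because level 0 is the most privileged, "at least as
privileged" means a numerically smaller or equal
level. The two rules can be written as the following
comparisons, shown as an illustrative Python sketch
with hypothetical function names.

```python
def may_access_data(cpl: int, data_level: int) -> bool:
    """Rule 1: code must be at least as privileged as the data it touches."""
    return cpl <= data_level          # smaller number = more privilege

def may_call(cpl: int, code_level: int) -> bool:
    """Rule 2: a procedure is callable from the same or a lesser privilege."""
    return cpl >= code_level
```

Note that the two rules point in opposite directions:
data access flows downward in privilege, while calls
flow upward.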
4.4.3 Privilege Levels


4.4.3.1 TASK PRIVILEGE


At any point in time, a task on the 386 DX always
executes at one of the four privilege levels. The
Current Privilege Level (CPL) specifies the task’s
privilege level. A task’s CPL may only be changed by
control transfers through gate descriptors to a code
segment with a different privilege level. (See section
4.4.4 Privilege Level Transfers.) Thus, an application
program running at PL = 3 may call an operating
system routine at PL = 1 (via a gate) which
would cause the task’s CPL to be set to 1 until the
operating system routine was finished.


4.4.3.2 SELECTOR PRIVILEGE (RPL) 


The privilege level of a selector is specified by the 
RPL field. The RPL is the two least significant bits of 
the selector. The selector’s RPL is only used to es- 
tablish a less trusted privilege level than the current 
privilege level for the use of a segment. This level is 
called the task’s effective privilege level (EPL). The 
EPL is defined as being the least privileged (i.e. nu- 
merically larger) level of a task’s CPL and a selec- 
tor’s RPL. Thus, if selector’s RPL = 0 then the CPL 
always specifies the privilege level for making an ac- 
cess using the selector. On the other hand if RPL = 
3 then a selector can only access segments at level 
3 regardless of the task’s CPL. The RPL is most 
commonly used to verify that pointers passed to an 
operating system procedure do not access data that 
is of higher privilege than the procedure that origi- 
nated the pointer. Since the originator of a selector 
can specify any RPL value, the Adjust RPL (ARPL) 
instruction is provided to force the RPL bits to the 
originator’s CPL. 
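
The definition of the effective privilege level
reduces to a single comparison. The sketch below is
illustrative Python with a hypothetical function
name.

```python
def effective_privilege_level(cpl: int, rpl: int) -> int:
    """EPL is the least privileged (numerically larger) of CPL and RPL."""
    return max(cpl & 3, rpl & 3)
```

With RPL = 0 the result is always the CPL, and with
RPL = 3 the result is always 3, matching the two
cases described above.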


4.4.3.3 I/O PRIVILEGE AND I/O PERMISSION
BITMAP


The I/O privilege level (IOPL, a 2-bit field in the
EFLAG register) defines the least privileged level at
which I/O instructions can be unconditionally
performed. I/O instructions can be unconditionally
performed when CPL ≤ IOPL. (The I/O instructions are
IN, OUT, INS, OUTS, REP INS, and REP OUTS.)
When CPL > IOPL, and the current task is associated
with a 286 TSS, attempted I/O instructions cause
an exception 13 fault. When CPL > IOPL, and the
current task is associated with a 386 DX TSS, the
I/O Permission Bitmap (part of a 386 DX TSS) is
consulted on whether I/O to the port is allowed, or
an exception 13 fault is to be generated instead. For
diagrams of the I/O Permission Bitmap, refer to
Figures 4-15a and 4-15b. For further information on
how the I/O Permission Bitmap is used in Protected
Mode or in Virtual 8086 Mode, refer to section 4.6.4
Protection and I/O Permission Bitmap.

The I/O privilege level (IOPL) also affects whether
several other instructions can be executed or cause
an exception 13 fault instead. These instructions are
called “IOPL-sensitive” instructions and they are
CLI and STI. (Note that the LOCK prefix is not
IOPL-sensitive on the 386 DX.)


The IOPL also affects whether the IF (interrupts
enable flag) bit can be changed by loading a value into
the EFLAGS register. When CPL ≤ IOPL, the
IF bit can be changed by loading a new value into
the EFLAGS register. When CPL > IOPL, the IF bit
cannot be changed by a new value POP’ed into (or
otherwise loaded into) the EFLAGS register; the IF
bit merely remains unchanged and no exception is
generated.
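
The I/O decision described above can be sketched as
follows. This is an illustrative Python model, not a
description of the hardware sequence: the function
and parameter names are hypothetical, one bitmap bit
per port is assumed with 0 meaning "allowed", and a
port beyond the end of the supplied bitmap is treated
as denied.

```python
def io_allowed(cpl: int, iopl: int, port: int, width: int = 1,
               tss_is_386: bool = True, bitmap: bytes = b"") -> bool:
    """Return True if the I/O access proceeds, False for exception 13."""
    if cpl <= iopl:
        return True                    # unconditionally permitted
    if not tss_is_386:
        return False                   # 286 TSS: always exception 13
    for p in range(port, port + width):    # every byte of the access
        byte_index, bit = divmod(p, 8)
        if byte_index >= len(bitmap):
            return False               # assumed: outside bitmap = denied
        if (bitmap[byte_index] >> bit) & 1:
            return False               # assumed: bit set = denied
    return True
```

A word or dword access is modeled as touching
several consecutive ports, so every corresponding bit
must permit the access.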


Table 4-2. Pointer Test Instructions

Instruction  Operands    Function
-----------  ----------  ------------------------------
ARPL         Selector,   Adjust Requested Privilege
             Register    Level: adjusts the RPL of the
                         selector to the numeric
                         maximum of current selector
                         RPL value and the RPL value in
                         the register. Set zero flag if
                         selector RPL was changed.

VERR         Selector    VERify for Read: sets the
                         zero flag if the segment
                         referred to by the selector
                         can be read.

VERW         Selector    VERify for Write: sets the
                         zero flag if the segment
                         referred to by the selector
                         can be written.

LSL          Register,   Load Segment Limit: reads
             Selector    the segment limit into the
                         register if privilege rules
                         and descriptor type allow.
                         Set zero flag if successful.

LAR          Register,   Load Access Rights: reads
             Selector    the descriptor access
                         rights byte into the register
                         if privilege rules allow. Set
                         zero flag if successful.
4.4.3.4 PRIVILEGE VALIDATION


The 386 DX provides several instructions to speed
pointer testing and help maintain system integrity by
verifying that the selector value refers to an
appropriate segment. Table 4-2 summarizes the selector
validation procedures available for the 386 DX.


This pointer verification prevents the common
problem of an application at PL = 3 calling an
operating system routine at PL = 0 and passing the
operating system routine a “bad” pointer which
corrupts a data structure belonging to the operating
system. If the operating system routine uses the ARPL
instruction to ensure that the RPL of the selector has
no greater privilege than that of the caller, then this
problem can be avoided.
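
The ARPL behaviour summarized in Table 4-2 can be
modeled in a few lines. The sketch is illustrative
Python with a hypothetical function name; the second
operand stands for a register holding the caller’s
code selector, whose low two bits carry the caller’s
privilege level.

```python
def arpl(selector: int, register: int) -> tuple:
    """Raise the selector's RPL to the register's RPL if it is lower.

    Returns (adjusted_selector, zero_flag).
    """
    sel_rpl = selector & 3
    reg_rpl = register & 3
    if sel_rpl < reg_rpl:              # selector claims too much privilege
        return (selector & ~3) | reg_rpl, True
    return selector, False             # unchanged, zero flag cleared
```

An operating system routine called from PL = 3 with
a pointer whose selector carries RPL = 0 would thus
see the selector weakened to RPL = 3 before use.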


4.4.3.5 DESCRIPTOR ACCESS 


There are basically two types of segment accesses: 
those involving code segments such as control 
transfers, and those involving data accesses. Deter- 
mining the ability of a task to access a segment in- 
volves the type of segment to be accessed, the in- 
struction used, the type of descriptor used and CPL, 
RPL, and DPL as described above. 


Any time an instruction loads data segment registers
(DS, ES, FS, GS) the 386 DX makes protection
validation checks. Selectors loaded in the DS, ES, FS,
GS registers must refer only to data segments or
readable code segments. The data access rules are
specified in section 4.4.2 Rules of Privilege. The
only exception to those rules is readable conforming
code segments, which can be accessed at any
privilege level.


Finally the privilege validation checks are performed.
The DPL of the segment is compared to the EPL, and if
the DPL is more privileged than the EPL an exception
13 (general protection fault) is generated.


The rules regarding the stack segment are slightly 
different than those involving data segments. In- 
structions that load selectors into SS must refer to 
data segment descriptors for writeable data seg- 
ments. The DPL and RPL must equal the CPL. All 
other descriptor types or a privilege level violation 
will cause exception 13. A stack not present fault 
causes exception 12. Note that an exception 11 is 
used for a not-present code or data segment. 
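
The segment register load checks can be sketched as
below. This is an illustrative Python model under
stated assumptions: the Seg record and both function
names are hypothetical, the checks are reduced to the
rules given in this section, and protection checks are
assumed to be made before the present check. Each
function returns the exception number, or None when
the load succeeds.

```python
from dataclasses import dataclass

@dataclass
class Seg:
    dpl: int
    code: bool = False         # True for a code segment
    readable: bool = True      # meaningful for code segments
    writable: bool = False     # meaningful for data segments
    conforming: bool = False   # meaningful for code segments
    present: bool = True

def check_data_load(cpl: int, rpl: int, seg: Seg):
    """Checks for loading DS, ES, FS or GS."""
    if seg.code and not seg.readable:
        return 13                      # execute-only code is not loadable
    if not (seg.code and seg.conforming):
        if max(cpl, rpl) > seg.dpl:    # EPL less privileged than DPL
            return 13
    return None if seg.present else 11

def check_stack_load(cpl: int, rpl: int, seg: Seg):
    """Checks for loading SS."""
    if seg.code or not seg.writable:
        return 13                      # must be a writeable data segment
    if rpl != cpl or seg.dpl != cpl:
        return 13                      # DPL and RPL must equal CPL
    return None if seg.present else 12
```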


4.4.4 Privilege Level Transfers 


Inter-segment control transfers occur when a selector
is loaded in the CS register. For a typical system
most of these transfers are simply the result of a call
or a jump to another routine. There are five types of
control transfers which are summarized in Table 4-3.
Many of these transfers result in a privilege level
transfer. Changing privilege levels is done only via
control transfers, by using gates, task switches, and
interrupt or trap gates.


Table 4-3. Descriptor Types Used for Control Transfer

Control Transfer     Operation Types       Descriptor   Descriptor
Types                                      Referenced   Table
-------------------  --------------------  -----------  ----------
Intersegment within  JMP, CALL, RET,       Code         GDT/LDT
the same privilege   IRET*                 Segment
level

Intersegment to the  CALL                  Call Gate    GDT/LDT
same or higher
privilege level      Interrupt             Trap or      IDT
Interrupt within     Instruction,          Interrupt
task may change CPL  Exception, External   Gate
                     Interrupt

Intersegment to a    RET, IRET*            Code         GDT/LDT
lower privilege                            Segment
level (changes task
CPL)

Task Switch          CALL, JMP             Task State   GDT
                                           Segment

                     CALL, JMP             Task Gate    GDT/LDT

                     IRET**                Task Gate    IDT
                     Interrupt
                     Instruction,
                     Exception, External
                     Interrupt

*NT (Nested Task bit of flag register) = 0
**NT (Nested Task bit of flag register) = 1


Control transfers can only occur if the operation
which loaded the selector references the correct
descriptor type. Any violation of these descriptor usage
rules will cause an exception 13 (e.g. JMP through a
call gate, or IRET from a normal subroutine call).


In order to provide further system security, all control 
transfers are also subject to the privilege rules. 


The privilege rules require that:


• Privilege level transitions can only occur via
  gates.


• JMPs can be made to a non-conforming code
  segment with the same privilege or to a conforming
  code segment with greater or equal privilege.


• CALLs can be made to a non-conforming code
  segment with the same privilege or via a gate to a
  more privileged level.


• Interrupts handled within the task obey the same
  privilege rules as CALLs.


• Conforming code segments are accessible by
  privilege levels which are the same or less
  privileged than the conforming-code segment’s DPL.


• Both the requested privilege level (RPL) in the
  selector pointing to the gate and the task’s CPL
  must be of equal or greater privilege than the
  gate’s DPL.


• The code segment selected in the gate must be
  the same or more privileged than the task’s CPL.


• Return instructions that do not switch tasks can
  only return control to a code segment with same
  or less privilege.


• Task switches can be performed by a CALL,
  JMP, or INT which references either a task gate
  or task state segment whose DPL is less privileged
  than or the same privilege as the old task’s CPL.
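
Two of the rules above, the gate access rule and the
target code segment rule, combine into the following
check for a CALL through a gate. It is an illustrative
Python sketch with hypothetical names and it covers
only a non-conforming target reached through a call
gate.

```python
def call_through_gate_allowed(cpl: int, selector_rpl: int,
                              gate_dpl: int, target_dpl: int) -> bool:
    """Privilege checks for a CALL that names a call gate."""
    # Both CPL and the gate selector's RPL must be at least as
    # privileged as the gate's DPL.
    gate_ok = max(cpl, selector_rpl) <= gate_dpl
    # The target code segment must be the same or more privileged
    # than the caller.
    target_ok = target_dpl <= cpl
    return gate_ok and target_ok
```

A gate with DPL = 3 leading to a level 0 code segment
is therefore usable from any level, while a gate with
DPL = 0 is usable only by level 0 code.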


Any control transfer that changes CPL within a task
causes a change of stacks as a result of the
privilege level change. The initial values of SS:ESP for
privilege levels 0, 1, and 2 are retained in the task
state segment (see section 4.4.6 Task Switching).
During a JMP or CALL control transfer, the new
stack pointer is loaded into the SS and ESP
registers and the previous stack pointer is pushed onto
the new stack.


When RETurning to the original privilege level, use
of the lower-privileged stack is restored as part of
the RET or IRET instruction operation. For
subroutine calls that pass parameters on the stack and
cross privilege levels, a fixed number of words (as
specified in the gate’s word count field) are copied
from the previous stack to the current stack. The
inter-segment RET instruction with a stack
adjustment value will correctly restore the previous
stack pointer upon return.


Figure 4-15a. 386™ DX TSS and TSS Registers

Figure 4-15b. Sample I/O Permission Bit Map


4.4.5 Call Gates 


Gates provide protected, indirect CALLs. One of the
major uses of gates is to provide a secure method of
privilege transfers within a task. Since the operating
system defines all of the gates in a system, it can
ensure that all gates only allow entry into a few
trusted procedures (such as those which allocate
memory, or perform I/O).


Gate descriptors follow the data access rules of
privilege; that is, gates can be accessed by a task if
the EPL is equal to or more privileged than the gate
descriptor’s DPL. Gates follow the control transfer
rules of privilege and therefore may only transfer
control to a more privileged level.


Call Gates are accessed via a CALL instruction and
are syntactically identical to calling a normal
subroutine. When an inter-level 386 DX call gate is
activated, the following actions occur.


1. Load CS:EIP from gate; check for validity
2. SS is pushed zero-extended to 32 bits
3. ESP is pushed
4. Copy Word Count 32-bit parameters from the
   old stack to the new stack
5. Push Return address on stack


The procedure is identical for 80286 Call gates, ex- 
cept that 16-bit parameters are copied and 16-bit 
registers are pushed. 
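
The contents of the new stack after an inter-level
386 DX call gate transfer can be sketched as a list of
pushed dwords. This is an illustrative Python model
with hypothetical names. It assumes that the return
address of step 5 consists of the old CS (zero-extended)
followed by the old EIP, and that the parameters copied
are the Word Count dwords nearest the top of the old
stack.

```python
def inter_level_call_frame(old_ss, old_esp, old_cs, return_eip,
                           old_stack, word_count):
    """Return the dwords pushed on the new stack, first push first.

    old_stack lists the caller's dwords in the order they were pushed.
    """
    frame = [old_ss & 0xFFFF]                 # step 2: SS, zero-extended
    frame.append(old_esp & 0xFFFFFFFF)        # step 3: ESP
    if word_count:
        frame.extend(old_stack[-word_count:]) # step 4: copy parameters
    frame.append(old_cs & 0xFFFF)             # step 5: return address
    frame.append(return_eip & 0xFFFFFFFF)     #         (CS, then EIP)
    return frame
```

The last element of the list is the top of the new
stack, which is where the called procedure finds its
return address.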


Interrupt Gates and Trap gates work in a similar 
fashion as the call gates, except there is no copying 
of parameters. The only difference between Trap 
and Interrupt gates is that control transfers through 
an Interrupt gate disable further interrupts (i.e. the IF 
bit is set to 0), and Trap gates leave the interrupt 
status unchanged. 


4.4.6 Task Switching 


A very important attribute of any multi-tasking/
multi-user operating system is its ability to rapidly
switch between tasks or processes. The 386 DX directly
supports this operation by providing a task switch
instruction in hardware. The 386 DX task switch
operation saves the entire state of the machine (all of
the registers, address space, and a link to the
previous task), loads a new execution state, performs
protection checks, and commences execution in the
new task, in about 17 microseconds. Like transfer of
control via gates, the task switch operation is
invoked by executing an inter-segment JMP or CALL
instruction which refers to a Task State Segment
(TSS), or a task gate descriptor in the GDT or LDT.
An INT n instruction, exception, trap, or external
interrupt may also invoke the task switch operation if
there is a task gate descriptor in the associated IDT
descriptor slot.


The TSS descriptor points to a segment (see Figure
4-15) containing the entire 386 DX execution state
while a task gate descriptor contains a TSS selector.
The 386 DX supports both 80286 and 386 DX style
TSSs. Figure 4-16 shows an 80286 TSS. The limit of
a 386 DX TSS must be greater than 0064H (002BH
for an 80286 TSS), and can be as large as 4 Gigabytes.
In the additional TSS space, the operating
system is free to store additional information such as
the reason the task is inactive, time the task has
spent running, and open files belonging to the task.


Each task must have a TSS associated with it. The
current TSS is identified by a special register in the
386 DX called the Task State Segment Register
(TR). This register contains a selector referring to
the task state segment descriptor that defines the
current TSS. A hidden base and limit register
associated with TR are loaded whenever TR is loaded with
a new selector. Returning from a task is
accomplished by the IRET instruction. When IRET is
executed, control is returned to the task which was
interrupted. The current executing task’s state is
saved in the TSS and the old task state is restored
from its TSS.


Figure 4-16. 80286 TSS


Several bits in the flag register and machine status
word (CR0) give information about the state of a
task which are useful to the operating system. The
Nested Task (NT) (bit 14 in EFLAGS) controls the
function of the IRET instruction. If NT = 0, the IRET
instruction performs the regular return; when NT =
1, IRET performs a task switch operation back to the
previous task. The NT bit is set or reset in the
following fashion:


When a CALL or INT instruction initiates a task
switch, the new TSS will be marked busy and the
back link field of the new TSS set to the old TSS
selector. The NT bit of the new task is set by CALL
or INT initiated task switches. An interrupt that does
not cause a task switch will clear NT. (The NT bit will
be restored after execution of the interrupt handler.)
NT may also be set or cleared by POPF or IRET
instructions.


The 386 DX task state segment is marked busy by 
changing the descriptor type field from TYPE 9H to 
TYPE BH. An 80286 TSS is marked busy by chang- 
ing the descriptor type field from TYPE 1 to TYPE 3. 
Use of a selector that references a busy task state 
segment causes an exception 13. 
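
The busy marking amounts to setting one bit of the
type field, as the following illustrative Python
sketch shows. The constant and function names are
hypothetical.

```python
AVAILABLE_286_TSS = 0x1
BUSY_286_TSS = 0x3
AVAILABLE_386_TSS = 0x9
BUSY_386_TSS = 0xB

def mark_busy(tss_type: int) -> int:
    """Return the busy form of an available TSS descriptor type."""
    if tss_type not in (AVAILABLE_286_TSS, AVAILABLE_386_TSS):
        raise ValueError("not an available TSS descriptor type")
    return tss_type | 0x2              # 1 -> 3, 9 -> B

def is_busy(tss_type: int) -> bool:
    """True for a busy TSS type; using its selector causes exception 13."""
    return tss_type in (BUSY_286_TSS, BUSY_386_TSS)
```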


The Virtual Mode (VM) bit (bit 17 in EFLAGS) is used
to indicate if a task is a virtual 8086 task. If
VM = 1, then the task will use the Real Mode
addressing mechanism. The virtual 8086 environment is
only entered and exited via a task switch (see section
4.6 Virtual Mode).


The coprocessor’s state is not automatically saved
when a task switch occurs, because the incoming
task may not use the coprocessor. The Task
Switched (TS) Bit (bit 3 in CR0) helps deal with
the coprocessor’s state in a multi-tasking
environment. Whenever the 386 DX switches tasks, it sets
the TS bit. The 386 DX detects the first use of a
processor extension instruction after a task switch
and causes the processor extension not available
exception 7. The exception handler for exception 7
may then decide whether to save the state of the
coprocessor. A processor extension not present
exception (7) will occur when attempting to execute an
ESC or WAIT instruction if the Task Switched and
Monitor coprocessor extension bits are both set (i.e.
TS = 1 and MP = 1).


The T bit in the 386 DX TSS indicates that the
processor should generate a debug exception when
switching to a task. If T = 1 then upon entry to a
new task a debug exception 1 will be generated.


4.4.7 Initialization and Transition to
Protected Mode


Since the 386 DX begins executing in Real Mode
immediately after RESET, it is necessary to initialize
the system tables and registers with the appropriate
values.


The GDT and IDT registers must refer to a valid GDT 
and IDT. The IDT should be at least 256 bytes long, 
and GDT must contain descriptors for the initial 
code, and data segments. Figure 4-17 shows the 
tables and Figure 4-18 the descriptors needed for a 
simple Protected Mode 386 DX system. It has a sin- 
gle code and single data/stack segment each four 
gigabytes long and a single privilege level PL = 0. 
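
A GDT for such a simple system can be sketched as
three 8-byte descriptors. The Python below is
illustrative only and is not a copy of Figure 4-18:
the function name is hypothetical, and the access
bytes 9AH (present, DPL 0, readable code) and 92H
(present, DPL 0, writeable data) are assumed typical
values. Base 0 with a 20-bit limit of FFFFFH and page
granularity gives a four gigabyte segment.

```python
import struct

def encode_descriptor(base: int, limit: int, access: int,
                      granularity: int, default_32: int) -> bytes:
    """Pack a 386 DX code/data segment descriptor into 8 bytes."""
    flags = (granularity << 3) | (default_32 << 2)   # G and D/B bits
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                       # limit 15..0
        base & 0xFFFF,                        # base 15..0
        (base >> 16) & 0xFF,                  # base 23..16
        access,                               # access rights byte
        (flags << 4) | ((limit >> 16) & 0xF), # G, D/B, limit 19..16
        (base >> 24) & 0xFF,                  # base 31..24
    )

NULL_DESCRIPTOR = bytes(8)                    # entry 0 is unused
FLAT_CODE = encode_descriptor(0, 0xFFFFF, 0x9A, 1, 1)
FLAT_DATA = encode_descriptor(0, 0xFFFFF, 0x92, 1, 1)
SIMPLE_GDT = NULL_DESCRIPTOR + FLAT_CODE + FLAT_DATA
```

With this layout the code segment would be selected
by 0008H and the data/stack segment by 0010H.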


The actual method of enabling Protected Mode is to
load CR0 with the PE bit set, via the MOV CR0, R/M
instruction. This puts the 386 DX in Protected Mode.


After enabling Protected Mode, the next instruction
should execute an intersegment JMP to load the CS
register and flush the instruction decode queue. The
final step is to load all of the data segment registers
with the initial selector values.


An alternate approach to entering Protected Mode,
which is especially appropriate for multi-tasking
operating systems, is to use the built-in task switch to
load all of the registers. In this case the GDT would
contain two TSS descriptors in addition to the code
and data descriptors needed for the first task. The
first JMP instruction in Protected Mode would jump
to the TSS, causing a task switch and loading all of
the registers with the values stored in the TSS. The
Task State Segment Register should be initialized to
point to a valid TSS descriptor since a task switch
saves the state of the current task in a task state
segment.


Figure 4-17. Simple Protected System

Figure 4-18. GDT Descriptors for Simple System


4.4.8 Tools for Building Protected 
Systems 


In order to simplify the design of a protected
multi-tasking system, Intel provides a tool which allows
the system designer an easy method of constructing
the data structures needed for a Protected Mode
386 DX system. This tool is the builder BLD-386™.
BLD-386 lets the operating system writer specify all
of the segment descriptors discussed in the previous
sections (LDTs, IDTs, GDTs, Gates, and TSSs) in a
high-level language.


4.5 PAGING 


4.5.1 Paging Concepts 


Paging is another type of memory management useful
for virtual memory multitasking operating systems.
Unlike segmentation, which modularizes programs and
data into variable length segments, paging divides
programs into multiple uniform size pages. Pages bear
no direct relation to the logical structure of a
program. While segment selectors can be considered the
logical “name” of a program module or data structure,
a page most likely corresponds to only a portion of a
module or data structure.


By taking advantage of the locality of reference dis- 
played by most programs, only a small number of 
pages from each active task need be in memory at 
any one moment. 


4.5.2 Paging Organization 


4.5.2.1 PAGE MECHANISM 


The 386 DX uses two levels of tables to translate
the linear address (from the segmentation unit) into
a physical address. There are three components to
the paging mechanism of the 386 DX: the page
directory, the page tables, and the page itself (page
frame). All memory-resident elements of the 386 DX
paging mechanism are the same size, namely, 4K
bytes. A uniform size for all of the elements
simplifies memory allocation and reallocation schemes,
since there is no problem with memory fragmentation.
Figure 4-19 shows how the paging mechanism works.


4.5.2.2 PAGE DESCRIPTOR BASE REGISTER


CR2 is the Page Fault Linear Address register. It
holds the 32-bit linear address which caused the last
page fault detected.


CR3 is the Page Directory Physical Base Address
Register. It contains the physical starting address of
the Page Directory. The lower 12 bits of CR3 are
always zero to ensure that the Page Directory is
always page aligned. Loading it via a MOV CR3, reg
instruction causes the Page Table Entry cache to be
flushed, as will a task switch through a TSS which
changes the value of CR3. (See 4.5.4 Translation
Lookaside Buffer.)


4.5.2.3 PAGE DIRECTORY 


The Page Directory is 4K bytes long and allows up to
1024 Page Directory Entries. Each Page Directory
Entry contains the address of the next level of
tables, the Page Tables, and information about the
page table. The contents of a Page Directory Entry
are shown in Figure 4-20. The upper 10 bits of the
linear address (A22-A31) are used as an index to
select the correct Page Directory Entry.


[Figure: TWO LEVEL PAGING SCHEME; labels: LINEAR
ADDRESS, DIRECTORY.]

Figure 4-19. Paging Mechanism

Figure 4-20. Page Directory Entry (Points to Page Table)

Figure 4-21. Page Table Entry (Points to Page)


4.5.2.4 PAGE TABLES


Each Page Table is 4K bytes and holds up to 1024
Page Table Entries. Page Table Entries contain the
starting address of the page frame and statistical
information about the page (see Figure 4-21).
Address bits A12-A21 are used as an index to select
one of the 1024 Page Table Entries. The upper 20-bit
page frame address is concatenated with the
lower 12 bits of the linear address to form the
physical address. Page tables can be shared between
tasks and swapped to disks.
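
The two-level translation described in sections
4.5.2.2 through 4.5.2.4 can be sketched as follows.
The Python below is an illustrative model with
hypothetical names: read_dword stands for any
function that returns the 32-bit value stored at a
physical address, a not-present entry is reported with
a Python exception rather than a page fault, and
protection bits are not checked.

```python
def split_linear(linear: int):
    """Return (directory index, table index, page offset)."""
    return linear >> 22, (linear >> 12) & 0x3FF, linear & 0xFFF

def translate(linear: int, cr3: int, read_dword) -> int:
    """Walk the page directory and page table for one linear address."""
    dir_index, table_index, offset = split_linear(linear & 0xFFFFFFFF)
    pde = read_dword((cr3 & 0xFFFFF000) + 4 * dir_index)
    if not (pde & 1):
        raise LookupError("page table not present")
    pte = read_dword((pde & 0xFFFFF000) + 4 * table_index)
    if not (pte & 1):
        raise LookupError("page not present")
    return (pte & 0xFFFFF000) | offset   # frame address + low 12 bits
```

Each entry is four bytes, which is why 1024 entries
fill exactly one 4K byte table.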


4.5.2.5 PAGE DIRECTORY/TABLE ENTRIES 


The lower 12 bits of the Page Table Entries and
Page Directory Entries contain statistical information
about pages and page tables respectively. The P
(Present) bit 0 indicates if a Page Directory or Page
Table entry can be used in address translation. If
P = 1 the entry can be used for address translation;
if P = 0 the entry cannot be used for translation.
Note that the present bit of the page table entry that
points to the page where code is currently being
executed should always be set. Code that marks its
own page not present should not be written. When
P = 0, all of the other bits are available for use by the
software. For example, the remaining 31 bits could be
used to indicate where on the disk the page is stored.


The A (Accessed) bit 5 is set by the 386 DX for both
types of entries before a read or write access occurs
to an address covered by the entry. The D (Dirty) bit
6 is set to 1 before a write to an address covered by
that page table entry occurs. The D bit is undefined
for Page Directory Entries. When the P, A and D bits
are updated by the 386 DX, the processor generates
a Read-Modify-Write cycle which locks the bus and
prevents conflicts with other processors or peripherals.
Software which modifies these bits should use
the LOCK prefix to ensure the integrity of the page
tables in multi-master systems.


The 3 bits marked OS Reserved in Figure 4-20 and 
Figure 4-21 (bits 9-11) are software definable. OSs 
are free to use these bits for whatever purpose they 
wish. An example use of the OS Reserved bits 
would be to store information about page aging. By 
keeping track of how long a page has been in mem- 
ory since being accessed, an operating system can 
implement a page replacement algorithm like Least
Recently Used. 


The (User/Supervisor) U/S bit 2 and the (Read/ 
Write) R/W bit 1 are used to provide protection attri- 
butes for individual pages. 
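

As an illustration of the entry layout described in this section, the following Python sketch unpacks the low 12 bits of a Page Directory or Page Table Entry. The function and key names are hypothetical; the bit positions (P = 0, R/W = 1, U/S = 2, A = 5, D = 6, OS Reserved = 9-11) are the ones stated in the text.

```python
P_BIT = 1 << 0    # Present
RW_BIT = 1 << 1   # Read/Write
US_BIT = 1 << 2   # User/Supervisor
A_BIT = 1 << 5    # Accessed
D_BIT = 1 << 6    # Dirty (undefined for Page Directory Entries)


def decode_entry(entry):
    """Return the fields of a 32-bit Page Directory or Page Table Entry."""
    return {
        "present": bool(entry & P_BIT),
        "writable": bool(entry & RW_BIT),
        "user": bool(entry & US_BIT),
        "accessed": bool(entry & A_BIT),
        "dirty": bool(entry & D_BIT),
        "os_reserved": (entry >> 9) & 0x7,      # bits 9-11, software definable
        "frame_address": entry & 0xFFFFF000,    # upper 20 bits
    }
```

An operating system could, for instance, keep a small page-age counter in the "os_reserved" field, as suggested above.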


4.5.3 Page Level Protection 
(R/W, U/S Bits) 


The 386 DX provides a set of protection attributes 
for paging systems. The paging mechanism distin- 
guishes between two levels of protection: User 
which corresponds to level 3 of the segmentation 
based protection, and supervisor which encompass- 
es all of the other protection levels (0, 1, 2). Pro- 
grams executing at Level 0, 1 or 2 bypass the page 
protection, although segmentation based protection 
is still enforced by the hardware. 


The U/S and R/W bits are used to provide Us- 
er/Supervisor and Read/Write protection for individ- 
ual pages or for all pages covered by a Page Table 
Directory Entry. The U/S and R/W bits in the first 
level Page Directory Table apply to all pages de- 
scribed by the page table pointed to by that directory 
entry. The U/S and R/W bits in the second level 
Page Table Entry apply only to the page described 
by that entry. The U/S and R/W bits for a given 
page are obtained by taking the most restrictive of 
the U/S and R/W from the Page Directory Table 
Entries and the Page Table Entries and using these 
bits to address the page. 


Example: If the U/S and R/W bits for the Page Di- 
rectory entry were 10 and the U/S and R/W bits for 
the Page Table Entry were 01, the access rights for 
the page would be 01, the numerically smaller of the 
two. Table 4-4 shows the effect of the U/S and R/W
bits on accessing memory. 


Table 4-4. Protection Provided by R/W and U/S


U/S   R/W   Permitted Level 3   Permitted Access Levels 0, 1, or 2

 0     0    None                Read/Write
 0     1    None                Read/Write
 1     0    Read-Only           Read/Write
 1     1    Read/Write          Read/Write


However, a given segment can easily be made read-only
for level 0, 1, or 2 via the use of segmented
protection mechanisms (see Section 4.4 Protection).
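

The rule for combining directory-level and table-level protection, together with Table 4-4, can be expressed as a short Python sketch. It follows the rule as worded in this section (the numerically smaller of the two 2-bit U/S, R/W values); the function names and the encoding of the table as a dictionary are assumptions made for illustration.

```python
# Table 4-4 for level 3, keyed by the 2-bit value (U/S << 1) | R/W.
LEVEL3_ACCESS = {0b00: "None", 0b01: "None", 0b10: "Read-Only", 0b11: "Read/Write"}


def combined_rights(pde, pte):
    """Effective (U/S, R/W) value for a page: the more restrictive of the two entries."""
    pde_bits = (pde >> 1) & 0b11   # entry bit 2 = U/S, entry bit 1 = R/W
    pte_bits = (pte >> 1) & 0b11
    return min(pde_bits, pte_bits)


def access_allowed(rights, cpl, is_write):
    """Apply Table 4-4 to an access made at privilege level cpl."""
    if cpl < 3:
        return True   # levels 0, 1 and 2 bypass page protection
    if is_write:
        return LEVEL3_ACCESS[rights] == "Read/Write"
    return LEVEL3_ACCESS[rights] != "None"
```

With a Page Directory Entry carrying 10 and a Page Table Entry carrying 01, combined_rights returns 01, matching the example above; a level 3 program is then denied both reads and writes.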


4.5.4 Translation Lookaside Buffer


The 386 DX paging hardware is designed to support 
demand paged virtual memory systems. However, 
performance would degrade substantially if the proc- 
essor was required to access two levels of tables for 
every memory reference. To solve this problem, the 
386 DX keeps a cache of the most recently accessed
pages. This cache is called the Translation
Lookaside Buffer (TLB). The TLB is a four-way set
associative 32-entry page table cache. It automati- 
cally keeps the most commonly used Page Table 
Entries in the processor. The 32-entry TLB coupled 
with a 4K page size, results in coverage of 128K 
bytes of memory addresses. For many common mul- 
ti-tasking systems, the TLB will have a hit rate of 
about 98%. This means that the processor will only 
have to access the two-level page structure on 2% 
of all memory references. Figure 4-22 illustrates how 
the TLB complements the 386 DX’s paging mecha- 
nism. 
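

The coverage and hit-rate figures quoted above follow from simple arithmetic, sketched here in Python. The function names are hypothetical, and the cost model (one table read per paging level on every TLB miss) is a simplifying assumption rather than a statement about actual bus activity.

```python
TLB_ENTRIES = 32
PAGE_SIZE = 4 * 1024   # 4K bytes per page


def tlb_coverage_bytes():
    """Memory addressable through the TLB when every entry maps a distinct page."""
    return TLB_ENTRIES * PAGE_SIZE


def table_reads_per_reference(hit_rate, levels=2):
    """Average page-structure reads per memory reference under the assumed model."""
    return (1.0 - hit_rate) * levels
```

tlb_coverage_bytes() gives 131072 bytes (128K), and at a 98% hit rate the model yields about 0.04 page-structure reads per memory reference.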


4.5.5 Paging Operation 


Figure 4-22. Translation Lookaside Buffer (32 entries, 98% hit rate)


The paging hardware operates in the following fash- 
ion. The paging unit hardware receives a 32-bit lin- 
ear address from the segmentation unit. The upper 
20 linear address bits are compared with all 32 en- 
tries in the TLB to determine if there is a match. If 
there is a match (i.e. a TLB hit), then the 32-bit phys- 
ical address is calculated and will be placed on the 
address bus. 


However, if the page table entry is not in the TLB,
the 386 DX will read the appropriate Page Directory
Entry. If P = 1 on the Page Directory Entry, indicating
that the page table is in memory, then the 386
DX will read the appropriate Page Table Entry
and set the Access bit. If P = 1 on the Page Table
Entry, indicating that the page is in memory, the 386
DX will update the Access and Dirty bits as needed
and fetch the operand. The upper 20 bits of the linear
address, read from the page table, will be stored
in the TLB for future accesses. However, if P = 0 for
either the Page Directory Entry or the Page Table
Entry, then the processor will generate a page fault,
an Exception 14.
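

The lookup sequence described above (TLB check, Page Directory Entry, Page Table Entry, page fault on P = 0) can be modelled roughly as follows. This is a simplified software sketch, not a description of the hardware: the TLB is a plain dictionary, memory is reached through a caller-supplied read_dword function, the Accessed/Dirty updates and protection checks are omitted, and all names are hypothetical.

```python
class PageFault(Exception):
    """Stands in for Exception 14; carries the faulting linear address."""


def translate(linear, read_dword, cr3, tlb):
    """Translate a 32-bit linear address to a physical address."""
    page = linear >> 12                      # upper 20 bits identify the page
    if page in tlb:                          # TLB hit: no table access needed
        return tlb[page] | (linear & 0xFFF)

    # TLB miss: read the Page Directory Entry selected by A22-A31.
    pde = read_dword((cr3 & 0xFFFFF000) + 4 * ((linear >> 22) & 0x3FF))
    if not pde & 1:                          # P = 0: page table not present
        raise PageFault(linear)

    # Read the Page Table Entry selected by A12-A21.
    pte = read_dword((pde & 0xFFFFF000) + 4 * ((linear >> 12) & 0x3FF))
    if not pte & 1:                          # P = 0: page not present
        raise PageFault(linear)

    tlb[page] = pte & 0xFFFFF000             # cache the translation for later
    return tlb[page] | (linear & 0xFFF)
```

A second access to the same page is then satisfied from the dictionary without calling read_dword at all, which is the effect the TLB is intended to achieve.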


The processor will also generate an exception 14,
page fault, if the memory reference violated the
page protection attributes (i.e. U/S or R/W), for example
by trying to write to a read-only page. CR2 will hold the
linear address which caused the page fault. If a second
page fault occurs while the processor is attempting
to enter the service routine for the first,
then the processor will invoke the page fault (exception
14) handler a second time, rather than the double
fault (exception 8) handler. Since Exception 14 is
classified as a fault, CS:EIP will point to the instruction
causing the page fault. The 16-bit error code
pushed as part of the page fault handler will contain
status bits which indicate the cause of the page
fault.


The 16-bit error code is used by the operating system
to determine how to handle the page fault. Figure
4-23A shows the format of the page-fault error
code and the interpretation of the bits.


NOTE:
Even though the bits in the error code (U/S, W/R, 
and P) have similar names as the bits in the Page 
Directory/Table Entries, the interpretation of the er- 
ror code bits is different. Figure 4-23B indicates 
what type of access caused the page fault. 


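 15                          3    2     1    0
+-----------------------------+-----+-----+---+
|  U U U U U U U U U U U U U  | U/S | W/R | P |
+-----------------------------+-----+-----+---+

Figure 4-23A. Page Fault Error Code Format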


U/S: The U/S bit indicates whether the access 
causing the fault occurred when the processor was 
executing in User Mode (U/S = 1) or in Supervisor 
mode (U/S = 0) 


W/R: The W/R bit indicates whether the access
causing the fault was a Read (W/R = 0) or a Write
(W/R = 1)


P: The P bit indicates whether a page fault was
caused by a not-present page (P = 0), or by a page
level protection violation (P = 1)


U: UNDEFINED 
U/S   W/R   Access Type

 0     0    Supervisor* Read
 0     1    Supervisor Write
 1     0    User Read
 1     1    User Write

*Descriptor table access will fault with U/S = 0, even if the program
is executing at level 3.


Figure 4-23B. Type of Access
Causing Page Fault
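

A page fault handler typically begins by decoding the error code. The sketch below shows one way to do so in Python, using the bit meanings given above (bit 0 = P, bit 1 = W/R, bit 2 = U/S, remaining bits undefined). The function name and the returned strings are illustrative choices, not part of the 386 DX definition.

```python
def decode_page_fault_error(code):
    """Interpret the low three bits of the 16-bit page fault error code."""
    return {
        "cause": "protection violation" if code & 0b001 else "not-present page",
        "access": "write" if code & 0b010 else "read",
        "mode": "user" if code & 0b100 else "supervisor",
    }
```

Because the upper bits are undefined, the sketch ignores them rather than testing the whole word against fixed values.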


4.5.6 Operating System 
Responsibilities 


The 386 DX takes care of the page address transla- 
tion process, relieving the burden from an operating 
system in a demand-paged system. The operating 
system is responsible for setting up the initial page 
tables, and handling any page faults. The operating 
system also is required to invalidate (i.e. flush) the 
TLB when any changes are made to any of the page 
table entries. The operating system must reload 
CR3 to cause the TLB to be flushed. 


Setting up the tables is simply a matter of loading 
CR3 with the address of the Page Directory, and 
allocating space for the Page Directory and the 
Page Tables. The primary responsibility of the oper- 
ating system is to implement a swapping policy and 
handle all of the page faults. 


A final concern of the operating system is to ensure 
that the TLB cache matches the information in the
paging tables. In particular, any time the operating
system sets the P (Present) bit of a page table entry to
zero, the TLB must be flushed. Operating systems 
may want to take advantage of the fact that CR3 is 
stored as part of a TSS, to give every task or group 
of tasks its own set of page tables. 


4.6 VIRTUAL 8086 ENVIRONMENT 


4.6.1 Executing 8086 Programs 


The 386 DX allows the execution of 8086 application 
programs in both Real Mode and in the Virtual 8086 
Mode (Virtual Mode). Of the two methods, Virtual 
8086 Mode offers the system designer the most 
flexibility. The Virtual 8086 Mode allows the execu- 
tion of 8086 applications, while still allowing the sys- 
tem designer to take full advantage of the 386 DX 
protection mechanism. In particular, the 386 DX allows
the simultaneous execution of an 8086 operating
system and its applications, and a 386 DX operating
system and both 80286 and 386 DX applications.
Thus, in a multi-user 386 DX computer, one
person could be running an MS-DOS spreadsheet,
another person using MS-DOS, and a third person
could be running multiple Unix utilities and applications.
Each person in this scenario would believe
that he had the computer completely to himself.
Figure 4-24 illustrates this concept.


4.6.2 Virtual 8086 Mode Addressing 
Mechanism 


One of the major differences between 386 DX Real 
and Protected modes is how the segment selectors 
are interpreted. When the processor is executing in 
Virtual 8086 Mode the segment registers are used in 
an identical fashion to Real Mode. The contents of
the segment register are shifted left 4 bits and added
to the offset to form the linear address.


The 386 DX allows the operating system to specify 
which programs use the 8086 style address mecha- 
nism, and which programs use Protected Mode ad- 
dressing, on a per task basis. Through the use of 
paging, the one megabyte address space of the Vir- 
tual Mode task can be mapped to anywhere in the 4 
gigabyte linear address space of the 386 DX. Like 
Real Mode, Virtual Mode effective addresses (i.e., 
segment offsets) that exceed 64K bytes will cause an
exception 13. However, these restrictions should not 
prove to be important, because most tasks running 
in Virtual 8086 Mode will simply be existing 8086 
application programs. 
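

The address formation used in Virtual 8086 Mode can be illustrated with a short Python sketch. The function name is hypothetical, and raising ValueError is simply a stand-in for the exception 13 that the processor generates when the offset exceeds 64K bytes.

```python
def v86_linear_address(segment, offset):
    """Form a linear address from a 16-bit segment value and a 16-bit offset."""
    if not 0 <= offset <= 0xFFFF:
        # On the processor this condition causes an exception 13.
        raise ValueError("effective address exceeds 64K bytes")
    return ((segment & 0xFFFF) << 4) + offset
```

For example, segment 1234H with offset 0010H gives linear address 12350H. Paging can then map that linear address anywhere in physical memory.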


4.6.3 Paging In Virtual Mode 


The paging hardware allows the concurrent running 
of multiple Virtual Mode tasks, and provides protec- 
tion and operating system isolation. Although it is 
not strictly necessary to have the paging hardware 
enabled to run Virtual Mode tasks, it is needed in 
order to run multiple Virtual Mode tasks or to relo- 
cate the address space of a Virtual Mode task to 
physical address space greater than one megabyte. 


The paging hardware allows the 20-bit linear ad- 
dress produced by a Virtual Mode program to be 
divided into up to 256 pages. Each one of the pages 
can be located anywhere within the maximum 4 giga- 
byte physical address space of the 386 DX. In addi- 
tion, since CR3 (the Page Directory Base Register) 
is loaded by a task switch, each Virtual Mode task 
can use a different mapping scheme to map pages 
to different physical locations. Finally, the paging 
hardware allows the sharing of the 8086 operating
system code between multiple 8086 applications.
Figure 4-24 shows how the 386 DX paging hardware
enables multiple 8086 programs to run under a virtual
memory demand paged system.


Figure 4-24. Virtual 8086 Environment Memory Management


4.6.4 Protection and I/O Permission 
Bitmap 


All Virtual 8086 Mode programs execute at privilege 
level 3, the level of least privilege. As such, Virtual 
8086 Mode programs are subject to all of the protec- 
tion checks defined in Protected Mode. (This is dif- 
ferent from Real Mode which implicitly is executing 
at privilege level 0, the level of greatest privilege.) 
Thus, an attempt to execute a privileged instruction
when in Virtual 8086 Mode will cause an exception
13 fault.


The following are privileged instructions, which may 
be executed only at Privilege Level 0. Therefore,
attempting to execute these instructions in Virtual
8086 Mode (or anytime CPL > 0) causes an exception
13 fault:


LIDT;      MOV DRn,reg;      MOV reg,DRn;
LGDT;      MOV TRn,reg;      MOV reg,TRn;
LMSW;      MOV CRn,reg;      MOV reg,CRn.
CLTS;
HLT;


Several instructions, particularly those applying to 
the multitasking model and protection model, are 
available only in Protected Mode. Therefore, at- 
tempting to execute the following instructions in 
Real Mode or in Virtual 8086 Mode generates an 
exception 6 fault:


LTR; STR; 
LLDT ; SLDT ; 
LAR ; VERR ; 
LSL ; VERW ; 
ARPL. 


The instructions which are IOPL-sensitive in Protected
Mode are:


IN;        STI;
OUT;       CLI;
INS;
OUTS;
REP INS;
REP OUTS;


In Virtual 8086 Mode, a slightly different set of instructions
is made IOPL-sensitive. The following instructions
are IOPL-sensitive in Virtual 8086 Mode:


INT n;     STI;
PUSHF;     CLI;
POPF;      IRET


The PUSHF, POPF, and IRET instructions are IOPL-sensitive
in Virtual 8086 Mode only. This provision
allows the IF flag (interrupt enable flag) to be virtualized
to the Virtual 8086 Mode program. The INT n
software interrupt instruction is also IOPL-sensitive
in Virtual 8086 Mode. Note, however, that the INT 3
(opcode 0CCH), INTO, and BOUND instructions are
not IOPL-sensitive in Virtual 8086 mode (they aren’t
IOPL-sensitive in Protected Mode either).


Note that the I/O instructions (IN, OUT, INS, OUTS,
REP INS, and REP OUTS) are not IOPL-sensitive in
Virtual 8086 mode. Rather, the I/O instructions become
automatically sensitive to the I/O Permission
Bitmap contained in the 386 DX Task State Segment.
The I/O Permission Bitmap, automatically
used by the 386 DX in Virtual 8086 Mode, is illustrated
by Figures 4-15a and 4-15b.


The I/O Permission Bitmap can be viewed as a 0-64 Kbit
bit string, which begins in memory at offset
Bit_Map_Offset in the current TSS. Bit_Map_Offset
must be ≤ DFFFH so the entire bit map and
the byte FFH which follows the bit map are all at
offsets ≤ FFFFH from the TSS base. The 16-bit
pointer Bit_Map_Offset (15:0) is found in the word
beginning at offset 66H (102 decimal) from the TSS
base, as shown in Figure 4-15a.


Each bit in the I/O Permission Bitmap corresponds
to a single byte-wide I/O port, as illustrated in Figure
4-15a. If a bit is 0, I/O to the corresponding byte-wide
port can occur without generating an exception.
Otherwise the I/O instruction causes an exception
13 fault. Since every byte-wide I/O port must be
protectable, all bits corresponding to a word-wide or
dword-wide port must be 0 for the word-wide or
dword-wide I/O to be permitted. If all the referenced
bits are 0, the I/O will be allowed. If any referenced
bits are 1, the attempted I/O will cause an exception
13 fault.


Due to the use of a pointer to the base of the I/O 
Permission Bitmap, the bitmap may be located any- 
where within the TSS, or may be ignored completely 
by pointing the Bit_Map_Offset (15:0) beyond the
limit of the TSS segment. In the same manner, only 
a small portion of the 64K I/O space need have an 
associated map bit, by adjusting the TSS limit to 
truncate the bitmap. This eliminates the commitment 
of 8K of memory when a complete bitmap is not 
required, while allowing the fully general case if 
desired. 


EXAMPLE OF BITMAP FOR I/O PORTS 0-255:
Setting the TSS limit to {Bit_Map_Offset + 31
+ 1**} [** see note below] will allow a 32-byte bitmap
for the I/O ports #0-255, plus a terminator
byte of all 1’s [** see note below]. This allows the
I/O bitmap to control I/O Permission to I/O ports 0-255
while causing an exception 13 fault on attempted
I/O to any I/O port 256 through 65,535.


**IMPORTANT IMPLEMENTATION NOTE: Beyond
the last byte of I/O mapping information in the I/O
Permission Bitmap must be a byte containing all 1’s.
The byte of all 1’s must be within the limit of the 386
DX TSS segment (see Figure 4-15a).
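

The permission check described in this section can be approximated in software as shown below. This is a simplified model under stated assumptions: the TSS is a Python bytes-like object, port n is taken to map to bit (n mod 8) of bitmap byte (n div 8), and a map byte lying beyond the TSS limit is treated as denying access. The function and parameter names are hypothetical.

```python
def io_permitted(tss, bitmap_offset, tss_limit, port, width=1):
    """Return True if I/O of the given width (1, 2 or 4 bytes) is allowed."""
    for p in range(port, port + width):      # every byte-wide port is checked
        byte_index = bitmap_offset + p // 8
        if byte_index > tss_limit:
            return False                     # no map bit: exception 13
        if tss[byte_index] & (1 << (p % 8)):
            return False                     # bit is 1: exception 13
    return True                              # all referenced bits are 0
```

With a 32-byte bitmap followed by the terminator byte of all 1's, as in the example above, ports 0-255 are governed by their individual bits and every higher port is refused.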


4.6.5 Interrupt Handling 


In order to fully support the emulation of an 8086
machine, interrupts in Virtual 8086 Mode are handled
in a unique fashion. When running in Virtual
Mode all interrupts and exceptions involve a privilege
change back to the host 386 DX operating system.
The 386 DX operating system determines if the
interrupt comes from a Protected Mode application
or from a Virtual Mode program by examining the
VM bit in the EFLAGS image stored on the stack.


When a Virtual Mode program is interrupted and ex- 
ecution passes to the interrupt routine at level 0, the 
VM bit is cleared. However, the VM bit is still set in 
the EFLAGS image on the stack.
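

A handler's test of the saved flags might look like the following Python sketch. The function name is hypothetical; the only architectural detail used is that VM occupies bit 17 of EFLAGS.

```python
EFLAGS_VM = 1 << 17   # VM (Virtual 8086 Mode) flag, bit 17 of EFLAGS


def interrupted_in_virtual_mode(saved_eflags):
    """True if the EFLAGS image on the stack shows a Virtual Mode program."""
    return bool(saved_eflags & EFLAGS_VM)
```

The test is made on the stacked image rather than on the live flags because, as noted above, the VM bit in the processor itself has already been cleared by the time the handler runs.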


The 386 DX operating system in turn handles the 
exception or interrupt and then returns control to the 
8086 program. The 386 DX operating system may 
choose to let the 8086 operating system handle the 
interrupt or it may emulate the function of the inter- 
rupt handler. For example, many 8086 operating 
system calls are accessed by PUSHing parameters 
on the stack, and then executing an INT n instruction.
If the IOPL is set to 0 then all INT n instructions
will be intercepted by the 386 DX
operating system. The 386 DX operating system
could emulate the 8086 operating system’s call.
Figure 4-25 shows how the 386 DX operating system
could intercept an 8086 operating system’s call to
“Open a File”.


A 386 DX operating system can provide a Virtual 
8086 Environment which is totally transparent to the 
application software via intercepting and then emu- 
lating 8086 operating system’s calls, and intercept- 
ing IN and OUT instructions. 


4.6.6 Entering and Leaving Virtual
8086 Mode 


Virtual 8086 mode is entered by executing an IRET
instruction (at CPL=0), or Task Switch (at any CPL)
to a 386 DX task whose 386 DX TSS has a FLAGS
image containing a 1 in the VM bit position while the
processor is executing in Protected Mode. That is,
one way to enter Virtual 8086 mode is to switch to a
task with a 386 DX TSS that has a 1 in the VM bit in
the EFLAGS image. The other way is to execute a
32-bit IRET instruction at privilege level 0, where the
stack has a 1 in the VM bit in the EFLAGS image.
POPF does not affect the VM bit, even if the processor
is in Protected Mode or level 0, and so cannot be
used to enter Virtual 8086 Mode. PUSHF always
pushes a 0 in the VM bit, even if the processor is in
Virtual 8086 Mode, so that a program cannot tell if it
is executing in REAL mode, or in Virtual 8086 mode.


The VM bit can be set by executing an IRET instruction
only at privilege level 0, or by any instruction or
Interrupt which causes a task switch in Protected
Mode (with VM=1 in the new FLAGS image), and
can be cleared only by an interrupt or exception in
Virtual 8086 Mode. IRET and POPF instructions executed
in REAL mode or Virtual 8086 mode will not
change the value in the VM bit.


The transition out of virtual 8086 mode to 386 DX
protected mode occurs only on receipt of an interrupt
or exception (such as due to a sensitive instruction).
In Virtual 8086 mode, all interrupts and exceptions
vector through the protected mode IDT, and
enter an interrupt handler in protected 386 DX
mode. That is, as part of interrupt processing, the
VM bit is cleared.


Because the matching IRET must occur from level 0,
if an Interrupt or Trap Gate is used to field an interrupt
or exception out of Virtual 8086 mode, the Gate
must perform an inter-level interrupt only to level 0.
Interrupt or Trap Gates through conforming segments,
or through segments with DPL > 0, will raise a
GP fault with the CS selector as the error code.


4.6.6.1 TASK SWITCHES TO/FROM VIRTUAL 
8086 MODE


Tasks which can execute in virtual 8086 mode must 
be described by a TSS with the new 386 DX format 
(TYPE 9 or 11 descriptor). 


A task switch out of virtual 8086 mode will operate
exactly the same as any other task switch out of a
task with a 386 DX TSS. All of the programmer visible
state, including the FLAGS register with the VM
bit set to 1, is stored in the TSS. The segment
registers in the TSS will contain 8086 segment base
values rather than selectors.


A task switch into a task described by a 386 DX TSS
will have an additional check to determine if the in- 
coming task should be resumed in virtual 8086 
mode. Tasks described by 80286 format TSSs can- 
not be resumed in virtual 8086 mode, so no check is 
required there (the FLAGS image in 80286 format 
TSS has only the low order 16 FLAGS bits). Before 
loading the segment register images from a 386 DX 
TSS, the FLAGS image is loaded, so that the seg- 
ment registers are loaded from the TSS image as 
8086 segment base values. The task is now ready to 
resume in virtual 8086 execution mode.


4.6.6.2 TRANSITIONS THROUGH TRAP AND 
INTERRUPT GATES, AND IRET 


A task switch is one way to enter or exit virtual 8086
mode. The other method is to exit through a Trap or
Interrupt gate, as part of handling an interrupt, and
to enter as part of executing an IRET instruction.
The transition out must use a 386 DX Interrupt Gate
(Type 14), or 386 DX Trap Gate (Type 15), which
must point to a non-conforming level 0 segment
(DPL=0) in order to permit the trap handler to IRET
back to the Virtual 8086 program. The Gate must
point to a non-conforming level 0 segment to perform
a level switch to level 0 so that the matching
IRET can change the VM bit. 386 DX gates must be
used, since 80286 gates save only the low 16 bits of
the FLAGS register, so that the VM bit will not be
saved on transitions through the 80286 gates. Also,
the 16-bit IRET (presumably) used to terminate the
80286 interrupt handler will pop only the lower 16
bits from FLAGS, and will not affect the VM bit. The
action taken for a 386 DX Trap or Interrupt gate if an
interrupt occurs while the task is executing in virtual
8086 mode is given by the following sequence.


(1) Save the FLAGS register in a temp to push later. 
Turn off the VM and TF bits, and if the interrupt is 
serviced by an Interrupt Gate, turn off IF also. 


(2) Interrupt and Trap gates must perform a level 
switch from 3 (where the VM86 program exe- 
cutes) to level 0 (so IRET can return). This pro- 
cess involves a stack switch to the stack given in 
the TSS for privilege level 0. Save the Virtual 
8086 Mode SS and ESP registers to push in a 
later step. The segment register load of SS will 
be done as a Protected Mode segment load, 
since the VM bit was turned off above. 


(3) Push the 8086 segment register values onto the 
new stack, in the order: GS, FS, DS, ES. These 
are pushed as 32-bit quantities, with undefined 
values in the upper 16 bits. Then load these 4 
registers with null selectors (0). 


Figure 4-25. Virtual 8086 Environment Interrupt and Call Handling


(4) Push the old 8086 stack pointer onto the new 
stack by pushing the SS register (as 32-bits, high 
bits undefined), then pushing the 32-bit ESP reg- 
ister saved above. 


(5) Push the 32-bit FLAGS register saved in step 1. 


(6) Push the old 8086 instruction pointer onto the 
new stack by pushing the CS register (as 32-bits, 
high bits undefined), then pushing the 32-bit EIP 
register. 


(7) Load up the new CS:EIP value from the interrupt 
gate, and begin execution of the interrupt routine 
in protected 386 DX mode. 
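

The push sequence in steps (3) through (6) fixes the layout of the level 0 stack seen by the handler. The Python sketch below derives each saved value's offset from the handler's ESP. It assumes nine 32-bit pushes and no error code (some exceptions push one in addition); the names are hypothetical.

```python
# Order in which the processor pushes values, per steps (3) to (6).
PUSH_ORDER = ["GS", "FS", "DS", "ES", "SS", "ESP", "EFLAGS", "CS", "EIP"]


def v86_interrupt_frame():
    """Map each saved value to its byte offset from ESP on handler entry."""
    n = len(PUSH_ORDER)
    # The last value pushed (EIP) sits at offset 0; each earlier push is 4 higher.
    return {name: 4 * (n - 1 - i) for i, name in enumerate(PUSH_ORDER)}
```

The result places EIP at offset 0, CS at 4, EFLAGS at 8, the old ESP at 12, SS at 16, and ES, DS, FS and GS at 20, 24, 28 and 32 respectively.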


The transition out of virtual 8086 mode performs a 
level change and stack switch, in addition to chang- 
ing back to protected mode. In addition, all of the 
8086 segment register images are stored on the 
stack (behind the SS:ESP image), and then loaded 
with null (0) selectors before entering the interrupt 
handler. This will permit the handler to safely save 
and restore the DS, ES, FS, and GS registers as 
80286 selectors. This is needed so that interrupt 
handlers which don’t care about the mode of the 
interrupted program can use the same prolog and 
epilog code for state saving (i.e. push all registers in 
prolog, pop all in epilog) regardless of whether or not 
a ‘native’ mode or Virtual 8086 mode program was 
interrupted. Restoring null selectors to these regis- 
ters before executing the IRET will not cause a trap 
in the interrupt handler. Interrupt routines which expect
values in the segment registers, or return values
in segment registers will have to obtain/return
values from the 8086 register images pushed onto
the new stack. They will need to know the mode of
the interrupted program in order to know where to
find/return segment registers, and also to know how
to interpret segment register values.


The IRET instruction will perform the inverse of the
above sequence. Only the extended 386 DX IRET
instruction (operand size=32) can be used, and it
must be executed at level 0 to change the VM bit to
1.


(1) If the NT bit in the FLAGS register is on, an inter-task
return is performed. The current state is
stored in the current TSS, and the link field in the 
current TSS is used to locate the TSS for the 
interrupted task which is to be resumed. 


Otherwise, continue with the following sequence. 


(2) Read the FLAGS image from SS:8[ESP] into the 
FLAGS register. This will set VM to the value ac- 
tive in the interrupted routine. 


(3) Pop off the instruction pointer CS:EIP. EIP is 
popped first, then a 32-bit word is popped which 
contains the CS value in the lower 16 bits. If 
VM=0, this CS load is done as a protected 
mode segment load. If VM=1, this will be done 
as an 8086 segment load. 

(4) Increment the ESP register by 4 to bypass the
FLAGS image which was “popped” in step 2.

(5) If VM=1, load segment registers ES, DS, FS,
and GS from memory locations SS:[ESP + 8],
SS:[ESP + 12], SS:[ESP + 16], and
SS:[ESP + 20], respectively, where the new value
of ESP stored in step 4 is used. Since VM=1,
these are done as 8086 segment register loads.

Else if VM=0, check that the selectors in ES,
DS, FS, and GS are valid in the interrupted routine.
Null out invalid selectors to trap if an attempt
is made to access through them.


(6) If (RPL(CS) > CPL), pop the stack pointer
SS:ESP from the stack. The ESP register is
popped first, followed by 32 bits containing SS in
the lower 16 bits. If VM=0, SS is loaded as a
protected mode segment register load. If VM=1,
an 8086 segment register load is used.


(7) Resume execution of the interrupted routine. The
VM bit in the FLAGS register (restored from the
interrupt routine’s stack image in step 2) determines
whether the processor resumes the interrupted
routine in Protected mode or Virtual 8086
mode.


5. FUNCTIONAL DATA 
5.1 INTRODUCTION 


The 386 DX features a straightforward functional in- 
terface to the external hardware. The 386 DX has 
separate, parallel buses for data and address. The 
data bus is 32-bits in width, and bidirectional. The 
address bus outputs 32-bit address values in the 
most directly usable form for the high-speed local 
bus: 4 individual byte enable signals, and the 30 up- 
per-order bits as a binary value. The data and ad- 
dress buses are interpreted and controlled with their 
associated control signals. 


A dynamic data bus sizing feature allows the processor
to handle a mix of 32- and 16-bit external buses
on a cycle-by-cycle basis (see 5.3.4 Data Bus
Sizing). If 16-bit bus size is selected, the 386 DX
automatically makes any adjustment needed, even
performing another 16-bit bus cycle to complete the
transfer if that is necessary. 8-bit peripheral devices
may be connected to 32-bit or 16-bit buses with no 
loss of performance. A new address pipelining op- 
tion is provided and applies to 32-bit and 16-bit bus- 
es for substantially improved memory utilization, es- 
pecially for the most heavily used memory resourc- 
es. 


The address pipelining option, when selected, typically
allows a given memory interface to operate
with one less wait state than would otherwise be
required (see 5.4.2 Address Pipelining). The pipelined
bus is also well suited to interleaved memory
designs. When address pipelining is requested by 
the external hardware, the 386 DX will output the 
address and bus cycle definition of the next bus cy- 
cle (if it is internally available) even while waiting for 
the current cycle to be acknowledged. 


Non-pipelined address timing, however, is ideal for
external cache designs, since the cache memory will 
typically be fast enough to allow non-pipelined cy- 
cles. For maximum design flexibility, the address 
pipelining option is selectable on a cycle-by-cycle 
basis. 


The processor’s bus cycle is the basic mechanism
for information transfer, either from system to processor,
or from processor to system. 386 DX bus cycles
perform data transfer in a minimum of only two
clock periods. On a 32-bit data bus, the maximum
386 DX transfer bandwidth at 20 MHz is therefore
40 Mbytes/sec, at 25 MHz it is
50 Mbytes/sec, and at 33 MHz it is
66 Mbytes/sec. Any bus cycle will be extended for
more than two clock periods, however, if external
hardware withholds acknowledgement of the cycle.
At the appropriate time, acknowledgement is signalled
by asserting the 386 DX READY# input.
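

The bandwidth figures above follow directly from the bus width and the two-clock minimum cycle, as this Python sketch shows. The function name and parameters are hypothetical, and clock_hz is taken to be the processor clock frequency (20, 25 or 33 MHz), not the CLK2 input frequency.

```python
def peak_bandwidth_bytes_per_sec(clock_hz, bus_bytes=4, clocks_per_cycle=2):
    """Maximum transfer rate when every bus cycle completes with no wait states."""
    return clock_hz * bus_bytes // clocks_per_cycle
```

At 20 MHz this gives 40,000,000 bytes per second. Any cycle extended by a withheld READY# lowers the rate actually achieved.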


The 386 DX can relinquish control of its local buses 
to allow mastership by other devices, such as direct 
memory access channels. When relinquished, HLDA 
is the only output pin driven by the 386 DX providing 
near-complete isolation of the processor from its 
system. The near-complete isolation characteristic is 
ideal when driving the system from test equipment, 
and in fault-tolerant applications. 


Functional data covered in this chapter describes
the processor’s hardware interface. First, the set of
signals available at the processor pins is described
(see 5.2 Signal Description). Following that are the
signal waveforms occurring during bus cycles (see
5.3 Bus Transfer Mechanism, 5.4 Bus Functional
Description and 5.5 Other Functional Descriptions).


5.2 SIGNAL DESCRIPTION 


5.2.1 Introduction 


Ahead is a brief description of the 386 DX input and 
output signals arranged by functional groups. Note 
the # symbol at the end of a signal name indicates 
the active, or asserted, state occurs when the signal 
is at a low voltage. When no # is present after the 
signal name, the signal is asserted when at the high 
voltage level.


Example signal: M/IO#
    High voltage indicates Memory selected
    Low voltage indicates I/O selected


Figure 5-1. Functional Signal Groups


The signal descriptions sometimes refer to AC timing
parameters, such as “t25 Reset Setup Time” and “t26 Reset
Hold Time.” The values of these parameters can be found in
Tables 7-4 and 7-5.


5.2.2 Clock (CLK2) 


CLK2 provides the fundamental timing for the 386 
DX. It is divided by two internally to generate the 
internal processor clock used for instruction execu- 
tion. The internal clock is comprised of two phases, 
“phase one” and “phase two.” Each CLK2 period is 
a phase of the internal clock. Figure 5-2 illustrates 
the relationship. If desired, the phase of the internal 
processor clock can be synchronized to a known 
phase by ensuring the RESET signal falling edge 
meets its applicable setup and hold times, t25 and
t26.


5.2.3 Data Bus (D0 through D31)


These three-state bidirectional signals provide the general
purpose data path between the 386 DX and other devices. Data
bus inputs and outputs indicate “1” when HIGH. The data bus
can transfer data on 32- and 16-bit buses using a data bus
sizing feature controlled by the BS16# input. See section
5.2.6 Bus Control. Data bus reads require that read data
setup and hold times t21 and t22 be met for correct
operation. In addition, the 386 DX requires that all data
bus pins be at a valid logic state (high or low) at the end
of each read cycle, when READY# is asserted. During any
write operation (and during halt cycles and shutdown
cycles), the 386 DX always drives all 32 signals of the data
bus even if the current bus size is 16 bits.


Figure 5-2. CLK2 Signal and Internal Processor Clock


5.2.4 Address Bus (BE0# through
BE3#, A2 through A31)


These three-state outputs provide physical memory addresses
or I/O port addresses. The address bus is capable of
addressing 4 gigabytes of physical memory space (00000000H
through FFFFFFFFH), and 64 kilobytes of I/O address space
(00000000H through 0000FFFFH) for programmed I/O. I/O
transfers automatically generated for 386 DX-to-coprocessor
communication use I/O addresses 800000F8H through 800000FFH,
so A31 HIGH in conjunction with M/IO# LOW allows simple
generation of the coprocessor select signal.
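As an illustration of the coprocessor select described above, the decode reduces to two signals. The sketch below models signal levels as booleans (True = HIGH); the function names are hypothetical and stand in for what would be a single gate in hardware.

```python
def coprocessor_select(a31_high, m_io_high):
    """True when A31 is HIGH and M/IO# is LOW (an I/O cycle above 64 kbytes)."""
    return bool(a31_high) and not bool(m_io_high)


def in_coprocessor_io_range(io_address):
    """True for the I/O addresses used for automatic coprocessor cycles."""
    return 0x800000F8 <= io_address <= 0x800000FF


if __name__ == "__main__":
    # A31 of 800000F8H is HIGH; programmed I/O (0000xxxxH) always has A31 LOW.
    a31 = bool(0x800000F8 >> 31)
    print(coprocessor_select(a31, m_io_high=False))
```

Because programmed I/O never exceeds 0000FFFFH, A31 alone distinguishes coprocessor cycles from ordinary I/O cycles once M/IO# is known to be LOW.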


The Byte Enable outputs, BE0#-BE3#, directly indicate which
bytes of the 32-bit data bus are involved with the current
transfer. This is most convenient for external hardware.


BE0# applies to D0-D7
BE1# applies to D8-D15
BE2# applies to D16-D23
BE3# applies to D24-D31


The number of Byte Enables asserted indicates the physical
size of the operand being transferred (1, 2, 3, or 4 bytes).
Refer to section 5.3.6 Operand Alignment.


When a memory write cycle or I/O write cycle is in progress,
and the operand being transferred occupies only the upper
16 bits of the data bus (D16-D31), duplicate data is
simultaneously presented on the corresponding lower 16 bits
of the data bus (D0-D15). This duplication is performed for
optimum write performance on 16-bit buses. The pattern of
write data duplication is a function of the Byte Enables
asserted during the write cycle. Table 5-1 lists the write
data present on D0-D31, as a function of the asserted Byte
Enable outputs BE0#-BE3#.
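The duplication rule just described can be expressed compactly. The following is a sketch only: Byte Enables are passed as booleans in the order BE0# to BE3# with True meaning asserted (low), the letters A to D follow the legend of Table 5-1, and None stands for an undefined byte lane. The function name is hypothetical.

```python
LOGICAL_BYTES = "ABCD"  # A = d0-d7, B = d8-d15, C = d16-d23, D = d24-d31


def write_data_lanes(be_asserted):
    """Return ([D0-D7, D8-D15, D16-D23, D24-D31], duplicated?) for a write.

    be_asserted is (BE0#, BE1#, BE2#, BE3#), True when the enable is asserted.
    """
    lanes = [LOGICAL_BYTES[i] if be_asserted[i] else None for i in range(4)]
    upper_only = (not be_asserted[0] and not be_asserted[1]
                  and (be_asserted[2] or be_asserted[3]))
    if upper_only:
        # Operand occupies only D16-D31: copy it onto D0-D15 as well.
        lanes[0], lanes[1] = lanes[2], lanes[3]
    return lanes, upper_only


if __name__ == "__main__":
    print(write_data_lanes((False, False, True, True)))
```

Duplication occurs only when neither BE0# nor BE1# is asserted, which is the case in which a 16-bit device would otherwise never see the data.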


Table 5-1. Write Data Duplication as a Function of BE0#-BE3#

   386™ DX Byte Enables           386™ DX Write Data            Automatic
BE3#  BE2#  BE1#  BE0#    D24-D31  D16-D23  D8-D15  D0-D7      Duplication?

High  High  High  Low     undef    undef    undef   A          No
High  High  Low   High    undef    undef    B       undef      No
High  Low   High  High    undef    C        undef   C          Yes
Low   High  High  High    D        undef    D       undef      Yes
High  High  Low   Low     undef    undef    B       A          No
High  Low   Low   High    undef    C        B       undef      No
Low   Low   High  High    D        C        D       C          Yes
High  Low   Low   Low     undef    C        B       A          No
Low   Low   Low   High    D        C        B       undef      No
Low   Low   Low   Low     D        C        B       A          No

Key:
D = logical write data d24-d31
C = logical write data d16-d23
B = logical write data d8-d15
A = logical write data d0-d7


5.2.5 Bus Cycle Definition Signals
(W/R#, D/C#, M/IO#, LOCK#)


These three-state outputs define the type of bus cycle being
performed. W/R# distinguishes between write and read cycles.
D/C# distinguishes between data and control cycles. M/IO#
distinguishes between memory and I/O cycles. LOCK#
distinguishes between locked and unlocked bus cycles.


The primary bus cycle definition signals are W/R#, D/C# and
M/IO#, since these are the signals driven valid as the ADS#
(Address Status output) is driven asserted. LOCK# is driven
valid at the same time as the first locked bus cycle begins,
which due to address pipelining, could be later than ADS# is
driven asserted. See 5.4.3.4 Pipelined Address. LOCK# is
negated when the READY# input terminates the last bus cycle
which was locked.


Exact bus cycle definitions, as a function of W/R#, D/C#,
and M/IO#, are given in Table 5-2. Note one combination of
W/R#, D/C# and M/IO# is never given when ADS# is asserted
(however, that combination, which is listed as “does not
occur,” may occur during idle bus states when ADS# is not
asserted). If M/IO#, D/C#, and W/R# are qualified by ADS#
asserted, then a decoding scheme may be simplified by using
this definition of the “does not occur” combination.
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Table 5-2. Bus Cycle Definition 


M/IO#  D/C#  W/R#   Bus Cycle Type                         Locked?

Low    Low   Low    INTERRUPT ACKNOWLEDGE                  Yes
Low    Low   High   does not occur                         -
Low    High  Low    I/O DATA READ                          No
Low    High  High   I/O DATA WRITE                         No
High   Low   Low    MEMORY CODE READ                       No
High   Low   High   HALT: Address = 2                      No
                    (BE0# High, BE1# High, BE2# Low,
                    BE3# High, A2-A31 Low)
                    SHUTDOWN: Address = 0
                    (BE0# Low, BE1# High, BE2# High,
                    BE3# High, A2-A31 Low)
High   High  Low    MEMORY DATA READ                       Some Cycles
High   High  High   MEMORY DATA WRITE                      Some Cycles


5.2.6 Bus Control Signals (ADS#,
READY#, NA#, BS16#)


5.2.6.1 INTRODUCTION


The following signals allow the processor to indicate when a
bus cycle has begun, and allow other system hardware to
control address pipelining, data bus width and bus cycle
termination.


5.2.6.2 ADDRESS STATUS (ADS#)


This three-state output indicates that a valid bus cycle
definition, and address (W/R#, D/C#, M/IO#, BE0#-BE3#, and
A2-A31) is being driven at the 386 DX pins. It is asserted
during T1 and T2P bus states (see 5.4.3.2 Non-pipelined
Address and 5.4.3.4 Pipelined Address for additional
information on bus states).


5.2.6.3 TRANSFER ACKNOWLEDGE (READY#)


This input indicates the current bus cycle is complete, and
the active bytes indicated by BE0#-BE3# and BS16# are
accepted or provided. When READY# is sampled asserted during
a read cycle or interrupt acknowledge cycle, the 386 DX
latches the input data and terminates the cycle. When READY#
is sampled asserted during a write cycle, the processor
terminates the bus cycle.


READY# is ignored on the first bus state of all bus cycles,
and sampled each bus state thereafter until asserted. READY#
must eventually be asserted to acknowledge every bus cycle,
including Halt Indication and Shutdown Indication bus
cycles. When being sampled, READY# must always meet setup
and hold times t19 and t20 for correct operation. See all
sections of 5.4 Bus Functional Description.
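The READY# sampling rule (ignored in the first bus state, then sampled every bus state until asserted) determines how long a cycle lasts. A minimal sketch follows, assuming one boolean READY# sample per bus state with True meaning asserted; the function names are hypothetical.

```python
def bus_states_in_cycle(ready_samples):
    """Count bus states in a cycle given READY# per bus state.

    ready_samples[0] belongs to the first bus state and is ignored.
    """
    for index, asserted in enumerate(ready_samples):
        if index == 0:
            continue  # READY# is ignored on the first bus state
        if asserted:
            return index + 1
    raise ValueError("READY# never asserted; the bus cycle has not ended")


def wait_states(ready_samples):
    """Bus states beyond the two-state minimum."""
    return bus_states_in_cycle(ready_samples) - 2


if __name__ == "__main__":
    print(bus_states_in_cycle([False, True]))         # fastest cycle
    print(wait_states([False, False, False, True]))   # slower device
```

A sample list in which READY# is never asserted after the first state raises an error, mirroring the requirement that every bus cycle must eventually be acknowledged.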


5.2.6.4 NEXT ADDRESS REQUEST (NA#) 


This is used to request address pipelining. This input
indicates the system is prepared to accept new values of
BE0#-BE3#, A2-A31, W/R#, D/C# and M/IO# from the 386 DX even
if the end of the current cycle is not being acknowledged on
READY#. If this input is asserted when sampled, the next
address is driven onto the bus, provided the next bus
request is already pending internally. See 5.4.2 Address
Pipelining and 5.4.3 Read and Write Cycles. NA# must always
meet setup and hold times, t15 and t16, for correct
operation.


5.2.6.5 BUS SIZE 16 (BS16#) 


The BS16# feature allows the 386 DX to directly connect to
32-bit and 16-bit data buses. Asserting this input
constrains the current bus cycle to use only the lower-order
half (D0-D15) of the data bus, corresponding to BE0# and
BE1#. Asserting BS16# has no additional effect if only BE0#
and/or BE1# are asserted in the current cycle. However,
during bus cycles asserting BE2# or BE3#, asserting BS16#
will automatically cause the 386 DX to make adjustments for
correct transfer of the upper byte(s) using only physical
data signals D0-D15.


If the operand spans both halves of the data bus and BS16#
is asserted, the 386 DX will automatically perform another
16-bit bus cycle. BS16# must always meet setup and hold
times t17 and t18 for correct operation.


386 DX I/O cycles are automatically generated for
coprocessor communication. Since the 386 DX must transfer
32-bit quantities between itself and the 387 DX, BS16# must
not be asserted during 387 DX communication cycles.


5.2.7 Bus Arbitration Signals
(HOLD, HLDA)


5.2.7.1 INTRODUCTION 


This section describes the mechanism by which the processor
relinquishes control of its local buses when requested by
another bus master device. See 5.5.1 Entering and Exiting
Hold Acknowledge for additional information.


5.2.7.2 BUS HOLD REQUEST (HOLD) 


This input indicates some device other than the 386 DX
requires bus mastership.


HOLD must remain asserted as long as any other device is a
local bus master. HOLD is not recognized while RESET is
asserted. If RESET is asserted while HOLD is asserted, RESET
has priority and places the bus into an idle state, rather
than the hold acknowledge (high impedance) state.


HOLD is level-sensitive and is a synchronous input. HOLD
signals must always meet setup and hold times t23 and t24
for correct operation.


5.2.7.3 BUS HOLD ACKNOWLEDGE (HLDA)


Assertion of this output indicates the 386 DX has
relinquished control of its local bus in response to HOLD
asserted, and is in the bus Hold Acknowledge state.


The Hold Acknowledge state offers near-complete signal
isolation. In the Hold Acknowledge state, HLDA is the only
signal being driven by the 386 DX. The other output signals
or bidirectional signals (D0-D31, BE0#-BE3#, A2-A31, W/R#,
D/C#, M/IO#, LOCK# and ADS#) are in a high-impedance state
so the requesting bus master may control them. Pullup
resistors may be desired on several signals to avoid
spurious activity when no bus master is driving them. See
7.2.3 Resistor Recommendations. Also, one rising edge
occurring on the NMI input during Hold Acknowledge is
remembered, for processing after the HOLD input is negated.


In addition to the normal usage of Hold Acknowledge with DMA
controllers or master peripherals, the near-complete
isolation is particularly attractive during system test,
when test equipment drives the system, and in
hardware-fault-tolerant applications.


5.2.8 Coprocessor Interface Signals
(PEREQ, BUSY#, ERROR#)


5.2.8.1 INTRODUCTION 


The following sections describe the signals dedicated to the
numeric coprocessor interface. In addition to the data bus,
address bus, and bus cycle definition signals, these signals
control communication between the 386 DX and its 387 DX
processor extension.


5.2.8.2 COPROCESSOR REQUEST (PEREQ) 


When asserted, this input signal indicates a coprocessor
request for a data operand to be transferred to/from memory
by the 386 DX. In response, the 386 DX transfers information
between the coprocessor and memory. Because the 386 DX has
internally stored the coprocessor opcode being executed, it
performs the requested data transfer with the correct
direction and memory address.


PEREQ is level-sensitive and is allowed to be asynchronous
to the CLK2 signal.


5.2.8.3 COPROCESSOR BUSY (BUSY #) 


When asserted, this input indicates the coprocessor is still
executing an instruction, and is not yet able to accept
another. When the 386 DX encounters any coprocessor
instruction which operates on the numeric stack (e.g. load,
pop, or arithmetic operation), or the WAIT instruction, this
input is first automatically sampled until it is seen to be
negated. This sampling of the BUSY# input prevents
overrunning the execution of a previous coprocessor
instruction.


The FNINIT and FNCLEX coprocessor instructions are allowed
to execute even if BUSY# is asserted, since these
instructions are used for coprocessor initialization and
exception-clearing.


BUSY# is level-sensitive and is allowed to be asynchronous
to the CLK2 signal.


BUSY# serves an additional function. If BUSY# is sampled LOW
at the falling edge of RESET, the 386 DX performs an
internal self-test (see 5.5.3 Bus Activity During and
Following Reset). If BUSY# is sampled HIGH, no self-test is
performed.


5.2.8.4 COPROCESSOR ERROR (ERROR#)


This input signal indicates that the previous coproc- 
essor instruction generated a coprocessor error of a 
type not masked by the coprocessor’s control regis- 
ter. This input is automatically sampled by the 386 
DX when a coprocessor instruction is encountered, 
and if asserted, the 386 DX generates exception 16 
to access the error-handling software. 


Several coprocessor instructions, generally those which
clear the numeric error flags in the coprocessor or save
coprocessor state, do execute without the 386 DX generating
exception 16 even if ERROR# is asserted. These instructions
are FNINIT, FNCLEX, FSTSW, FSTSWAX, FSTCW, FSTENV, FSAVE,
FESTENV and FESAVE.


ERROR# is level-sensitive and is allowed to be asynchronous
to the CLK2 signal.


5.2.9 Interrupt Signals (INTR, NMI, 
RESET) 


5.2.9.1 INTRODUCTION 


The following descriptions cover inputs that can in- 
terrupt or suspend execution of the processor’s cur- 
rent instruction stream. 


5.2.9.2 MASKABLE INTERRUPT REQUEST (INTR) 


When asserted, this input indicates a request for interrupt
service, which can be masked by the 386 DX Flag Register IF
bit. When the 386 DX responds to the INTR input, it performs
two interrupt acknowledge bus cycles, and at the end of the
second, latches an 8-bit interrupt vector on D0-D7 to
identify the source of the interrupt.


INTR is level-sensitive and is allowed to be asynchronous to
the CLK2 signal. To assure recognition of an INTR request,
INTR should remain asserted until the first interrupt
acknowledge bus cycle begins.


5.2.9.3 NON-MASKABLE INTERRUPT REQUEST 
(NMI) 


This input indicates a request for interrupt service, 
which cannot be masked by software. The non- 
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maskable interrupt request is always processed ac- 
cording to the pointer or gate in slot 2 of the interrupt 
table. Because of the fixed NMI slot assignment, no 
interrupt acknowledge cycles are performed when
processing NMI. 


NMI is rising edge-sensitive and is allowed to be 
asynchronous to the CLK2 signal. To assure recog- 
nition of NMI, it must be negated for at least eight 
CLK2 periods, and then be asserted for at least 
eight CLK2 periods. 


Once NMI processing has begun, no additional NMIs are
processed until after the next IRET instruction, which is
typically the end of the NMI service routine. If NMI is
re-asserted prior to that time, however, one rising edge on
NMI will be remembered for processing after executing the
next IRET instruction.


5.2.9.4 RESET (RESET) 


This input signal suspends any operation in progress and
places the 386 DX in a known reset state. The 386 DX is
reset by asserting RESET for 15 or more CLK2 periods (80 or
more CLK2 periods before requesting self test). When RESET
is asserted, all other input pins are ignored, and all other
bus pins are driven to an idle bus state as shown in
Table 5-3. If RESET and HOLD are both asserted at a point in
time, RESET takes priority even if the 386 DX was in a Hold
Acknowledge state prior to RESET asserted.


RESET is level-sensitive and must be synchronous 
to the CLK2 signal. If desired, the phase of the inter- 
nal processor clock, and the entire 386 DX state can 
be completely synchronized to external circuitry by 
ensuring the RESET signal falling edge meets its
applicable setup and hold times, t25 and t26.


Table 5-3. Pin State (Bus Idle) During Reset 


Signal Name      Signal Level During Reset

ADS#             High
D0-D31           High Impedance
BE0#-BE3#        Low
A2-A31           High
W/R#             Low
D/C#             High
M/IO#            Low
LOCK#            High
HLDA             Low


5.2.10 Signal Summary


Table 5-4 summarizes the characteristics of all 386 DX signals.


Table 5-4. 386™ DX Signal Summary

                                                         Input       Output
                                                         Synch or    High Impedance
Signal      Signal                     Active   Input/   Asynch      During
Name        Function                   State    Output   to CLK2     HLDA?

CLK2        Clock                      -        I        -           -
D0-D31      Data Bus                   High     I/O      S           Yes
BE0#-BE3#   Byte Enables               Low      O        -           Yes
A2-A31      Address Bus                High     O        -           Yes
W/R#        Write-Read Indication      High     O        -           Yes
D/C#        Data-Control Indication    High     O        -           Yes
M/IO#       Memory-I/O Indication      High     O        -           Yes
LOCK#       Bus Lock Indication        Low      O        -           Yes
ADS#        Address Status             Low      O        -           Yes
NA#         Next Address Request       Low      I        S           -
BS16#       Bus Size 16                Low      I        S           -
READY#      Transfer Acknowledge       Low      I        S           -
HOLD        Bus Hold Request           High     I        S           -
HLDA        Bus Hold Acknowledge       High     O        -           No
PEREQ       Coprocessor Request        High     I        A           -
BUSY#       Coprocessor Busy           Low      I        A           -
ERROR#      Coprocessor Error          Low      I        A           -
INTR        Maskable Interrupt Request High     I        A           -
NMI         Non-Maskable Intrpt Request High    I        A           -
RESET       Reset                      High     I        S           -


5.3 BUS TRANSFER MECHANISM


5.3.1 Introduction


All data transfers occur as a result of one or more bus
cycles. Logical data operands of byte, word and double-word
lengths may be transferred without restrictions on physical
address alignment. Any byte boundary may be used, although
two or even three physical bus cycles are performed as
required for unaligned operand transfers. See 5.3.4 Dynamic
Data Bus Sizing and 5.3.6 Operand Alignment.

The 386 DX address signals are designed to simplify external
system hardware. Higher-order address bits are provided by
A2-A31. Lower-order address in the form of BE0#-BE3#
directly provides linear selects for the four bytes of the
32-bit data bus. Physical operand size information is
thereby implicitly provided each bus cycle in the most
usable form.


Byte Enable outputs BE0#-BE3# are asserted when their
associated data bus bytes are involved with the present bus
cycle, as listed in Table 5-5. During a bus cycle, any
possible pattern of contiguous, asserted Byte Enable outputs
can occur, but never patterns having a negated Byte Enable
separating two or three asserted Enables.


Address bits A0 and A1 of the physical operand’s base
address can be created when necessary (for instance, for
MULTIBUS® I or MULTIBUS® II interface), as a function of the
lowest-order asserted Byte Enable. This is shown by
Table 5-6. Logic to generate A0 and A1 is given by
Figure 5-3.
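The rule for recreating A0 and A1 (they encode the position of the lowest-order asserted Byte Enable) can be sketched as follows. Byte Enables are given as booleans in the order BE0# to BE3#, True meaning asserted; the function names are hypothetical, and a31_a2 is assumed to hold address bits A31-A2 as a 30-bit integer.

```python
def low_address_bits(be_asserted):
    """Return (A1, A0) from the lowest-order asserted Byte Enable."""
    for index, asserted in enumerate(be_asserted):
        if asserted:
            return (index >> 1) & 1, index & 1
    raise ValueError("no Byte Enable asserted")


def physical_base_address(a31_a2, be_asserted):
    """Combine A31-A2 with the derived A1 and A0 into a byte address."""
    a1, a0 = low_address_bits(be_asserted)
    return (a31_a2 << 2) | (a1 << 1) | a0


if __name__ == "__main__":
    # BE2# and BE3# asserted: the operand starts at byte 2 of the Dword.
    print(low_address_bits((False, False, True, True)))
```

Higher-order Byte Enables do not affect the result, which is why they appear as don't-care entries when the mapping is tabulated.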


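Table 5-5. Byte Enables and Associated
Data and Operand Bytes

Byte Enable Signal    Associated Data Bus Signals

BE0#                  D0-D7     (byte 0, least significant)
BE1#                  D8-D15    (byte 1)
BE2#                  D16-D23   (byte 2)
BE3#                  D24-D31   (byte 3, most significant)


Table 5-6. Generating A0-A31 from
BE0#-BE3# and A2-A31

        386™ DX Address Signals               Physical Base Address
A31 ... A2   BE3#   BE2#   BE1#   BE0#        A31 ... A2   A1   A0

A31 ... A2   X      X      X      Low         A31 ... A2   0    0
A31 ... A2   X      X      Low    High        A31 ... A2   0    1
A31 ... A2   X      Low    High   High        A31 ... A2   1    0
A31 ... A2   Low    High   High   High        A31 ... A2   1    1


Figure 5-3. Logic to Generate A0, A1 from BE0#-BE3#
(K-map for A1 signal; K-map for A0 signal)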


Each bus cycle is composed of at least two bus 
states. Each bus state requires one processor clock 
period. Additional bus states added to a single bus 
cycle are called wait states. See 5.4 Bus Functional 
Description. 


Since a bus cycle requires a minimum of two bus 
states (equal to two processor clock periods), data 
can be transferred between external devices and 
the 386 DX at a maximum rate of one 4-byte Dword 
every two processor clock periods, for a maximum 
bus bandwidth of 66 megabytes/second (386 DX 
operating at 33 MHz processor clock rate). 


5.3.2 Memory and I/O Spaces


Bus cycles may access physical memory space or I/O space.
Peripheral devices in the system may either be
memory-mapped, or I/O-mapped, or both. As shown in
Figure 5-4, physical memory addresses range from 00000000H
to FFFFFFFFH (4 gigabytes) and I/O addresses from 00000000H
to 0000FFFFH (64 kilobytes) for programmed I/O. Note the I/O
addresses used by the automatic I/O cycles for coprocessor
communication are 800000F8H to 800000FFH, beyond the address
range of programmed I/O, to allow easy generation of a
coprocessor chip select signal using the A31 and M/IO#
signals.


Figure 5-4. Physical Memory and I/O Spaces

Physical Memory Space: 00000000H to FFFFFFFFH (4 Gbyte
    physical memory)
I/O Space: 00000000H to 0000FFFFH (64 kbyte accessible
    programmed I/O space); 800000F8H to 800000FFH
    (coprocessor, 387™ DX; see Note 1)

NOTE:
1. Since A31 is HIGH during automatic communication with
coprocessor, A31 HIGH and M/IO# LOW can be used to easily
generate a coprocessor select signal.


5.3.3 Memory and I/O Organization


The 386 DX datapath to memory and I/O spaces can be 32 bits
wide or 16 bits wide. When 32 bits wide, memory and I/O
spaces are organized naturally as arrays of physical 32-bit
Dwords. Each memory or I/O Dword has four individually
addressable bytes at consecutive byte addresses. The
lowest-addressed byte is associated with data signals D0-D7;
the highest-addressed byte with D24-D31.


The 386 DX includes a bus control input, BS16#, that also
allows direct connection to 16-bit memory or I/O spaces
organized as a sequence of 16-bit words. Cycles to 32-bit
and 16-bit memory or I/O devices may occur in any sequence,
since the BS16# control is sampled during each bus cycle.
See 5.3.4 Dynamic Data Bus Sizing. The Byte Enable signals,
BE0#-BE3#, allow byte granularity when addressing any memory
or I/O structure, whether 32 or 16 bits wide.


5.3.4 Dynamic Data Bus Sizing 


Dynamic data bus sizing is a feature allowing direct
processor connection to 32-bit or 16-bit data buses for
memory or I/O. A single processor may connect to both size
buses. Transfers to or from 32- or 16-bit ports are
supported by dynamically determining the bus width during
each bus cycle. During each bus cycle an address decoding
circuit or the slave device itself may assert BS16# for
16-bit ports, or negate BS16# for 32-bit ports.


With BS16# asserted, the processor automatically converts
operand transfers larger than 16 bits, or misaligned 16-bit
transfers, into two or three transfers as required. All
operand transfers physically occur on D0-D15 when BS16# is
asserted. Therefore, 16-bit memories or I/O devices only
connect on data signals D0-D15. No extra transceivers are
required.


Asserting BS16# only affects the processor when BE2# and/or
BE3# are asserted during the current cycle. If only D0-D15
are involved with the transfer, asserting BS16# has no
effect since the transfer can proceed normally over a 16-bit
bus whether BS16# is asserted or not. In other words,
asserting BS16# has no effect when only the lower half of
the bus is involved with the current cycle.


There are two types of situations where the processor is
affected by asserting BS16#, depending on which Byte Enables
are asserted during the current bus cycle:


Upper Half Only:
    Only BE2# and/or BE3# asserted.


Upper and Lower Half:
    At least BE1#, BE2# asserted (and perhaps also BE0#
    and/or BE3#).


Effect of asserting BS16# during “upper half only” read
cycles:
    Asserting BS16# during “upper half only” reads causes
    the 386 DX to read data on the lower 16 bits of the data
    bus and ignore data on the upper 16 bits of the data
    bus. Data that would have been read from D16-D31 (as
    indicated by BE2# and BE3#) will instead be read from
    D0-D15 respectively.


Effect of asserting BS16# during “upper half only” write
cycles:
    Asserting BS16# during “upper half only” writes does not
    affect the 386 DX. When only BE2# and/or BE3# are
    asserted during a write cycle the 386 DX always
    duplicates data signals D16-D31 onto D0-D15 (see
    Table 5-1). Therefore, no further 386 DX action is
    required to perform these writes on 32-bit or 16-bit
    buses.


Effect of asserting BS16# during “upper and lower half” read
cycles:
    Asserting BS16# during “upper and lower half” reads
    causes the processor to perform two 16-bit read cycles
    for complete physical operand transfer. Bytes 0 and 1
    (as indicated by BE0# and BE1#) are read on the first
    cycle using D0-D15. Bytes 2 and 3 (as indicated by BE2#
    and BE3#) are read during the second cycle, again using
    D0-D15. D16-D31 are ignored during both 16-bit cycles.
    BE0# and BE1# are always negated during the second
    16-bit cycle (see Figure 5-14, cycles 2 and 2a).




Effect of asserting BS16# during “upper and lower half”
write cycles:
    Asserting BS16# during “upper and lower half” writes
    causes the 386 DX to perform two 16-bit write cycles for
    complete physical operand transfer. All bytes are
    available the first write cycle allowing external
    hardware to receive Bytes 0 and 1 (as indicated by BE0#
    and BE1#) using D0-D15. On the second cycle the 386 DX
    duplicates Bytes 2 and 3 on D0-D15 and Bytes 2 and 3 (as
    indicated by BE2# and BE3#) are written using D0-D15.
    BE0# and BE1# are always negated during the second
    16-bit cycle. BS16# must be asserted during the second
    16-bit cycle. See Figure 5-14, cycles 1 and 1a.
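The four cases above share one pattern: a second 16-bit cycle is needed only when both halves of the bus are involved, and that second cycle keeps only BE2# and BE3#. The sketch below models this with Byte Enables as booleans (BE0# to BE3#, True = asserted); the function name is hypothetical and the model covers only the Byte Enable patterns, not timing.

```python
def cycles_with_bs16(be_asserted):
    """Byte Enable patterns driven when BS16# is asserted for an operand.

    Returns one tuple per 16-bit bus cycle, in the order the cycles occur.
    """
    lower = be_asserted[0] or be_asserted[1]
    upper = be_asserted[2] or be_asserted[3]
    if not (lower or upper):
        raise ValueError("no Byte Enable asserted")
    first = tuple(be_asserted)
    if lower and upper:
        # Second cycle: BE0# and BE1# negated, upper bytes moved via D0-D15.
        second = (False, False, be_asserted[2], be_asserted[3])
        return [first, second]
    return [first]


if __name__ == "__main__":
    print(cycles_with_bs16((True, True, True, True)))
```

"Upper half only" and "lower half only" operands complete in a single cycle, matching the statement that BS16# requires no further processor action in those cases.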


5.3.5 Interfacing with 32- and 16-Bit 
Memories 


In 32-bit-wide physical memories such as Figure 5-5, each
physical Dword begins at a byte address that is a multiple
of 4. A2-A31 are directly used as a Dword select and
BE0#-BE3# as byte selects. BS16# is negated for all bus
cycles involving the 32-bit array.


When 16-bit-wide physical arrays are included in the system,
as in Figure 5-6, each 16-bit physical word begins at an
address that is a multiple of 2. Note the address is
decoded, to assert BS16# only during bus cycles involving
the 16-bit array. (If desiring to use pipelined address with
16-bit memories then BE0#-BE3# and W/R# are also decoded to
determine when BS16# should be asserted. See 5.4.3.6
Pipelined Address with Dynamic Data Bus Sizing.)


Figure 5-5. 386™ DX with 32-Bit Memory


Figure 5-6. 386™ DX with 32-Bit and 16-Bit Memory


A2-A31 are directly usable for addressing 32-bit
and 16-bit devices. To address 16-bit devices, A1
and two byte enable signals are also needed.


To generate an A1 signal and two Byte Enable signals for
16-bit access, BE0#-BE3# should be decoded as in Table 5-7.
Note certain combinations of BE0#-BE3# are never generated
by the 386 DX, leading to “don’t care” conditions in the
decoder. Any BE0#-BE3# decoder, such as Figure 5-7, may use
the non-occurring BE0#-BE3# combinations to its best
advantage.
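A behavioural sketch of such a decoder is shown below. It takes the Byte Enables as booleans (BE0# to BE3#, True = asserted) and returns the voltage levels of A1, BHE# and BLE# as "H" or "L", or None for patterns the 386 DX never generates. The function name and return format are illustrative assumptions; a hardware decoder is free to treat the non-occurring patterns as don't-cares instead.

```python
def decode_for_16bit(be_asserted):
    """Derive A1, BHE# and BLE# levels from BE0#-BE3# for a 16-bit device."""
    be0, be1, be2, be3 = be_asserted
    pattern = "".join("1" if b else "0" for b in be_asserted)
    # Reject "no active bytes" and non-contiguous patterns.
    if "1" not in pattern or "0" in pattern.strip("0"):
        return None
    if be0 or be1:
        a1_high, ble, bhe = False, be0, be1   # even word (bytes 0 and 1)
    else:
        a1_high, ble, bhe = True, be2, be3    # odd word (bytes 2 and 3)
    return {
        "A1": "H" if a1_high else "L",
        "BHE#": "L" if bhe else "H",   # asserted (low) when D8-D15 active
        "BLE#": "L" if ble else "H",   # asserted (low) when D0-D7 active
    }


if __name__ == "__main__":
    print(decode_for_16bit((False, True, True, False)))
```

When an operand spans both halves, the outputs describe the first 16-bit cycle; the second cycle is decoded from the Byte Enables the processor drives at that time.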


5.3.6 Operand Alignment 


With the flexibility of memory addressing on the 386 DX, it
is possible to transfer a logical operand that spans more
than one physical Dword or word of memory or I/O. Examples
are 32-bit Dword operands beginning at addresses not evenly
divisible by 4, or a 16-bit word operand split between two
physical Dwords of the memory array.


Operand alignment and data bus size dictate when multiple
bus cycles are required. Table 5-8 describes the transfer
cycles generated for all combinations of logical operand
lengths, alignment, and data bus sizing. When multiple bus
cycles are required to transfer a multi-byte logical
operand, the highest-order bytes are transferred first (but
if BS16# asserted requires two 16-bit cycles be performed,
that part of the transfer is low-order first).
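For the 32-bit bus case (BS16# negated throughout), the splitting of a misaligned operand can be sketched as follows. The function name is hypothetical; each cycle is reported as the Dword address placed on A2-A31 together with the Byte Enables asserted (BE0# to BE3#, True = asserted), with the highest-order bytes issued first as stated above.

```python
def cycles_on_32bit_bus(address, length):
    """Bus cycles for a 1-, 2- or 4-byte operand on a 32-bit data bus."""
    if length not in (1, 2, 4):
        raise ValueError("operand length must be 1, 2 or 4 bytes")
    offset = address & 3
    base = address - offset
    in_first_dword = min(length, 4 - offset)
    low_part = (base,
                tuple(offset <= i < offset + in_first_dword for i in range(4)))
    if in_first_dword == length:
        return [low_part]          # operand fits within one Dword
    spill = length - in_first_dword
    high_part = (base + 4, tuple(i < spill for i in range(4)))
    return [high_part, low_part]   # highest-order bytes are transferred first


if __name__ == "__main__":
    for cycle in cycles_on_32bit_bus(0x1001, 4):
        print(hex(cycle[0]), cycle[1])
```

An aligned operand always yields a single cycle, which is the case that makes the two-clock peak bandwidth attainable.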


5.4 BUS FUNCTIONAL DESCRIPTION 


5.4.1 Introduction


The 386 DX has separate, parallel buses for data and
address. The data bus is 32 bits in width, and
bidirectional. The address bus provides a 32-bit value using
30 signals for the 30 upper-order address bits and 4 Byte
Enable signals to directly indicate the active bytes. These
buses are interpreted and controlled via several associated
definition or control signals.


Table 5-7. Generating A1, BHE# and BLE# for Addressing 16-Bit Devices

    386™ DX Signals           16-Bit Bus Signals
BE3#  BE2#  BE1#  BE0#      A1    BHE#   BLE# (A0)    Comments

H*    H*    H*    H*        x     x      x            x: no active bytes
H     H     H     L         L     H      L
H     H     L     H         L     L      H
H     H     L     L         L     L      L
H     L     H     H         H     H      L
H*    L*    H*    L*        x     x      x            x: not contiguous bytes
H     L     L     H         L     L      H
H     L     L     L         L     L      L
L     H     H     H         H     L      H
L*    H*    H*    L*        x     x      x            x: not contiguous bytes
L*    H*    L*    H*        x     x      x            x: not contiguous bytes
L*    H*    L*    L*        x     x      x            x: not contiguous bytes
L     L     H     H         H     L      L
L*    L*    H*    L*        x     x      x            x: not contiguous bytes
L     L     L     H         L     L      H
L     L     L     L         L     L      L

BLE# asserted when D0-D7 of 16-bit bus is active.
BHE# asserted when D8-D15 of 16-bit bus is active.
A1 low for all even words; A1 high for all odd words.

Key:
x = don’t care
H = high voltage level
L = low voltage level
* = a non-occurring pattern of Byte Enables; either none are asserted,
    or the pattern has Byte Enables asserted for non-contiguous bytes


Figure 5-7. Logic to Generate A1, BHE# and BLE# for 16-Bit Buses
(K-map for 16-bit BLE# signal is the same as the A0 signal in Figure 5-3)


Table 5-8. Transfer Bus Cycles for Bytes, Words and Dwords

Byte-Length of Logical Operand
Physical Byte Address in Memory (low-order bits)
Transfer Cycles over 16-Bit Data Bus

Key:
b = byte transfer            3 = 3-byte transfer
w = word transfer            d = Dword transfer
l = low-order portion        h = high-order portion
m = mid-order portion
x = don’t care
= BS16# asserted causes second bus cycle
*For this case, 8086, 8088, 80186, 80188, 80286 transfer lb first, then hb.


The definition of each bus cycle is given by three
definition signals: M/IO#, W/R# and D/C#. At the same time,
a valid address is present on the byte enable signals
BE0#-BE3# and other address signals A2-A31. A status signal,
ADS#, indicates when the 386 DX issues a new bus cycle
definition and address.


Collectively, the address bus, data bus and all associated
control signals are referred to simply as “the bus”.


When active, the bus performs one of the bus cycles below:

1) read from memory space
2) locked read from memory space
3) write to memory space
4) locked write to memory space
5) read from I/O space (or coprocessor)
6) write to I/O space (or coprocessor)
7) interrupt acknowledge
8) indicate halt, or indicate shutdown


Table 5-2 shows the encoding of the bus cycle definition
signals for each bus cycle. See section 5.2.5 Bus Cycle
Definition.
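The encoding referred to above can be modelled as a lookup on the three definition signals, qualified by ADS# asserted. This is a sketch under stated assumptions: signal levels are booleans with True = high voltage, the function name is hypothetical, and halt is distinguished from shutdown by the Byte Enables (BE2# asserted for halt at address 2, BE0# asserted for shutdown at address 0).

```python
BUS_CYCLE_TYPES = {
    # (M/IO#, D/C#, W/R#): bus cycle type
    (False, False, False): "INTERRUPT ACKNOWLEDGE",
    (False, False, True): "does not occur",
    (False, True, False): "I/O DATA READ",
    (False, True, True): "I/O DATA WRITE",
    (True, False, False): "MEMORY CODE READ",
    (True, False, True): "HALT or SHUTDOWN",
    (True, True, False): "MEMORY DATA READ",
    (True, True, True): "MEMORY DATA WRITE",
}


def bus_cycle_type(m_io, d_c, w_r, be_asserted=None):
    """Decode the bus cycle type; be_asserted is (BE0#..BE3#), True = asserted."""
    kind = BUS_CYCLE_TYPES[(bool(m_io), bool(d_c), bool(w_r))]
    if kind == "HALT or SHUTDOWN" and be_asserted is not None:
        if be_asserted[2]:
            return "HALT"
        if be_asserted[0]:
            return "SHUTDOWN"
    return kind


if __name__ == "__main__":
    print(bus_cycle_type(True, True, False))
```

Locked and unlocked variants are not separate encodings; they are distinguished by the LOCK# output during the cycle.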


The data bus has a dynamic sizing feature supporting 32- and
16-bit bus size. Data bus size is indicated to the 386 DX
using its Bus Size 16 (BS16#) input. All bus functions can
be performed with either data bus size.


When the 386 DX bus is not performing one of the
activities listed above, it is either Idle or in the Hold
Acknowledge state, which may be detected by ex-
ternal circuitry. The idle state can be identified by the
386 DX giving no further assertions on its address
strobe output (ADS#) since the beginning of its
most recent bus cycle, and the most recent bus cy-
cle having been terminated. The hold acknowledge
state is identified by the 386 DX asserting its hold
acknowledge (HLDA) output.


The shortest time unit of bus activity is a bus state. A
bus state is one processor clock period (two CLK2
periods) in duration. A complete data transfer occurs
during a bus cycle, composed of two or more bus
states.


[Timing diagram: three consecutive non-pipelined read cycles (Cycles 1, 2 and 3), each consisting of states T1 and T2. Signals shown: CLK2 (input); BE0#-BE3#, A2-A31, M/IO#, D/C#, W/R# (outputs); ADS# (output); NA# (input); READY# (input); LOCK# (output); D0-D31 (input during read).]

Fastest non-pipelined bus cycles consist of T1 and T2.

Figure 5-8. Fastest Read Cycles with Non-Pipelined Address Timing
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The fastest 386 DX bus cycle requires only two bus 
states. For example, three consecutive bus read cy- 
cles, each consisting of two bus states, are shown 
by Figure 5-8. The bus states in each cycle are 
named T1 and T2. Any memory or I/O address may
be accessed by such a two-state bus cycle, if the 
external hardware is fast enough. The high-band- 
width, two-clock bus cycle realizes the full potential 
of fast main memory, or cache memory. 


Every bus cycle continues until it is acknowledged
by the external system hardware, using the 386 DX
READY# input. Acknowledging the bus cycle at the
end of the first T2 results in the shortest bus cycle,
requiring only T1 and T2. If READY# is not immedi-
ately asserted, however, T2 states are repeated in-
definitely until the READY# input is sampled assert-
ed.
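
The acknowledgement rule above can be sketched as a small model. This is only an illustration, not part of the device specification: the function names `run_cycle` and `clk2_periods` are hypothetical, and READY# is represented as a list of booleans, one per T2 state, where True means "sampled asserted".

```python
def run_cycle(ready_samples):
    """Return the bus states of one non-pipelined cycle.

    ready_samples holds one boolean per T2 state: True if READY# is
    sampled asserted at the end of that T2 (an assumed representation).
    """
    states = ["T1"]                 # every non-pipelined cycle starts with T1
    for ready_asserted in ready_samples:
        states.append("T2")         # first T2, or a repeated T2 (wait state)
        if ready_asserted:
            return states           # cycle acknowledged, so it terminates
    raise ValueError("READY# never sampled asserted; cycle not terminated")


def clk2_periods(states):
    """Each bus state lasts one processor clock, i.e. two CLK2 periods."""
    return 2 * len(states)


fastest = run_cycle([True])                  # acknowledged at end of first T2
one_wait = run_cycle([False, True])          # one wait state inserted
print(fastest, clk2_periods(fastest))
print(one_wait, clk2_periods(one_wait))
```

The model makes the wait-state mechanism explicit: every negated READY# sample costs exactly one extra T2 state, or two CLK2 periods.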


5.4.2 Address Pipelining 


The address pipelining option provides a choice of 
bus cycle timings. Pipelined or non-pipelined ad- 
dress timing is selectable on a cycle-by-cycle basis 
with the Next Address (NA#) input.


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When address pipelining is not selected, the current 
address and bus cycle definition remain stable 
throughout the bus cycle. 


When address pipelining is selected, the address
(BE0#-BE3#, A2-A31) and definition (W/R#,
D/C# and M/IO#) of the next cycle are available
before the end of the current cycle. To signal their
availability, the 386 DX address status output
(ADS#) is also asserted. Figure 5-9 illustrates the
fastest read cycles with pipelined address timing.


Note from Figure 5-9 the fastest bus cycles using 
pipelined address require only two bus states, 
named T1P and T2P. Therefore cycles with pipe- 
lined address timing allow the same data bandwidth 
as non-pipelined cycles, but address-to-data access 
time is increased compared to that of a non-pipe- 
lined cycle. 


By increasing the address-to-data access time, pipe- 
lined address timing reduces wait state require- 
ments. For example, if one wait state is required with 
non-pipelined address timing, no wait states would 
be required with pipelined address. 
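
The wait-state trade-off can be illustrated with a rough calculation. This is a simplified sketch under stated assumptions, not a timing analysis: it ignores output valid delays and data setup times, it assumes the pipelined address is available one full bus state earlier than the non-pipelined address, and the function name and the nanosecond figures are hypothetical.

```python
def wait_states_needed(access_time_ns, clock_period_ns, pipelined):
    """Estimate wait states for a device with the given access time.

    Simplifying assumptions (not from the data sheet timing tables):
    - a non-pipelined cycle offers 2 bus states (T1, T2) of access time;
    - a pipelined cycle offers 3, because the address is driven one
      bus state before the cycle begins;
    - valid delays and setup times are ignored.
    """
    if access_time_ns <= 0 or clock_period_ns <= 0:
        raise ValueError("times must be positive")
    # Ceiling division: whole bus states needed to cover the access time.
    states_needed = -(-access_time_ns // clock_period_ns)
    states_without_waits = 3 if pipelined else 2
    return max(0, states_needed - states_without_waits)


# Hypothetical example: 50 ns processor clock, 120 ns device.
print(wait_states_needed(120, 50, pipelined=False))
print(wait_states_needed(120, 50, pipelined=True))
```

With these illustrative numbers the device needs one wait state without pipelining and none with it, which is the case described in the paragraph above.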


[Timing diagram: consecutive pipelined read cycles (Cycles 1, 2 and 3), each consisting of states T1P and T2P. Signals shown: CLK2 (input); BE0#-BE3#, A2-A31, M/IO#, D/C#, W/R# (outputs); ADS# (output); NA# (input); READY# (input); LOCK# (output); D0-D31 (input during read).]

Fastest pipelined bus cycles consist of T1P and T2P.

Figure 5-9. Fastest Read Cycles with Pipelined Address Timing
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Pipelined address timing is useful in typical systems 
having address latches. In those systems, once an 
address has been latched, pipelined availability of 
the next address allows decoding circuitry to gener- 
ate chip selects (and other necessary select signals) 
in advance, so selected devices are accessed im- 
mediately when the next cycle begins. In other 
words, the decode time for the next cycle can be 
overlapped with the end of the current cycle. 


If a system contains a memory structure of two or
more interleaved memory banks, pipelined address
timing potentially allows even more overlap of activi-
ty. This is true when the interleaved memory control-
ler is designed to allow the next memory operation


[Block diagrams: the 386 DX CPU connected through an interleave controller and a 32-bit data bus to two or four memory banks.]

TWO-BANK INTERLEAVED MEMORY
a) Address signal A2 selects bank
b) 32-bit datapath to each bank

FOUR-BANK INTERLEAVED MEMORY
a) Address signals A3 and A2 select bank
b) 32-bit datapath to each bank

Figure 5-10. 2-Bank and 4-Bank Interleaved Memory Structure
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to begin in one memory bank while the current bus 
cycle is still activating another memory bank. Figure 
5-10 shows the general structure of the 386 DX with 
2-bank and 4-bank interleaved memory. Note each 
memory bank of the interleaved memory has full 
data bus width (32-bit data width typically, unless 16- 
bit bus size is selected). 
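
The bank selection described in Figure 5-10 can be sketched as follows. This is a minimal illustration assuming full 32-bit banks; the function name `select_bank` is hypothetical and the controller logic of a real system would also handle timing.

```python
def select_bank(physical_address, num_banks):
    """Return the interleaved bank index for a physical byte address.

    Two banks:  address signal A2 selects the bank.
    Four banks: address signals A3 and A2 select the bank.
    """
    if num_banks not in (2, 4):
        raise ValueError("this sketch covers only 2-bank and 4-bank memory")
    # Drop A1/A0 (byte within the Dword), then keep A2 or A3:A2.
    return (physical_address >> 2) & (num_banks - 1)


# Consecutive Dwords fall in different banks, so the next access can
# start in one bank while the current cycle is still using another.
for address in (0x00, 0x04, 0x08, 0x0C, 0x10):
    print(hex(address), select_bank(address, 2), select_bank(address, 4))
```

Because sequential Dword addresses alternate between banks, a code prefetch stream is the kind of access pattern that benefits most from this arrangement.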


Further details of pipelined address timing are given 
in 5.4.3.4 Pipelined Address, 5.4.3.5 Initiating and 
Maintaining Pipelined Address, 5.4.3.6 Pipelined 
Address with Dynamic Bus Sizing, and 5.4.3.7 
Maximum Pipelined Address Usage with 16-Bit
Bus Size.
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5.4.3 Read and Write Cycles 


5.4.3.1 INTRODUCTION 


Data transfers occur as a result of bus cycles, classi- 
fied as read or write cycles. During read cycles, data 
is transferred from an external device to the proces- 
sor. During write cycles data is transferred in the oth- 
er direction, from the processor to an external de- 
vice. 


Two choices of address timing are dynamically se-
lectable: non-pipelined, or pipelined. After a bus idle
state, the processor always uses non-pipelined ad-
dress timing. However, the NA# (Next Address) in-
put may be asserted to select pipelined address
timing for the next bus cycle. When pipelining is se-
lected and the 386 DX has a bus request pending
internally, the address and definition of the next cy-
cle are made available even before the current bus
cycle is acknowledged by READY#. Generally, the
NA# input is sampled each bus cycle to select the
desired address timing for the next bus cycle.


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Two choices of physical data bus width are dynami- 
cally selectable: 32 bits, or 16 bits. Generally, the 
BS16# (Bus Size 16) input is sampled near the end 
of the bus cycle to confirm the physical data bus size 
applicable to the current cycle. Negation of BS16# 
indicates a 32-bit size, and assertion indicates a 16- 
bit bus size. 


If 16-bit bus size is indicated, the 386 DX automati- 
cally responds as required to complete the transfer 
on a 16-bit data bus. Depending on the size and 
alignment of the operand, another 16-bit bus cycle 
may be required. Table 5-7 provides all details. 
When necessary, the 386 DX performs an additional
16-bit bus cycle, using D0-D15 in place of D16-
D31.


Terminating a read cycle or write cycle, like any bus
cycle, requires acknowledging the cycle by asserting
the READY# input. Until acknowledged, the proces-
sor inserts wait states into the bus cycle, to allow
adjustment for the speed of any external device. Ex-
ternal hardware, which has decoded the address
and bus cycle type, asserts the READY# input at the
appropriate time.


[Timing diagram: non-pipelined Cycle 1 (write), Cycle 2 (read), Cycle 3 (write) and Cycle 4 (read), each with a 32-bit bus size; idle states (Ti) follow Cycles 3 and 4. Signals shown include CLK2, BE0#-BE3#, A2-A31, M/IO#, D/C#, READY#, LOCK# and D0-D31.]
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Idle states are shown here for diagram variety only. Write cycles are not always followed by an idle state. An active bus cycle can immediately 


follow the write cycle.


Figure 5-11. Various Bus Cycles and Idle States with Non-Pipelined Address (zero wait states) 
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At the end of the second bus state within the bus 
cycle, READY# is sampled. At that time, if external
hardware acknowledges the bus cycle by asserting
READY#, the bus cycle terminates as shown in Fig-
ure 5-11. If READY# is negated as in Figure 5-12,
the cycle continues another bus state (a wait state)
and READY# is sampled again at the end of that
state. This continues indefinitely until the cycle is ac-
knowledged by READY# asserted.


When the current cycle is acknowledged, the 386 
DX terminates it. When a read cycle is acknowl- 
edged, the 386 DX latches the information present 
at its data pins. When a write cycle is acknowledged, 
the 386 DX write data remains valid throughout 
phase one of the next bus state, to provide write 
data hold time. 


5.4.3.2 NON-PIPELINED ADDRESS 


Any bus cycle may be performed with non-pipelined
address timing. For example, Figure 5-11 shows a
mixture of read and write cycles with non-pipelined
address timing. Figure 5-11 shows the fastest possi- 


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ble cycles with non-pipelined address have two bus 
states per bus cycle. The states are named T1 and 
T2. In phase one of T1, the address signals and
bus cycle definition signals are driven valid, and to
signal their availability, address status (ADS#) is
simultaneously asserted.


During read or write cycles, the data bus behaves as 
follows. If the cycle is a read, the 386 DX floats its 
data signals to allow driving by the external device 
being addressed. The 386 DX requires that all 
data bus pins be at a valid logic state (high or 
low) at the end of each read cycle, when 
READY# is asserted, even if all byte enables are
not asserted. The system MUST be designed to 
meet this requirement. If the cycle is a write, data 
signals are driven by the 386 DX beginning in phase 
two of T1 until phase one of the bus state following 
cycle acknowledgment. 


Figure 5-12 illustrates non-pipelined bus cycles with
one wait state added to cycles 2 and 3. READY# is sam-
pled negated at the end of the first T2 in cycles 2
and 3. Therefore cycles 2 and 3 have T2 repeated.
At the end of the second T2, READY# is sampled
asserted.


[Timing diagram: non-pipelined Cycle 1 (read), Cycle 2 (write) and Cycle 3 (read), with idle states (Ti) between cycles and T2 repeated in Cycles 2 and 3.]
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Idle states are shown here for diagram variety only. Write cycles are not always followed by an idle state. An active bus cycle can immediately _ | 


follow the write cycle.


Figure 5-12. Various Bus Cycles and Idle States with Non-Pipelined Address
(various number of wait states)
[State diagram: the four bus states T1, T2, Ti and Th, with transitions labeled by RESET asserted, HOLD asserted or negated, READY# asserted or negated, NA# negated, and whether a bus request is pending.]

Bus States:

T1 - first clock of a non-pipelined bus cycle (386™ DX drives new address and asserts ADS#)
T2 - subsequent clocks of a bus cycle when NA# has not been sampled asserted in the current bus cycle
Ti - idle state
Th - hold acknowledge state (386™ DX asserts HLDA)

The fastest bus cycle consists of two states: T1 and T2.

Four basic bus states describe bus operation when not using pipelined address. These states do include BS16# usage for 32-bit and 16-bit
bus size. If asserting BS16# requires a second 16-bit bus cycle to be performed, it is performed before HOLD asserted is acknowledged.


Figure 5-13. 386™ DX Bus States (not using pipelined address)


When address pipelining is not used, the address 
and bus cycle definition remain valid during all wait 
states. When wait states are added and you desire 
to maintain non-pipelined address timing, it is neces- 
sary to negate NA# during each T2 state except the 
last one, as shown in Figure 5-12 cycles 2 and 3. If 
NA# is sampled asserted during a T2 other than the
last one, the next state would be T2I (for pipelined 
address) or T2P (for pipelined address) instead of 
another T2 (for non-pipelined address). 


When address pipelining is not used, the bus states 
and transitions are completely illustrated by Figure 
5-13. The bus transitions between four possible 
states: T1, T2, Ti, and Th. Bus cycles consist of T1 
and T2, with T2 being repeated for wait states. Oth- 
erwise, the bus may be idle, in the Ti state, or in hold 
acknowledge, the Th state. 


When address pipelining is not used, the bus state
diagram is as shown in Figure 5-13. When the bus is
idle it is in state Ti. Bus cycles always begin with T1.
T1 always leads to T2. If a bus cycle is not acknowl-
edged during T2 and NA# is negated, T2 is repeat-
ed. When a cycle is acknowledged during T2, the
following state will be T1 of the next bus cycle if a
bus request is pending internally, or Ti if there is no
bus request pending, or Th if the HOLD input is be-
ing asserted.
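
The transitions just described can be written as a small state function. This is a sketch of the non-pipelined case only, assuming NA# is always negated and RESET is not asserted; the function name `next_state` and its boolean parameters are hypothetical names chosen for illustration.

```python
def next_state(state, *, ready, hold, request_pending):
    """Next bus state when address pipelining is not used.

    ready           -- READY# sampled asserted in this state
    hold            -- HOLD input asserted
    request_pending -- a bus request is pending internally
    Assumes NA# is negated throughout (non-pipelined operation).
    """
    if state not in ("T1", "T2", "Ti", "Th"):
        raise ValueError(f"unknown bus state: {state}")
    if state == "T1":
        return "T2"                  # T1 always leads to T2
    if state == "T2" and not ready:
        return "T2"                  # not acknowledged: repeat T2 (wait state)
    # From Ti, from Th, or after an acknowledged T2:
    if hold:
        return "Th"                  # HOLD takes priority over a pending request
    return "T1" if request_pending else "Ti"


# Walk through: idle, one cycle with a single wait state, then idle again.
state = "Ti"
trace = [state]
for ready, pending in [(False, True), (False, False), (False, False), (True, False)]:
    state = next_state(state, ready=ready, hold=False, request_pending=pending)
    trace.append(state)
print(trace)
```

Keeping the HOLD check ahead of the pending-request check reflects the text above, where Th is entered whenever HOLD is asserted at the end of a cycle.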


The bus state diagram in Figure 5-13 also applies to 
the use of BS16#. If the 386 DX makes internal ad- 
justments for 16-bit bus size, the adjustments do not 
affect the external bus states. If an additional 16-bit 
bus cycle is required to complete a transfer on a 
16-bit bus, it also follows the state transitions shown 
in Figure 5-13. 


Use of pipelined address allows the 386 DX to enter 
three additional bus states not shown in Figure 5-13. 
Figure 5-20 in 5.4.3.4 Pipelined Address is the 
complete bus state diagram, including pipelined ad- 
dress cycles. 
5.4.3.3 NON-PIPELINED ADDRESS WITH
DYNAMIC DATA BUS SIZING


The physical data bus width for any non-pipelined
bus cycle can be either 32 bits or 16 bits. At the
beginning of the bus cycle, the processor behaves
as if the data bus is 32 bits wide. When the bus cy-
cle is acknowledged, by asserting READY# at the
end of a T2 state, the most recent sampling of
BS16# determines the data bus size for the cycle
being acknowledged. If BS16# was most recently
negated, the physical data bus size is defined as
32 bits. If BS16# was most recently asserted, the
size is defined as 16 bits.


When BS16# is asserted and two 16-bit bus cycles
are required to complete the transfer, BS16# must
be asserted during the second cycle; 16-bit bus size
is not assumed. Like any bus cycle, the second 16-
bit cycle must be acknowledged by asserting
READY#.


[Timing diagrams: a transfer requiring two cycles on a 16-bit data bus, followed by an idle cycle.]

Figure 5-14. Asserting BS16# (zero wait states, non-pipelined address)

Figure 5-15. Asserting BS16# (one wait state, non-pipelined address)


When a second 16-bit bus cycle is required to com-
plete the transfer over a 16-bit bus, the addresses
generated for the two 16-bit bus cycles are closely
related to each other. The addresses are the same
except BE0# and BE1# are always negated for the
second cycle. This is because data on D0-D15 was
already transferred during the first 16-bit cycle.
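
The relationship between the two 16-bit cycles can be sketched as follows. This is an illustrative model, not the processor's actual logic: byte enables are represented as booleans (True means asserted, i.e. driven low), the function name `second_cycle_byte_enables` is hypothetical, and the rule that a second cycle is needed only when the operand uses both 16-bit halves of the bus is taken from the description of BS16# in this section.

```python
def second_cycle_byte_enables(byte_enables, bs16_asserted):
    """Return the byte enables of the second 16-bit cycle, or None.

    byte_enables is (BE0#, BE1#, BE2#, BE3#) as booleans, True = asserted.
    A second cycle is modeled as necessary only when BS16# is asserted
    and the operand uses both the lower half (BE0#/BE1#) and the upper
    half (BE2#/BE3#) of the 32-bit bus.
    """
    be0, be1, be2, be3 = byte_enables
    uses_lower_half = be0 or be1
    uses_upper_half = be2 or be3
    if bs16_asserted and uses_lower_half and uses_upper_half:
        # Same address, but BE0# and BE1# are negated: the data on
        # D0-D15 was already transferred in the first cycle.
        return (False, False, be2, be3)
    return None


print(second_cycle_byte_enables((True, True, True, True), True))    # aligned Dword
print(second_cycle_byte_enables((True, True, False, False), True))  # low word only
```

An aligned Dword therefore needs a second cycle carrying only BE2# and BE3#, while a word in the lower half completes in a single cycle.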


Figures 5-14 and 5-15 show cases where assertion
of BS16# requires a second 16-bit cycle for com-
plete operand transfer. Figure 5-14 illustrates cycles
without wait states. Figure 5-15 illustrates cycles
with one wait state. In Figure 5-15 cycle 1, the bus
cycle during which BS16# is asserted, note that
NA# must be negated in the T2 state(s) prior to the
last T2 state. This is to allow the recognition of
BS16# asserted in the final T2 state. Also note that
during this state BS16# must be stable (defined by
t17 and t18, BS16# setup and hold timings), in order
to prevent potential data corruption during split cycle
reads. The logic state of BS16# during this time is
not important. The relation of NA# and BS16# is
given fully in 5.4.3.4 Pipelined Address, but Figure
5-15 illustrates these precautions you need to know
when using BS16# with non-pipelined address.
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5.4.3.4 PIPELINED ADDRESS 


Address pipelining is the option of requesting the
address and the bus cycle definition of the next, in-
ternally pending bus cycle before the current bus
cycle is acknowledged with READY# asserted.
ADS# is asserted by the 386 DX when the next ad-
dress is issued. The address pipelining option is con-
trolled on a cycle-by-cycle basis with the NA# input
signal.


Once a bus cycle is in progress and the current ad-
dress has been valid for at least one entire bus
state, the NA# input is sampled at the end of every
phase one until the bus cycle is acknowledged. Dur-
ing non-pipelined bus cycles, therefore, NA# is
sampled at the end of phase one in every T2. An
example is Cycle 2 in Figure 5-16, during which NA#
is sampled at the end of phase one of every T2 (it
was asserted once during the first T2 and has no
further effect during that bus cycle).
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If NA# is sampled asserted, the 386 DX is free to 
drive the address and bus cycle definition of the next
bus cycle, and assert ADS#, as soon as it has a bus
request internally pending. It may drive the next ad-
dress as early as the next bus state, whether the
current bus cycle is acknowledged at that time or
not.


Regarding the details of address pipelining, the 386 
DX has the following characteristics: 


1) For NA# to be sampled asserted, BS16# must
be negated at that sampling window (see Figure
5-16 Cycles 2 through 4, and Figure 5-17 Cycles 1
through 4). If NA# and BS16# are both sampled
asserted during the last T2 period of a bus cycle,
BS16# asserted has priority. Therefore, if both
are asserted, the current bus size is taken to be
16 bits and the next address is not pipelined.
Following any idle bus state (Ti), addresses are non-pipelined. Within non-pipelined bus cycles, NA# is only sampled during wait states.
Therefore, to begin address pipelining during a group of non-pipelined bus cycles requires a non-pipelined cycle with at least one wait state
(Cycle 2 above).


Figure 5-16. Transitioning to Pipelined Address During Burst of Bus Cycles 
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Following any idle bus state (Ti) the address is always non-pipelined and NA# is only sampled during wait states. To start address pipelining 
after an idle state requires a non-pipelined cycle with at least one wait state (cycle 1 above). 
The pipelined cycles (2, 3, 4 above) are shown with various numbers of wait states. 


Figure 5-17. Fastest Transition to Pipelined Address Following Idle Bus State 


2) The next address may appear as early as the bus
state after NA# was sampled asserted (see Fig-
ures 5-16 or 5-17). In that case, state T2P is en-
tered immediately. However, when there is not an
internal bus request already pending, the next ad-
dress will not be available immediately after NA#
is asserted and T2I is entered instead of T2P (see
Figure 5-19 Cycle 3). Provided the current bus cy-
cle isn't yet acknowledged by READY# asserted,
T2P will be entered as soon as the 386 DX does
drive the next address. External hardware should
therefore observe the ADS# output as confirma-
tion the next address is actually being driven on
the bus.


3) Once NA# is sampled asserted, the 386 DX com-
mits itself to the highest priority bus request that
is pending internally. It can no longer perform an-
other 16-bit transfer to the same address should
BS16# be asserted externally, so thereafter it
must assume the current bus size is 32 bits.
Therefore if NA# is sampled asserted within a
bus cycle, BS16# must be negated thereafter in
that bus cycle (see Figures 5-16, 5-17, 5-19).
Consequently, do not assert NA# during bus cy-
cles which must have BS16# driven asserted.
See 5.4.3.6 Dynamic Bus Sizing with Pipelined
Address.


4) Any address which is validated by a pulse on the
386 DX ADS# output will remain stable on the
address pins for at least two processor clock peri-
ods. The 386 DX cannot produce a new address
more frequently than every two processor clock
periods (see Figures 5-16, 5-17, 5-19).


5) Only the address and bus cycle definition of the 
very next bus cycle is available. The pipelining ca- 
pability cannot look further than one bus cycle 
ahead (see Figure 5-19 Cycle 1). 
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The complete bus state transition diagram, including 
operation with pipelined address, is given by Figure 5-20.
Note it is a superset of the diagram for non-pipelined 
address only, and the three additional bus states for 
pipelined address are drawn in bold. 


The fastest bus cycle with pipelined address con- 
sists of just two bus states, T1P and T2P (recall for 
non-pipelined address it is T1 and T2). T1P is the 
first bus state of a pipelined cycle. 


5.4.3.5 INITIATING AND MAINTAINING 
PIPELINED ADDRESS 


Using the state diagram Figure 5-20, observe the 
transitions from an idle state, Ti, to the beginning of 
a pipelined bus cycle, T1P. From an idle state Ti, the 
first bus cycle must begin with T1, and is therefore a 
non-pipelined bus cycle. The next bus cycle will be 
pipelined, however, provided NA# is asserted and 
the first bus cycle ends in a T2P state (the address 
for the next bus cycle is driven during T2P). The fast- 
est path from an idle state to a bus cycle with pipe- 
lined address is shown in bold below: 


Ti, Ti, Ti,    T1-T2-T2P,       T1P-T2P,

idle           non-pipelined    pipelined
states         cycle            cycle


T1-T2-T2P are the states of the bus cycle that es- 
tablishes address pipelining for the next bus cycle, 
which begins with T1P. The same is true after a bus 
hold state, shown below: 


Th, Th, Th,    T1-T2-T2P,       T1P-T2P,

hold           non-pipelined    pipelined
acknowledge    cycle            cycle
states


The transition to pipelined address is shown func- 
tionally by Figure 5-17 Cycle 1. Note that Cycle 1 is 
used to transition into pipelined address timing for 
the subsequent Cycles 2, 3 and 4, which are pipe- 
lined. The NA# input is asserted at the appropriate 
time to select address pipelining for Cycles 2, 3 
and 4.




Once a bus cycle is in progress and the current ad-
dress has become valid, the NA# input is sampled
at the end of every phase one, beginning with the
next bus state, until the bus cycle is acknowledged.
During Figure 5-17 Cycle 1 therefore, sampling be-
gins in T2. Once NA# is sampled asserted during
the current cycle, the 386 DX is free to drive a new
address and bus cycle definition on the bus as early
as the next bus state. In Figure 5-17 Cycle 1 for
example, the next address is driven during state
T2P. Thus Cycle 1 makes the transition to pipelined
address timing, since it begins with T1 but ends with
T2P. Because the address for Cycle 2 is available
before Cycle 2 begins, Cycle 2 is called a pipelined
bus cycle, and it begins with T1P. Cycle 2 begins as
soon as READY# asserted terminates Cycle 1.


Example transition bus cycles are Figure 5-17 Cycle
1 and Figure 5-16 Cycle 2. Figure 5-17 shows tran-
sition during the very first cycle after an idle bus
state, which is the fastest possible transition into ad-
dress pipelining. Figure 5-16 Cycle 2 shows a tran-
sition cycle occurring during a burst of bus cycles. In
any case, a transition cycle is the same whenever it
occurs: it consists at least of T1, T2 (you assert
NA# at that time), and T2P (provided the 386 DX
has an internal bus request already pending, which it
almost always has). T2P states are repeated if wait
states are added to the cycle.


Note three states (T1, T2 and T2P) are only required 
in a bus cycle performing a transition from non- 
pipelined address into pipelined address timing, for 
example Figure 5-17 Cycle 1. Figure 5-17 Cycles 2, 
3 and 4 show that address pipelining can be main- 
tained with two-state bus cycles consisting only of 
T1P and T2P. 


Once a pipelined bus cycle is in progress, pipelined
timing is maintained for the next cycle by asserting
NA# and detecting that the 386 DX enters T2P dur-
ing the current bus cycle. The current bus cycle must
end in state T2P for pipelining to be maintained in
the next cycle. T2P is identified by the assertion of
ADS#. Figures 5-16 and 5-17 however, each show
pipelining ending after Cycle 4 because Cycle 4
ends in T2I. This indicates the 386 DX didn't have an
internal bus request prior to the acknowledgement
of Cycle 4. If a cycle ends with a T2 or T2I, the next
cycle will not be pipelined.
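
The rule for maintaining pipelining can be expressed as a short consistency check. This is a sketch under the assumption that the cycles follow each other with no idle or hold acknowledge states in between; the function names and the list-of-states representation are hypothetical.

```python
def first_state_of_next_cycle(last_state):
    """First state of the next back-to-back cycle, given how this one ended."""
    if last_state == "T2P":
        return "T1P"        # next address already driven: pipelined cycle
    if last_state in ("T2", "T2I"):
        return "T1"         # no address driven ahead: non-pipelined cycle
    raise ValueError(f"a bus cycle cannot end in state {last_state}")


def burst_is_consistent(cycles):
    """Check that each cycle starts as its predecessor's last state requires.

    cycles is a list of bus cycles, each a list of state names, assumed
    to run back to back (no Ti or Th states between them).
    """
    for previous, current in zip(cycles, cycles[1:]):
        if current[0] != first_state_of_next_cycle(previous[-1]):
            return False
    return True


burst = [
    ["T1", "T2", "T2P"],   # transition cycle: NA# asserted during T2
    ["T1P", "T2P"],        # pipelining maintained
    ["T1P", "T2I"],        # no internal request pending: pipelining ends
    ["T1", "T2"],          # next cycle is non-pipelined again
]
print(burst_is_consistent(burst))
```

The check captures the key point of the paragraph above: only a cycle that ends in T2P can be followed by a cycle that begins with T1P.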
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Figure 5-19. Details of Address Pipelining During Cycles with Wait States 
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Bus States: 


T1 - first clock of a non-pipelined bus cycle (386™ DX drives new address
and asserts ADS#).
T2 - subsequent clocks of a bus cycle when NA# has not been sampled
asserted in the current bus cycle.
T2I - subsequent clocks of a bus cycle when NA# has been sampled as-
serted in the current bus cycle but there is not yet an internal bus request
pending (386 DX will not drive new address or assert ADS#).
T2P - subsequent clocks of a bus cycle when NA# has been sampled
asserted in the current bus cycle and there is an internal bus request pend-
ing (386 DX drives new address and asserts ADS#).
T1P - first clock of a pipelined bus cycle.
Ti - idle state.
Th - hold acknowledge state (386 DX asserts HLDA).


Asserting NA# for pipelined address gives access to three more bus
states: T2I, T2P and T1P.
Using pipelined address, the fastest bus cycle consists of T1P and
T2P.
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Figure 5-20. 386™ DX Complete Bus States (including pipelined address) 


Realistically, address pipelining is almost always 
maintained as long as NA# is sampled asserted. 
This is so because in the absence of any other re- 
quest, a code prefetch request is always internally 
pending until the instruction decoder and code pre- 
fetch queue are completely full. Therefore address 
pipelining is maintained for long bursts of bus cycles, 
if the bus is available (i.e., HOLD negated) and NA# 
is sampled asserted in each of the bus cycles. 


5.4.3.6 PIPELINED ADDRESS WITH DYNAMIC 
DATA BUS SIZING 


The BS16# feature allows easy interface to 16-bit
data buses. When asserted, the 386 DX bus
interface hardware performs appropriate action to
make the transfer using a 16-bit data bus connected
on D0-D15.


There is a degree of interaction, however, between 
the use of Address Pipelining and the use of Bus 
Size 16. The interaction results from the multiple bus 
cycles required when transferring 32-bit operands 
over a 16-bit bus. If the operand requires both 16-bit 
halves of the 32-bit bus, the appropriate 386 DX ac- 
tion is a second bus cycle to complete the operand’s 
transfer. It is this necessity that conflicts with NA# 
usage. 


When NA# is sampled asserted, the 386 DX
commits itself to perform the next inter-
nally pending bus request, and is allowed to drive
the next internally pending address onto the bus. As-
serting NA# therefore makes it impossible for the
next bus cycle to again access the current address
on A2-A31, such as may be required when BS16#
is asserted by the external hardware.


To avoid conflict, the 386 DX is designed with the follow-
ing two provisions:


1) To avoid conflict, BS16# must be negated in the
current bus cycle if NA# has already been
sampled asserted in the current cycle. If NA# is
sampled asserted, the current data bus size is as-
sumed to be 32 bits.


2) To also avoid conflict, if NA# and BS16# are
both asserted during the same sampling window,
BS16# asserted has priority and the 386 DX acts
as if NA# was negated at that time. Internal 386
DX circuitry, shown conceptually in Figure 5-18,
assures that BS16# is sampled asserted and
NA# is sampled negated if both inputs are exter-
nally asserted at the same sampling window.
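
The priority rule in provision 2 can be modeled in a few lines. This is a conceptual sketch only, not a description of the internal circuitry of Figure 5-18; the function names are hypothetical, and True stands for "asserted" on these active-low inputs.

```python
def sample_inputs(na_asserted, bs16_asserted):
    """Return (NA# as seen internally, BS16# as seen internally).

    If both inputs are asserted in the same sampling window, BS16#
    has priority and NA# is treated as negated.
    """
    effective_na = na_asserted and not bs16_asserted
    return effective_na, bs16_asserted


def bus_size_bits(bs16_sampled_asserted):
    """Physical data bus size selected by the sampled BS16# value."""
    return 16 if bs16_sampled_asserted else 32


for na, bs16 in [(False, False), (True, False), (False, True), (True, True)]:
    effective_na, effective_bs16 = sample_inputs(na, bs16)
    print(na, bs16, "->", effective_na, bus_size_bits(effective_bs16))
```

The last combination is the interesting one: with both inputs asserted, the cycle is treated as a 16-bit cycle and the next address is not pipelined.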
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Cycle 1 is pipelined. Cycle 1a cannot be pipelined, but its address can be inferred from that of Cycle 1, to externally simulate address pipelining 


during Cycle 1a. 


Figure 5-21. Using NA# and BS16# 
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Certain types of 16-bit or 8-bit operands require no 


adjustment for correct transfer on a 16-bit bus.
Those are read or write operands using only the low-
er half of the data bus, and write operands using
only the upper half of the bus since the 386 DX
simultaneously duplicates the write data on the low-
er half of the data bus. For these patterns of Byte
Enables and the W/R# signal, BS16# need not be
asserted at the 386 DX, allowing NA# to be asserted
during the bus cycle if desired.


5.4.4 Interrupt Acknowledge (INTA)
Cycles


In response to an interrupt request on the INTR in- 
put when interrupts are enabled, the 386 DX per- 
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forms two interrupt acknowledge cycles. These bus 
cycles are similar to read cycles in that bus definition 
signals define the type of bus activity taking place, 


and each cycle continues until acknowledged by
READY# sampled asserted.


The state of A2 distinguishes the first and second
interrupt acknowledge cycles. The byte address
driven during the first interrupt acknowledge cycle is
4 (A31-A3 low, A2 high, BE3#-BE1# high, and
BE0# low). The address driven during the second
interrupt acknowledge cycle is 0 (A31-A2 low,
BE3#-BE1# high, BE0# low).
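
The byte addresses quoted above follow from combining A31-A2 with the lowest asserted byte enable. The sketch below shows that arithmetic; the function name `byte_address` and the boolean representation of the byte enables (True means asserted, i.e. driven low) are assumptions made for illustration.

```python
def byte_address(a31_a2, byte_enables):
    """Compute the byte address implied by A31-A2 and BE0#-BE3#.

    a31_a2       -- value on address pins A31..A2, as an integer
    byte_enables -- (BE0#, BE1#, BE2#, BE3#), True = asserted (low)
    The lowest asserted byte enable supplies the two low address bits.
    """
    for index, asserted in enumerate(byte_enables):
        if asserted:
            return (a31_a2 << 2) | index
    raise ValueError("no byte enable asserted")


only_be0 = (True, False, False, False)
first_inta = byte_address(0b1, only_be0)    # A2 high, A31-A3 low
second_inta = byte_address(0b0, only_be0)   # A31-A2 all low
print(first_inta, second_inta)
```

External logic that needs to tell the two interrupt acknowledge cycles apart therefore only has to examine A2.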


[Timing diagram: Interrupt Acknowledge Cycle 1, idle (4 bus states), and Interrupt Acknowledge Cycle 2. Data on D0-D31 is ignored during the first cycle.]

Interrupt Vector (0-255) is read on D0-D7 at end of second Interrupt Acknowledge bus cycle.

Because each Interrupt Acknowledge bus cycle is followed by idle bus states, asserting NA# has no practical effect. Choose the approach
which is simplest for your system hardware design.


Figure 5-22. Interrupt Acknowledge Cycles
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Figure 5-23. Halt indication Cycle 


The LOCK# output is asserted from the beginning 
of the first interrupt acknowledge cycle until the end 
of the second interrupt acknowledge cycle. Four idle 
bus states, Ti, are inserted by the 386 DX between 
the two interrupt acknowledge cycles, allowing for 
compatibility with spec TRHRL of the 8259A Interrupt
Controller.


During both interrupt acknowledge cycles, D0-D31
float. No data is read at the end of the first interrupt
acknowledge cycle. At the end of the second interrupt
acknowledge cycle, the 386 DX will read an external
interrupt vector from D0-D7 of the data bus.
The vector indicates the specific interrupt number
(from 0-255) requiring service.


5.4.5 Halt Indication Cycle 


The 386 DX halts as a result of executing a HALT 
instruction. Signaling its entrance into the halt state, 
a halt indication cycle is performed. The halt indica- 
tion cycle is identified by the state of the bus defini- 
tion signals shown in 5.2.5 Bus Cycle Definition 
and a byte address of 2. BE0# and BE2# are the
only signals distinguishing halt indication from shutdown
indication, which drives an address of 0. During
the halt cycle undefined data is driven on
D0-D31. The halt indication cycle must be acknowledged
by READY# asserted.


A halted 386 DX resumes execution when INTR (if 
interrupts are enabled) or NMI or RESET is assert- 
ed. 
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5.4.6 Shutdown Indication Cycle 


The 386 DX shuts down as a result of a protection
fault while attempting to process a double fault. Signaling
its entrance into the shutdown state, a shutdown
indication cycle is performed. The shutdown
indication cycle is identified by the state of the bus
definition signals shown in 5.2.5 Bus Cycle Definition
and a byte address of 0. BE0# and BE2#
are the only signals distinguishing shutdown indication
from halt indication, which drives an address of
2. During the shutdown cycle undefined data is driven
on D0-D31. The shutdown indication cycle must
be acknowledged by READY# asserted.


A shutdown 386 DX resumes execution when NMI
or RESET is asserted.
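
A minimal sketch of how external logic might separate the two indication cycles, assuming the bus definition signals have already identified a halt/shutdown cycle. The pin levels used here are an assumption inferred from the byte addresses given above (address 2 selects byte lane 2, address 0 selects byte lane 0); the function name is hypothetical.

```python
def classify_indication(be0_n, be2_n):
    """Classify a halt/shutdown indication cycle from BE0# and BE2#.

    Arguments are pin levels: 0 = asserted (low), 1 = negated (high).
    Assumed mapping, inferred from the byte addresses in the text:
      byte address 2 -> BE2# asserted, BE0# negated -> halt
      byte address 0 -> BE0# asserted, BE2# negated -> shutdown
    """
    if be2_n == 0 and be0_n == 1:
        return "halt"
    if be0_n == 0 and be2_n == 1:
        return "shutdown"
    return "unknown"
```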
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Figure 5-24. Shutdown Indication Cycle 
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5.5 OTHER FUNCTIONAL 
DESCRIPTIONS 


5.5.1 Entering and Exiting Hold 
Acknowledge 


The bus hold acknowledge state, Th, is entered in 
response to the HOLD input being asserted. In the 
bus hold acknowledge state, the 386 DX floats all 
output or bidirectional signals, except for HLDA. 
HLDA is asserted as long as the 386 DX remains in 
the bus hold acknowledge state. In the bus hold ac- 
knowledge state, all inputs except HOLD, RESET, 
BUSY #, ERROR#, and PEREQ are ignored (also 
up to one rising edge on NMI is remembered for 
processing when HOLD is no longer asserted). 
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NOTE: 

For maximum design flexibility the 386™ DX has no in- 
ternal pullup resistors on its outputs. Your design may 
require an external pullup on ADS# and other 386 DX 
outputs to keep them negated during float periods. 


Figure 5-25. Requesting Hold from Idle Bus 


Th may be entered from a bus idle state as in Figure
5-25 or after the acknowledgement of the current
physical bus cycle if the LOCK# signal is not asserted,
as in Figures 5-26 and 5-27. If HOLD is asserted
during a locked bus cycle, the 386 DX may execute
one unlocked bus cycle before acknowledging
HOLD. If asserting BS16# requires a second 16-bit
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bus cycle to complete a physical operand transfer, it 
is performed before HOLD is acknowledged, al- 
though the bus state diagrams in Figures 5-13 and 
5-20 do not indicate that detail. 


Th is exited in response to the HOLD input being 
negated. The following state will be Ti as in Figure 
5-25 if no bus request is pending. The following bus 
state will be T1 if a bus request is internally pending, 
as in Figures 5-26 and 5-27. 


Th is also exited in response to RESET being assert- 
ed. 


If a rising edge occurs on the edge-triggered NMI
input while in Th, the event is remembered as a non-maskable
interrupt 2 and is serviced when Th is exited,
unless of course, the 386 DX is reset before Th
is exited.


5.5.2 Reset During Hold Acknowledge 


RESET being asserted takes priority over HOLD being
asserted. Therefore, Th is exited in response to
the RESET input being asserted. If RESET is asserted
while HOLD remains asserted, the 386 DX drives
its pins to defined states during reset, as in Table
5-3 Pin State During Reset, and performs internal
reset activity as usual.


If HOLD remains asserted when RESET is negated,
the 386 DX enters the hold acknowledge state before
performing its first bus cycle, provided HOLD is
still asserted when the 386 DX would otherwise perform
its first bus cycle. If HOLD remains asserted
when RESET is negated, the BUSY# input is still
sampled as usual to determine whether a self-test is
being requested, and ERROR# is still sampled as
usual to determine whether a 387 DX coprocessor
vs. an 80287 (or none) is present.


5.5.3 Bus Activity During and 
Following Reset 


RESET is the highest priority input signal, capable of 
interrupting any processor activity when it is assert- 
ed. A bus cycle in progress can be aborted at any 
stage, or idle states or bus hold acknowledge states 
discontinued so that the reset state is established.


RESET should remain asserted for at least 15 CLK2 
periods to ensure it is recognized throughout the 386 
DX, and at least 80 CLK2 periods if 386 DX self-test 
is going to be requested at the falling edge. RESET 
asserted pulses less than 15 CLK2 periods may not 
be recognized. RESET pulses less than 80 CLK2 
periods followed by a self-test may cause the self-
test to report a failure when no true failure exists.

HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup and hold (t23 and t24) requirements are met. This waveform is useful for determining Hold Acknowledge latency.


Figure 5-26. Requesting Hold from Active Bus (NA# negated)


The additional RESET pulse width is required to 
clear additional state prior to a valid self-test. 


Provided the RESET falling edge meets setup and
hold times t25 and t26, the internal processor clock
phase is defined at that time, as illustrated by Figure
5-28 and Figure 7-7.


A 386 DX self-test may be requested at the time
RESET is negated by having the BUSY# input at a
LOW level, as shown in Figure 5-28. The self-test
requires (2^20) + approximately 60 CLK2 periods to
complete. The self-test duration is not affected by
the test results. Even if the self-test indicates a problem,
the 386 DX attempts to proceed with the reset
sequence afterwards.
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After the RESET falling edge (and after the self-test 
if it was requested) the 386 DX performs an internal 
initialization sequence for approximately 350 to 450 
CLK2 periods.
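
The reset figures quoted in this section can be turned into a quick budget calculation. The sketch below is illustrative only: the function names are hypothetical, the self-test and initialization counts are the approximate values stated above, and the CLK2 frequency is an input supplied by the designer.

```python
SELF_TEST_CLK2 = 2**20 + 60      # approximate self-test length in CLK2 periods
INIT_CLK2_RANGE = (350, 450)     # approximate internal initialization length


def min_reset_pulse_clk2(self_test):
    """Minimum RESET pulse width in CLK2 periods (80 if self-test is requested)."""
    return 80 if self_test else 15


def clk2_after_reset(self_test):
    """Approximate (low, high) CLK2 periods from the RESET falling edge
    to the end of internal initialization."""
    extra = SELF_TEST_CLK2 if self_test else 0
    return (INIT_CLK2_RANGE[0] + extra, INIT_CLK2_RANGE[1] + extra)


def clk2_periods_to_ms(periods, clk2_hz):
    """Convert a CLK2 period count to milliseconds for a given CLK2 frequency."""
    return periods * 1000.0 / clk2_hz
```

With a 40 MHz CLK2, for instance, clk2_periods_to_ms(SELF_TEST_CLK2, 40e6) gives roughly 26 ms for the self-test alone.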


The 386 DX samples its ERROR# input some time
after the falling edge of RESET and before executing
the first ESC instruction. During this sampling period
BUSY# must be HIGH. If ERROR# was sampled
active, the 386 DX employs the 32-bit protocol
of the 387 DX. Even though this protocol was selected,
it is still necessary to use a software recognition
test to determine the presence or identity of the coprocessor
and to assure compatibility with future
processors. (See Chapter 11 of the 386™ DX Programmer’s
Reference Manual, Order #230985-002).

HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup and hold (t23 and t24) requirements are met. This waveform is useful for determining Hold Acknowledge latency.


Figure 5-27. Requesting Hold from Active Bus (NA# asserted)


5.6 SELF-TEST SIGNATURE


Upon completion of self-test, (if self-test was re- 
quested by holding BUSY# LOW at least eight 
CLK2 periods before and after the falling edge of 
RESET), the EAX register will contain a signature of 
00000000h indicating the 386 DX passed its self- 
test of microcode and major PLA contents with no 
problems detected. The passing signature in EAX, 
00000000h, applies to all 386 DX revision levels. 
Any non-zero signature indicates the 386 DX unit is 
faulty. 


5.7 COMPONENT AND REVISION 
IDENTIFIERS 


To assist 386 DX users, the 386 DX after reset holds
a component identifier and a revision identifier
in its DX register. The upper 8 bits of DX hold 03h as
identification of the 386 DX component. The lower 8
bits of DX hold an 8-bit unsigned binary number related
to the component revision level. The revision
identifier begins chronologically with a value zero
and is subject to change (typically it will be incremented)
with component steppings intended to have
certain improvements or distinctions from previous
steppings.


These features are intended to assist 386 DX users 
to a practical extent. However, the revision identifier 
value is not guaranteed to change with every step- 
ping revision, or to follow a completely uniform nu- 
merical sequence, depending on the type or inten- 
tion of revision, or manufacturing materials required 
to be changed. Intel has sole discretion over these
characteristics of the component.

[Timing diagram: RESET, internal initialization, and the first bus cycle.]

NOTES:
1. BUSY# should be held stable for 8 CLK2 periods before and after the CLK2 period in which RESET falling edge occurs.
2. If self-test is requested, the 386™ DX outputs remain in their reset state as shown here and in Table 5-3.


Figure 5-28. Bus Activity from Reset Until First Code Fetch


Table 5-10. Component and Revision Identifier History

386™ DX Stepping Name    Component Identifier    Revision Identifier
B0                       03
B1                       03
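
Reset-time software could unpack these values along the following lines. This is a sketch under stated assumptions: it takes the EAX and DX values as plain integers (how they are captured immediately after reset is outside its scope), and the function names are hypothetical.

```python
COMPONENT_ID_386DX = 0x03   # upper 8 bits of DX after reset, per section 5.7


def decode_reset_dx(dx):
    """Split the 16-bit DX reset value into (component_id, revision_id)."""
    component_id = (dx >> 8) & 0xFF
    revision_id = dx & 0xFF
    return component_id, revision_id


def is_386dx(dx):
    """True if the component identifier in DX matches the 386 DX."""
    return decode_reset_dx(dx)[0] == COMPONENT_ID_386DX


def self_test_passed(eax):
    """True if EAX holds the passing self-test signature 00000000h."""
    return (eax & 0xFFFFFFFF) == 0
```

Because the revision identifier is not guaranteed to change with every stepping, code should treat it as a hint rather than compare it against an exhaustive list.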
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5.8 COPROCESSOR INTERFACING 


The 386 DX provides an automatic interface for the
Intel 387 DX numeric floating-point coprocessor.
The 387 DX coprocessor uses an I/O-mapped interface
driven automatically by the 386 DX and assisted
by three dedicated signals: BUSY#, ERROR#,
and PEREQ.


As the 386 DX begins supporting a coprocessor instruction,
it tests the BUSY# and ERROR# signals
to determine if the coprocessor can accept its next
instruction. Thus, the BUSY# and ERROR# inputs
eliminate the need for any “preamble” bus cycles
for communication between processor and coprocessor.
The 387 DX can be given its command opcode
immediately. The dedicated signals provide instruction
synchronization, and eliminate the need of
using the 386 DX WAIT opcode (9Bh) for 387 DX
coprocessor instruction synchronization (the WAIT
opcode was required when 8086 or 8088 was used
with the 8087 coprocessor).


Custom coprocessors can be included in 386 DX-
based systems, via memory-mapped or I/O-mapped
interfaces. Such coprocessor interfaces allow a
completely custom protocol, and are not limited to a
set of coprocessor protocol “primitives”. Instead,
memory-mapped or I/O-mapped interfaces may use
all applicable 386 DX instructions for high-speed coprocessor
communication. The BUSY# and
ERROR# inputs of the 386 DX may also be used for
the custom coprocessor interface, if such hardware
assist is desired. These signals can be tested by the
386 DX WAIT opcode (9Bh). The WAIT instruction
will wait until the BUSY# input is negated (interruptable
by an NMI or enabled INTR input), but generates
an exception 16 fault if the ERROR# pin is in
the asserted state when the BUSY# goes (or is)
negated. If the custom coprocessor interface is
memory-mapped, protection of the addresses used
for the interface can be provided with the 386 DX
on-chip paging or segmentation mechanisms. If the
custom interface is I/O-mapped, protection of the
interface can be provided with the 386 DX IOPL (I/O
Privilege Level) mechanism.


The 387 DX numeric coprocessor interface is I/O
mapped as shown in Table 5-11. Note that the
387 DX coprocessor interface addresses are beyond
the 0h-FFFFh range for programmed I/O.
When the 386 DX supports the 387 DX coprocessor,
the 386 DX automatically generates bus cycles to
the coprocessor interface addresses.


Table 5-11. Numeric Coprocessor
Port Addresses

Address in 386™ DX I/O Space    387™ DX Coprocessor Register
800000F8h                       Opcode Register (32-bit port)
800000FCh                       Operand Register (32-bit port)

To correctly map the 387 DX coprocessor registers
to the appropriate I/O addresses, connect the
387 DX coprocessor CMD0# pin directly to the A2
output of the 386 DX.
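
The two port addresses in Table 5-11 differ only in address bit A2, which is why wiring A2 to CMD0# is sufficient to select the register. A small sketch of that decode follows; the function name and the string return values are assumptions for illustration.

```python
OPCODE_PORT = 0x800000F8    # 387 DX opcode register (Table 5-11)
OPERAND_PORT = 0x800000FC   # 387 DX operand register (Table 5-11)


def coprocessor_register(io_address):
    """Return which 387 DX register an I/O address selects, based on A2.

    A2 = 0 selects the opcode register, A2 = 1 the operand register.
    """
    if io_address not in (OPCODE_PORT, OPERAND_PORT):
        raise ValueError("not a 387 DX coprocessor port address")
    a2 = (io_address >> 2) & 1
    return "operand" if a2 else "opcode"


def is_beyond_programmed_io(io_address):
    """True if the address lies outside the 0h-FFFFh programmed I/O range."""
    return io_address > 0xFFFF
```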


5.8.1 Software Testing for 
Coprocessor Presence 


When software is used to test for coprocessor 
(387 DX) presence, it should use only the following 
coprocessor opcodes: FINIT, FNINIT, FSTCW mem, 
FSTSW mem, FSTSW AX. To use other coproces- 
sor opcodes when a coprocessor is known to be not 
present, first set EM = 1 in 386 DX CR0.
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6 INSTRUCTION SET 


This section describes the 386 DX instruction set. A
table lists all instructions along with instruction en- 
coding diagrams and clock counts. Further details of 
the instruction encoding are then provided in the fol- 
lowing sections, which completely describe the en- 
coding structure and the definition of all fields occur- 
ring within 386 DX instructions. 


6.1 386™ DX INSTRUCTION
ENCODING AND CLOCK COUNT
SUMMARY


To calculate elapsed time for an instruction, multiply
the instruction clock count, as listed in Table 6-1
below, by the processor clock period (e.g. 50 ns for
a 20 MHz 386 DX, 40 ns for a 25 MHz 386 DX, and
30 ns for a 33 MHz 386 DX).
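
The calculation above amounts to one multiplication. A minimal sketch, with hypothetical function names:

```python
def clock_period_ns(processor_mhz):
    """Processor clock period in nanoseconds for a given clock rate in MHz."""
    return 1000.0 / processor_mhz


def elapsed_ns(clock_count, processor_mhz):
    """Elapsed time for an instruction: clock count times clock period."""
    return clock_count * clock_period_ns(processor_mhz)
```

For example, an instruction listed at 2 clocks takes elapsed_ns(2, 20), i.e. 100 ns, on a 20 MHz part. The 33 MHz period is about 30.3 ns, which the text rounds to 30 ns.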


For more detailed information on the encodings of
instructions refer to section 6.2 Instruction Encodings.
Section 6.2 explains the general structure of
instruction encodings, and defines exactly the encodings
of all fields contained within the instruction.


Instruction Clock Count Assumptions

1. The instruction has been prefetched, decoded,
and is ready for execution.

2. Bus cycles do not require wait states.

3. There are no local bus HOLD requests delaying
processor access to the bus.

4. No exceptions are detected during instruction execution.

5. If an effective address is calculated, it does not
use two general register components. One register,
scaling and displacement can be used within
the clock counts shown. However, if the effective
address calculation uses two general register
components, add 1 clock to the clock count
shown.


Instruction Clock Count Notation 


1. If two clock counts are given, the smaller refers to
a register operand and the larger refers to a memory
operand.

2. n = number of times repeated.

3. m = number of components in the next instruction
executed, where the entire displacement (if
any) counts as one component, the entire immediate
data (if any) counts as one component, and
each of the other bytes of the instruction and
prefix(es) each count as one component.
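
To make the definition of m concrete, the sketch below counts components for a next instruction and applies it to the "7+m or 3" entries that Table 6-1 gives for conditional jumps (taken or not taken). The function names and the way the instruction is described (a byte count plus two flags) are assumptions for illustration.

```python
def component_count(other_bytes, has_displacement, has_immediate):
    """Compute m for an instruction.

    other_bytes: number of prefix, opcode, and other non-displacement,
                 non-immediate bytes (each counts as one component).
    The whole displacement and the whole immediate count as one each.
    """
    return other_bytes + int(has_displacement) + int(has_immediate)


def conditional_jump_clocks(taken, m):
    """Clock count for a conditional jump listed as '7+m or 3'."""
    return 7 + m if taken else 3
```

For a target instruction with two opcode/ModRM bytes and a displacement, m is 3, so a taken jump to it costs 10 clocks under the stated assumptions.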


Wait States 


Add 1 clock per wait state to instruction execution
for each data access.
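
The wait-state rule and assumption 5 above can be combined into one adjustment of a table entry. This is a sketch only: the function name and parameters are hypothetical, and it does not model HOLD delays, prefetch stalls, or exceptions, which the listed assumptions also exclude.

```python
def adjusted_clocks(base_clocks, wait_states=0, data_accesses=0,
                    two_register_ea=False):
    """Adjust a Table 6-1 clock count.

    Adds 1 clock per wait state for each data access, plus 1 clock if the
    effective address uses two general register components.
    """
    clocks = base_clocks + wait_states * data_accesses
    if two_register_ea:
        clocks += 1
    return clocks
```

For example, a memory-to-register MOV listed at 4 clocks, with one data access and two wait states, comes to 6 clocks, or 7 if the address uses both a base and an index register.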
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Table 6-1. 386™ DX Instruction Set Clock Count Summary 


                                                                                   CLOCK COUNT                    NOTES
INSTRUCTION                              FORMAT                                    Real Address   Protected      Real Address   Protected
                                                                                   Mode or        Virtual        Mode or        Virtual
                                                                                   Virtual 8086   Address        Virtual 8086   Address
                                                                                   Mode           Mode           Mode           Mode
GENERAL DATA TRANSFER
MOV = Move:
Register to Register/Memory              1000100w                                  2/2            2/2            b              h
Register/Memory to Register              1000101w                                  2/4            2/4            b              h
Immediate to Register/Memory             1100011w | mod 000 r/m | immediate data   2/2            2/2            b              h
Immediate to Register (short form)       1011 w reg | immediate data               2              2
Memory to Accumulator (short form)       1010000w | full displacement              4              4              b              h
Accumulator to Memory (short form)       1010001w | full displacement              2              2              b              h
Register/Memory to Segment Register      10001110 | mod sreg3 r/m                  2/5            18/19          b              h, i, j
Segment Register to Register/Memory      10001100 | mod sreg3 r/m                  2/2            2/2            b              h


MOVSX = Move With Sign Extension
Register From Register/Memory            00001111
MOVZX = Move With Zero Extension
Register From Register/Memory            00001111
PUSH = Push:
Register/Memory                          mod 110 r/m                               5              5              b              h
Register (short form)                                                              2              2              b              h
Segment Register (ES, CS, SS or DS)                                                                                             h
Segment Register (FS or GS)
Immediate                                immediate data                            2              2              b              h
PUSHA = Push All                         01100000                                  18             18             b              h
POP = Pop
Register/Memory                          10001111 | mod 000 r/m                    5              5              b              h
Register (short form)                                                              4              4              b              h
Segment Register (ES, SS or DS)          000 sreg2 111                             7              21             b              h, i, j
Segment Register (FS or GS)              00001111 | 10 sreg3 001                   7                                            h, i, j
POPA = Pop All                           01100001                                  24             24             b              h


XCHG = Exchange
Register/Memory With Register            1000011w | mod reg r/m                    3/5            3/5            b, f           f, h
Register With Accumulator (short form)   10010 reg                                 3              3
IN = Input from:
Fixed Port                               1110010w                                  12             6*/26**                       m
Variable Port                            1110110w                                  13             7*/27**                       m
OUT = Output to:
Fixed Port                               1110011w                                  10             4*/24**                       m
Variable Port                            1110111w                                  11             5*/25**                       m
LEA = Load EA to Register                10001101 | mod reg r/m                    2
* If CPL ≤ IOPL    ** If CPL > IOPL
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) | | 
SEGMENT CONTROL
LDS = Load Pointer to DS                 11000101 | mod reg r/m
LES = Load Pointer to ES                 11000100 | mod reg r/m
LFS = Load Pointer to FS                 00001111 | 10110100
LGS = Load Pointer to GS                 00001111 | 10110101 | mod reg r/m
LSS = Load Pointer to SS                 00001111 | 10110010


FLAG CONTROL
CLC = Clear Carry Flag                   11111000
CLD = Clear Direction Flag               11111100
CLI = Clear Interrupt Enable Flag        11111010
CLTS = Clear Task Switched Flag          00001111 | 00000110
CMC = Complement Carry Flag              11110101
LAHF = Load AH into Flag                 10011111
POPF = Pop Flags                         10011101
PUSHF = Push Flags                       10011100
SAHF = Store AH into Flags               10011110
STC = Set Carry Flag                     11111001
STD = Set Direction Flag                 11111101
STI = Set Interrupt Enable Flag          11111011


ARITHMETIC
ADD = Add
Register to Register                     000000dw | mod reg r/m
Register to Memory                       0000000w | mod reg r/m
Memory to Register                       0000001w | mod reg r/m
Immediate to Register/Memory             100000sw | mod 000 r/m | immediate data
Immediate to Accumulator (short form)    0000010w | immediate data
ADC = Add With Carry
Register to Register                     000100dw | mod reg r/m
Register to Memory                       0001000w | mod reg r/m
Memory to Register                       0001001w | mod reg r/m
Immediate to Register/Memory             100000sw | mod 010 r/m | immediate data
Immediate to Accumulator (short form)    0001010w | immediate data
INC = Increment
Register/Memory                          1111111w | mod 000 r/m
Register (short form)                    01000 reg
SUB = Subtract
Register from Register                   001010dw | mod reg r/m
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) 


                                                                                           CLOCK COUNT
INSTRUCTION                                  FORMAT                                        Real Address Mode or   Protected Virtual
                                                                                           Virtual 8086 Mode      Address Mode
ARITHMETIC (Continued)
Register from Memory                         0010100w | mod reg r/m
Memory from Register                         0010101w | mod reg r/m
Immediate from Register/Memory               100000sw | mod 101 r/m | immediate data
Immediate from Accumulator (short form)      0010110w | immediate data
SBB = Subtract with Borrow
Register from Register                       000110dw | mod reg r/m
Register from Memory                         0001100w | mod reg r/m
Memory from Register                                    mod reg r/m
Immediate from Register/Memory               100000sw | mod 011 r/m | immediate data
Immediate from Accumulator (short form)      0001110w | immediate data
DEC = Decrement
Register/Memory                              1111111w | reg 001 r/m
Register (short form)
CMP = Compare
Register with Register
Memory with Register                         0011100w | mod reg r/m
Register with Memory
Immediate with Register/Memory               100000sw | mod 111 r/m | immediate data
Immediate with Accumulator (short form)      0011110w | immediate data
NEG = Change Sign                            1111011w | mod 011 r/m
AAA = ASCII Adjust for Add
AAS = ASCII Adjust for Subtract
DAA = Decimal Adjust for Add
DAS = Decimal Adjust for Subtract            00101111
MUL = Multiply (unsigned)
Accumulator with Register/Memory             1111011w | mod 100 r/m
Multiplier-Byte                                                                            12-17/15-20            12-17/15-20
          -Word                                                                            12-25/15-28            12-25/15-28
          -Doubleword                                                                      12-41/15-44            12-41/15-44
IMUL = Integer Multiply (signed)
Accumulator with Register/Memory             1111011w | mod 101 r/m
Multiplier-Byte                                                                            12-17/15-20            12-17/15-20
          -Word                                                                            12-25/15-28            12-25/15-28
          -Doubleword                                                                      12-41/15-44            12-41/15-44
Register with Register/Memory                00001111
Multiplier-Byte                                                                            12-17/15-20            12-17/15-20
          -Word                                                                            12-25/15-28            12-25/15-28
          -Doubleword                                                                      12-41/15-44            12-41/15-44
Register/Memory with Immediate to Register   011010s1 | immediate data
          -Word                                                                            13-26/14-27            13-26/14-27
          -Doubleword                                                                      13-42/14-43            13-42/14-43
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) 




ARITHMETIC (Continued)
DIV = Divide (Unsigned)
Accumulator by Register/Memory           1111011w | mod 110 r/m
Divisor-Byte
       -Word
       -Doubleword
IDIV = Integer Divide (Signed)
Accumulator By Register/Memory           1111011w | mod 111 r/m
Divisor-Byte
       -Word
       -Doubleword
AAD = ASCII Adjust for Divide            11010101 | 00001010
AAM = ASCII Adjust for Multiply          11010100 | 00001010
CBW = Convert Byte to Word               10011000
CWD = Convert Word to Double Word        10011001


LOGIC
Shift Rotate Instructions
Not Through Carry (ROL, ROR, SAL, SAR, SHL, and SHR)
Register/Memory by 1                     1101000w | mod TTT r/m
Register/Memory by CL                    1101001w | mod TTT r/m
Register/Memory by Immediate Count       1100000w | mod TTT r/m | immed 8-bit data
Through Carry (RCL and RCR)
Register/Memory by 1                     1101000w | mod TTT r/m
Register/Memory by CL                    1101001w | mod TTT r/m
Register/Memory by Immediate Count       1100000w | mod TTT r/m | immed 8-bit data

TTT    Instruction
000    ROL
001    ROR
010    RCL
011    RCR
100    SHL/SAL
101    SHR
111    SAR

SHLD = Shift Left Double
Register/Memory by Immediate             00001111 | 10100100 | mod reg r/m | immed 8-bit data
Register/Memory by CL                    00001111 | 10100101 | mod reg r/m
SHRD = Shift Right Double
Register/Memory by Immediate             00001111 | 10101100 | mod reg r/m | immed 8-bit data
Register/Memory by CL                    00001111 | 10101101 | mod reg r/m
AND = And
Register to Register                     001000dw | mod reg r/m
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) 



LOGIC (Continued)
Register to Memory                       0010000w | mod reg r/m
Memory to Register                       0010001w | mod reg r/m
Immediate to Register/Memory             100000sw | mod 100 r/m | immediate data
Immediate to Accumulator (Short Form)    0010010w | immediate data
TEST = And Function to Flags, No Result
Register/Memory and Register             1000010w | mod reg r/m
Immediate Data and Register/Memory       1111011w | mod 000 r/m | immediate data
Immediate Data and Accumulator
(Short Form)                             1010100w | immediate data
OR = Or
Register to Register                     000010dw | mod reg r/m
Register to Memory                       0000100w | mod reg r/m
Memory to Register                       0000101w | mod reg r/m
Immediate to Register/Memory             100000sw | mod 001 r/m | immediate data
Immediate to Accumulator (Short Form)    0000110w | immediate data
XOR = Exclusive Or
Register to Register                     001100dw | mod reg r/m
Register to Memory                       0011000w | mod reg r/m
Memory to Register                       0011001w | mod reg r/m
Immediate to Register/Memory             100000sw | mod 110 r/m | immediate data
Immediate to Accumulator (Short Form)    0011010w | immediate data
NOT = Invert Register/Memory             1111011w | mod 010 r/m


STRING MANIPULATION
CMPS = Compare Byte Word                 1010011w
INS = Input Byte/Word from DX Port       0110110w               9*/29**
LODS = Load Byte/Word to AL/AX/EAX       1010110w               5
MOVS = Move Byte Word                    1010010w               8
OUTS = Output Byte/Word to DX Port       0110111w               8*/28**
SCAS = Scan Byte Word                    1010111w               8
STOS = Store Byte/Word from AL/AX/EAX    1010101w
XLAT = Translate String                  11010111

REPEATED STRING MANIPULATION
Repeated by Count in CX or ECX
REPE CMPS = Compare String
(Find Non-Match)                         11110011 | 1010011w

* If CPL ≤ IOPL    ** If CPL > IOPL




Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued)


REPEATED STRING MANIPULATION (Continued)
REPNE CMPS = Compare String
(Find Match)                             11110010 | 1010011w
REP INS = Input String                   11110010 | 0110110w    †28+6n
REP LODS = Load String                   11110010 | 1010110w    5+6n
REP MOVS = Move String                   11110010 | 1010010w    8+4n
REP OUTS = Output String                 11110010 | 0110111w    †26+5n    6+5n*/26+5n**
REPE SCAS = Scan String
(Find Non-AL/AX/EAX)                     11110011 | 1010111w
REPNE SCAS = Scan String
(Find AL/AX/EAX)                         11110010 | 1010111w
REP STOS = Store String                  11110010 | 1010101w


BIT MANIPULATION
BSF = Scan Bit Forward                   00001111 | 10111100 | mod reg r/m
BSR = Scan Bit Reverse                   00001111 | 10111101 | mod reg r/m
BT = Test Bit
Register/Memory, Immediate               00001111 | 10111010 | mod 100 r/m | immed 8-bit data
Register/Memory, Register                00001111 | 10100011 | mod reg r/m
BTC = Test Bit and Complement
Register/Memory, Immediate               00001111 | 10111010 | mod 111 r/m | immed 8-bit data
Register/Memory, Register                00001111 | 10111011 | mod reg r/m
BTR = Test Bit and Reset
Register/Memory, Immediate               00001111 | 10111010 | mod 110 r/m | immed 8-bit data
Register/Memory, Register                00001111 | 10110011 | mod reg r/m
BTS = Test Bit and Set
Register/Memory, Immediate               00001111 | 10111010 | mod 101 r/m | immed 8-bit data
Register/Memory, Register                00001111 | 10101011 | mod reg r/m


CONTROL TRANSFER
CALL = Call
Direct Within Segment                    11101000 | full displacement
Register/Memory Indirect Within Segment  11111111 | mod 010 r/m
Direct Intersegment                      10011010 | unsigned full offset, selector

NOTES:
† Clock count shown applies if I/O permission allows I/O to the port in virtual 8086 mode. If I/O bit map denies permission
exception 13 fault occurs; refer to clock counts for INT 3 instruction.

* If CPL ≤ IOPL    ** If CPL > IOPL
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) 




CONTROL TRANSFER (Continued)
Protected Mode Only (Direct Intersegment)
Via Call Gate to Same Privilege Level
Via Call Gate to Different Privilege Level,
(No Parameters)
Via Call Gate to Different Privilege Level,
(x Parameters)
From 80286 Task to 80286 TSS
From 80286 Task to 386™ DX TSS
From 80286 Task to Virtual 8086 Task (386™ DX TSS)
From 386™ DX Task to 80286 TSS
From 386™ DX Task to 386™ DX TSS
From 386™ DX Task to Virtual 8086 Task (386™ DX TSS)

Indirect Intersegment                    11111111 | mod 011 r/m

Protected Mode Only (Indirect Intersegment)
Via Call Gate to Same Privilege Level
Via Call Gate to Different Privilege Level,
(No Parameters)
Via Call Gate to Different Privilege Level,
(x Parameters)
From 80286 Task to 80286 TSS
From 80286 Task to 386™ DX TSS
From 80286 Task to Virtual 8086 Task (386™ DX TSS)
From 386™ DX Task to 80286 TSS
From 386™ DX Task to 386™ DX TSS
From 386™ DX Task to Virtual 8086 Task (386™ DX TSS)
JMP = Unconditional Jump
Short                                                  11101011 | 8-bit displacement
Direct within Segment                                  11101001 | full displacement
Register/Memory Indirect within Segment                11111111 | mod 100 r/m
Direct Intersegment                                    11101010 | unsigned full offset, selector    j, k, r

Protected Mode Only (Direct Intersegment)
Via Call Gate to Same Privilege Level                                                               h, j, k, r
From 80286 Task to 80286 TSS                                                                        h, j, k, r
From 80286 Task to 386™ DX TSS                                                                      h, j, k, r
From 80286 Task to Virtual 8086 Task (386™ DX TSS)                                                  h, j, k, r
From 386™ DX Task to 80286 TSS                                                                      h, j, k, r
From 386™ DX Task to 386™ DX TSS                                                                    h, j, k, r
From 386™ DX Task to Virtual 8086 Task (386™ DX TSS)                                                h, j, k, r

Indirect Intersegment                                  11111111 | mod 101 r/m                       h, j, k, r

Protected Mode Only (Indirect Intersegment)
Via Call Gate to Same Privilege Level                                                               h, j, k, r
From 80286 Task to 80286 TSS                                                                        h, j, k, r
From 80286 Task to 386™ DX TSS                                                                      h, j, k, r
From 80286 Task to Virtual 8086 Task (386™ DX TSS)                                                  h, j, k, r
From 386™ DX Task to 80286 TSS                                                                      h, j, k, r
From 386™ DX Task to 386™ DX TSS                                                                    h, j, k, r
From 386™ DX Task to Virtual 8086 Task (386™ DX TSS)                                                h, j, k, r

Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued)


CONTROL TRANSFER (Continued)
RET = Return from CALL:
Within Segment                           11000011
Within Segment Adding Immediate to SP
Intersegment                             11001011
Intersegment Adding Immediate to SP

Protected Mode Only (RET):
to Different Privilege Level
Intersegment
Intersegment Adding Immediate to SP

CONDITIONAL JUMPS
NOTE: Times Are Jump "Taken or Not Taken"

                                                                                  Real Address Mode or   Protected Virtual
                                                                                  Virtual 8086 Mode      Address Mode
JO = Jump on Overflow
8-Bit Displacement                       01110000 | 8-bit displ                   7+m or 3               7+m or 3
Full Displacement                        00001111 | 10000000 | full displacement  7+m or 3               7+m or 3
JNO = Jump on Not Overflow
8-Bit Displacement                       01110001 | 8-bit displ                   7+m or 3               7+m or 3
Full Displacement                        00001111 | 10000001 | full displacement  7+m or 3               7+m or 3
JB/JNAE = Jump on Below/Not Above or Equal
8-Bit Displacement                       01110010 | 8-bit displ                   7+m or 3               7+m or 3
Full Displacement                        00001111 | 10000010 | full displacement  7+m or 3               7+m or 3
JNB/JAE = Jump on Not Below/Above or Equal
8-Bit Displacement                       01110011 | 8-bit displ                   7+m or 3               7+m or 3
Full Displacement                        00001111 | 10000011 | full displacement  7+m or 3               7+m or 3
JE/JZ = Jump on Equal/Zero
8-Bit Displacement                       01110100 | 8-bit displ                   7+m or 3               7+m or 3
Full Displacement                        00001111 | 10000100 | full displacement  7+m or 3               7+m or 3
JNE/JNZ = Jump on Not Equal/Not Zero
8-Bit Displacement                       01110101 | 8-bit displ                   7+m or 3               7+m or 3
Full Displacement                        00001111 | 10000101 | full displacement  7+m or 3               7+m or 3
JBE/JNA = Jump on Below or Equal/Not Above
8-Bit Displacement                       01110110 | 8-bit displ                   7+m or 3               7+m or 3
Full Displacement                        00001111 | 10000110 | full displacement  7+m or 3               7+m or 3
JNBE/JA = Jump on Not Below or Equal/Above
8-Bit Displacement                       01110111 | 8-bit displ                   7+m or 3               7+m or 3
Full Displacement                        00001111 | 10000111 | full displacement  7+m or 3               7+m or 3
JS = Jump on Sign
8-Bit Displacement                       01111000 | 8-bit displ                   7+m or 3               7+m or 3
Full Displacement                        00001111 | 10001000 | full displacement  7+m or 3               7+m or 3
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) 


INSTRUCTION | FORMAT | CLOCK COUNT, Real Address Mode or Virtual 8086 Mode | CLOCK COUNT, Protected Virtual Address Mode | NOTES, Real Address Mode or Virtual 8086 Mode | NOTES, Protected Virtual Address Mode


CONDITIONAL JUMPS (Continued)
JNS = Jump on Not Sign
8-Bit Displacement | 01111001 8-bit displ | 7+m or 3 | 7+m or 3
Full Displacement | 00001111 10001001 full displacement | 7+m or 3 | 7+m or 3

JP/JPE = Jump on Parity/Parity Even
8-Bit Displacement | 01111010 8-bit displ | 7+m or 3 | 7+m or 3
Full Displacement | 00001111 10001010 full displacement | 7+m or 3 | 7+m or 3

JNP/JPO = Jump on Not Parity/Parity Odd
8-Bit Displacement | 01111011 8-bit displ | 7+m or 3 | 7+m or 3
Full Displacement | 00001111 10001011 full displacement | 7+m or 3 | 7+m or 3

JL/JNGE = Jump on Less/Not Greater or Equal
8-Bit Displacement | 01111100 8-bit displ | 7+m or 3 | 7+m or 3
Full Displacement | 00001111 10001100 full displacement | 7+m or 3 | 7+m or 3

JNL/JGE = Jump on Not Less/Greater or Equal
8-Bit Displacement | 01111101 8-bit displ | 7+m or 3 | 7+m or 3
Full Displacement | 00001111 10001101 full displacement | 7+m or 3 | 7+m or 3

JLE/JNG = Jump on Less or Equal/Not Greater
8-Bit Displacement | 01111110 8-bit displ | 7+m or 3 | 7+m or 3
Full Displacement | 00001111 10001110 full displacement | 7+m or 3 | 7+m or 3

JNLE/JG = Jump on Not Less or Equal/Greater
8-Bit Displacement | 01111111 8-bit displ | 7+m or 3 | 7+m or 3
Full Displacement | 00001111 10001111 full displacement | 7+m or 3 | 7+m or 3

JCXZ = Jump on CX Zero | 11100011 8-bit displ | 9+m or 5 | 9+m or 5
JECXZ = Jump on ECX Zero | 11100011 8-bit displ | 9+m or 5 | 9+m or 5
(Address Size Prefix Differentiates JCXZ from JECXZ)

LOOP = Loop CX Times | 11100010 8-bit displ
LOOPZ/LOOPE = Loop with Zero/Equal | 11100001 8-bit displ
LOOPNZ/LOOPNE = Loop While Not Zero | 11100000 8-bit displ


CONDITIONAL BYTE SET
NOTE: Times Are Register/Memory

SETO = Set Byte on Overflow
To Register/Memory | 00001111 10010000 mod000 r/m

SETNO = Set Byte on Not Overflow
To Register/Memory | 00001111 10010001 mod000 r/m

SETB/SETNAE = Set Byte on Below/Not Above or Equal
To Register/Memory | 00001111 10010010 mod000 r/m
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) 


INSTRUCTION | FORMAT | CLOCK COUNT, Real Address Mode or Virtual 8086 Mode | CLOCK COUNT, Protected Virtual Address Mode | NOTES, Real Address Mode or Virtual 8086 Mode | NOTES, Protected Virtual Address Mode


CONDITIONAL BYTE SET (Continued)
SETNB = Set Byte on Not Below/Above or Equal
To Register/Memory | 00001111 10010011 mod000 r/m

SETE/SETZ = Set Byte on Equal/Zero
To Register/Memory | 00001111 10010100 mod000 r/m

SETNE/SETNZ = Set Byte on Not Equal/Not Zero
To Register/Memory | 00001111 10010101 mod000 r/m

SETBE/SETNA = Set Byte on Below or Equal/Not Above
To Register/Memory | 00001111 10010110 mod000 r/m

SETNBE/SETA = Set Byte on Not Below or Equal/Above
To Register/Memory | 00001111 10010111 mod000 r/m

SETS = Set Byte on Sign
To Register/Memory | 00001111 10011000 mod000 r/m

SETNS = Set Byte on Not Sign
To Register/Memory | 00001111 10011001 mod000 r/m

SETP/SETPE = Set Byte on Parity/Parity Even
To Register/Memory | 00001111 10011010 mod000 r/m

SETNP/SETPO = Set Byte on Not Parity/Parity Odd
To Register/Memory | 00001111 10011011 mod000 r/m

SETL/SETNGE = Set Byte on Less/Not Greater or Equal
To Register/Memory | 00001111 10011100 mod000 r/m

SETNL/SETGE = Set Byte on Not Less/Greater or Equal
To Register/Memory | 00001111 10011101 mod000 r/m

SETLE/SETNG = Set Byte on Less or Equal/Not Greater
To Register/Memory | 00001111 10011110 mod000 r/m

SETNLE/SETG = Set Byte on Not Less or Equal/Greater
To Register/Memory | 00001111 10011111 mod000 r/m


ENTER = Enter Procedure | 11001000 16-bit displacement, 8-bit level
L > 1

LEAVE = Leave Procedure | 11001001
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) 


INSTRUCTION | FORMAT | CLOCK COUNT, Real Address Mode or Virtual 8086 Mode | CLOCK COUNT, Protected Virtual Address Mode | NOTES, Real Address Mode or Virtual 8086 Mode | NOTES, Protected Virtual Address Mode


INTERRUPT INSTRUCTIONS
INT = Interrupt:
Type Specified
Type 3

INTO = Interrupt 4 if Overflow Flag Set
If OF = 1
If OF = 0

BOUND = Interrupt 5 if Detect Value Out of Range | mod reg r/m
If Out of Range | | | | | e, g, h, j, k, r
If In Range | | | | | e, g, h, j, k, r


Protected Mode Only (INT)
INT: Type Specified
Via Interrupt or Trap Gate to Same Privilege Level | | | | | g, j, k, r
Via Interrupt or Trap Gate to Different Privilege Level | | | | | g, j, k, r
From 80286 Task to 80286 TSS via Task Gate | | | | | g, j, k, r
From 80286 Task to 386™ DX TSS via Task Gate | | | | | g, j, k, r
From 80286 Task to virt 8086 md via Task Gate | | | | | g, j, k, r
From 386™ DX Task to 80286 TSS via Task Gate | | | | | g, j, k, r
From 386™ DX Task to 386™ DX TSS via Task Gate | | | | | g, j, k, r
From 386™ DX Task to virt 8086 md via Task Gate | | | | | g, j, k, r
From virt 8086 md to 80286 TSS via Task Gate | | | | | g, j, k, r
From virt 8086 md to 386™ DX TSS via Task Gate | | | | | g, j, k, r
From virt 8086 md to priv level 0 via Trap Gate or Interrupt Gate

INT: Type 3
Via Interrupt or Trap Gate to Same Privilege Level | | | | | g, j, k, r
Via Interrupt or Trap Gate to Different Privilege Level | | | | | g, j, k, r
From 80286 Task to 80286 TSS via Task Gate | | | | | g, j, k, r
From 80286 Task to 386™ DX TSS via Task Gate | | | | | g, j, k, r
From 80286 Task to virt 8086 md via Task Gate | | | | | g, j, k, r
From 386™ DX Task to 80286 TSS via Task Gate | | | | | g, j, k, r
From 386™ DX Task to 386™ DX TSS via Task Gate | | | | | g, j, k, r
From 386™ DX Task to virt 8086 md via Task Gate | | | | | g, j, k, r
From virt 8086 md to 80286 TSS via Task Gate | | | | | g, j, k, r
From virt 8086 md to 386™ DX TSS via Task Gate | | | | | g, j, k, r
From virt 8086 md to priv level 0 via Trap Gate or Interrupt Gate

INTO:
Via Interrupt or Trap Gate to Same Privilege Level | | | | | g, j, k, r
Via Interrupt or Trap Gate to Different Privilege Level | | | | | g, j, k, r
From 80286 Task to 80286 TSS via Task Gate | | | | | g, j, k, r
From 80286 Task to 386™ DX TSS via Task Gate | | | | | g, j, k, r
From 80286 Task to virt 8086 md via Task Gate | | | | | g, j, k, r
From 386™ DX Task to 80286 TSS via Task Gate | | | | | g, j, k, r
From 386™ DX Task to 386™ DX TSS via Task Gate | | | | | g, j, k, r
From 386™ DX Task to virt 8086 md via Task Gate | | | | | g, j, k, r
From virt 8086 md to 80286 TSS via Task Gate | | | | | g, j, k, r
From virt 8086 md to 386™ DX TSS via Task Gate | | | | | g, j, k, r
From virt 8086 md to priv level 0 via Trap Gate or Interrupt Gate
Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued)


INSTRUCTION | FORMAT | CLOCK COUNT, Real Address Mode or Virtual 8086 Mode | CLOCK COUNT, Protected Virtual Address Mode | NOTES, Real Address Mode or Virtual 8086 Mode | NOTES, Protected Virtual Address Mode


INTERRUPT INSTRUCTIONS (Continued)
BOUND:
Via Interrupt or Trap Gate to Same Privilege Level
Via Interrupt or Trap Gate to Different Privilege Level
From 80286 Task to 80286 TSS via Task Gate
From 80286 Task to 386™ DX TSS via Task Gate
From 80286 Task to virt 8086 Mode via Task Gate
From 386™ DX Task to 80286 TSS via Task Gate
From 386™ DX Task to 386™ DX TSS via Task Gate
From 386™ DX Task to virt 8086 Mode via Task Gate
From virt 8086 Mode to 80286 TSS via Task Gate
From virt 8086 Mode to 386™ DX TSS via Task Gate
From virt 8086 md to priv level 0 via Trap Gate or Interrupt Gate


INTERRUPT RETURN
IRET = Interrupt Return | 11001111 | | | | g, h, j, k, r

Protected Mode Only (IRET)
To the Same Privilege Level (within task) | | | | | g, h, j, k, r
To Different Privilege Level (within task) | | | | | g, h, j, k, r
From 80286 Task to 80286 TSS | | | | | h, j, k, r
From 80286 Task to 386™ DX TSS | | | | | h, j, k, r
From 80286 Task to Virtual 8086 Task | | | | | h, j, k, r
From 80286 Task to Virtual 8086 Mode (within task)
From 386™ DX Task to 80286 TSS | | | | | h, j, k, r
From 386™ DX Task to 386™ DX TSS | | | | | h, j, k, r
From 386™ DX Task to Virtual 8086 Task | | | | | h, j, k, r
From 386™ DX Task to Virtual 8086 Mode (within task)


PROCESSOR CONTROL
HLT = HALT | 11110100

MOV = Move to and From Control/Debug/Test Registers
CR0/CR2/CR3 from Register | 00001111 00100010 11 eee reg
Register from CR0-3 | 00001111 00100000 11 eee reg
TR6-7 from Register | 00001111 00100110 11 eee reg
Register from TR6-7 | 00001111 00100100 11 eee reg

NOP = No Operation | 10010000

WAIT = Wait until BUSY# pin is negated | 10011011
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) 


INSTRUCTION | FORMAT | CLOCK COUNT, Real Address Mode or Virtual 8086 Mode | CLOCK COUNT, Protected Virtual Address Mode | NOTES, Real Address Mode or Virtual 8086 Mode | NOTES, Protected Virtual Address Mode


PROCESSOR EXTENSION INSTRUCTIONS
Processor Extension Escape | 11011TTT modLLL r/m | See 80287/80387 data sheets for clock counts
TTT and LLL bits are opcode information for coprocessor.


PREFIX BYTES
Address Size Prefix | 01100111
LOCK = Bus Lock Prefix | 11110000
Operand Size Prefix | 01100110

Segment Override Prefix
CS: | 00101110
DS: | 00111110
ES: | 00100110
FS: | 01100100
GS: | 01100101
SS: | 00110110


PROTECTION CONTROL
ARPL = Adjust Requested Privilege Level
From Register/Memory | 01100011

LAR = Load Access Rights
From Register/Memory | 00001111 00000010 mod reg r/m

LGDT = Load Global Descriptor Table Register | 00001111 00000001 mod010 r/m

LIDT = Load Interrupt Descriptor Table Register | 00001111 00000001 mod011 r/m

LLDT = Load Local Descriptor Table Register to Register/Memory | 00001111 00000000 mod010 r/m

LMSW = Load Machine Status Word
From Register/Memory | 00001111 00000001 mod110 r/m

LSL = Load Segment Limit
From Register/Memory | 00001111 00000011 mod reg r/m
Byte-Granular Limit
Page-Granular Limit

LTR = Load Task Register
From Register/Memory | 00001111 00000000 mod011 r/m | | | | g, h, j, l

SGDT = Store Global Descriptor Table Register | 00001111 00000001 mod000 r/m

SIDT = Store Interrupt Descriptor Table Register | 00001111 00000001 mod001 r/m

SLDT = Store Local Descriptor Table Register
To Register/Memory | 00001111 00000000 mod000 r/m
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Table 6-1. 386™ DX Instruction Set Clock Count Summary (Continued) 
INSTRUCTION | FORMAT | CLOCK COUNT, Real Address Mode or Virtual 8086 Mode | CLOCK COUNT, Protected Virtual Address Mode | NOTES, Real Address Mode or Virtual 8086 Mode | NOTES, Protected Virtual Address Mode


SMSW = Store Machine Status Word | 00001111 00000001 mod100 r/m

STR = Store Task Register
To Register/Memory | 00001111 00000000 mod001 r/m

VERR = Verify Read Access
Register/Memory | 00001111 00000000 mod100 r/m

VERW = Verify Write Access | 00001111 00000000 mod101 r/m


INSTRUCTION NOTES FOR TABLE 6-1 


Notes a through c apply to 386 DX Real Address Mode only: 

a. This is a Protected Mode instruction. Attempted execution in Real Mode will result in exception 6 (invalid opcode). 

b. Exception 13 fault (general protection) will occur in Real Mode if an operand reference is made that partially or fully 
extends beyond the maximum CS, DS, ES, FS or GS limit, FFFFH. Exception 12 fault (stack segment limit violation or not 
present) will occur in Real Mode if an operand reference is made that partially or fully extends beyond the maximum SS limit. 
c. This instruction may be executed in Real Mode. In Real Mode, its purpose is primarily to initialize the CPU for Protected 
Mode. 


Notes d through g apply to 386 DX Real Address Mode and 386 DX Protected Virtual Address Mode: 
d. The 386 DX uses an early-out multiply algorithm. The actual number of clocks depends on the position of the most
significant bit in the operand (multiplier).
Clock counts given are minimum to maximum. To calculate actual clocks use the following formula:
Actual Clock = if m <> 0 then max(ceiling(log2 |m|), 3) + b clocks;
               if m = 0 then 3 + b clocks
In this formula, m is the multiplier, and
b = 9 for register to register,
b = 12 for memory to register,
b = 10 for register with immediate to register,
b = 11 for memory with immediate to register.
e. An exception may occur, depending on the value of the operand. 
f. LOCK# is automatically asserted, regardless of the presence or absence of the LOCK# prefix.
g. LOCK# is asserted during descriptor table accesses.
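
The early-out formula in note d can be sketched in a few lines of Python. This is only an illustration of the arithmetic as the note states it; the function name `multiply_clocks` and the constant names are hypothetical, and the ceiling of log2 is computed with an integer bit-length identity rather than floating point.

```python
# Values of b quoted in note d (constant names are illustrative only).
B_REG_TO_REG = 9
B_MEM_TO_REG = 12
B_REG_IMM_TO_REG = 10
B_MEM_IMM_TO_REG = 11


def multiply_clocks(m: int, b: int) -> int:
    """Clocks per note d: max(ceiling(log2 |m|), 3) + b, or 3 + b when m is 0."""
    if m == 0:
        return 3 + b
    magnitude = abs(m)
    # For any integer n >= 1, (n - 1).bit_length() equals ceiling(log2(n)).
    ceil_log2 = (magnitude - 1).bit_length()
    return max(ceil_log2, 3) + b


if __name__ == "__main__":
    for multiplier in (0, 1, 8, 9, 0x7FFFFFFF):
        print(multiplier, multiply_clocks(multiplier, B_REG_TO_REG))
```

Small multipliers all cost the same 3 + b clocks because of the lower bound of 3; only multipliers above 8 in magnitude lengthen the operation.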


Notes h through r apply to 386 DX Protected Virtual Address Mode only: 

h. Exception 13 fault (general protection violation) will occur if the memory operand in CS, DS, ES, FS or GS cannot be used 
due to either a segment limit violation or access rights violation. If a stack limit is violated, an exception 12 (stack segment 
limit violation or not present) occurs. 

i. For segment load operations, the CPL, RPL, and DPL must agree with the privilege rules to avoid an exception 13 fault 
(general protection violation). The segment’s descriptor must indicate “present” or exception 11 (CS, DS, ES, FS, GS not 
present). If the SS register is loaded and a stack segment not present is detected, an exception 12 (stack segment limit 
violation or not present) occurs. 

j. All segment descriptor accesses in the GDT or LDT made by this instruction will automatically assert LOCK# to maintain
descriptor integrity in multiprocessor systems.

k. JMP, CALL, INT, RET and IRET instructions referring to another code segment will cause an exception 13 (general 
protection violation) if an applicable privilege rule is violated. 

l. An exception 13 fault occurs if CPL is greater than 0 (0 is the most privileged level).

m. An exception 13 fault occurs if CPL is greater than IOPL. 

n. The IF bit of the flag register is not updated if CPL is greater than IOPL. The IOPL and VM fields of the flag register are 
updated only if CPL = 0. 

o. The PE bit of the MSW (CR0) cannot be reset by this instruction. Use MOV into CR0 if desiring to reset the PE bit.

p. Any violation of privilege rules as applied to the selector operand does not cause a protection exception; rather, the zero 
flag is cleared. 

q. If the coprocessor’s memory operand violates a segment limit or segment access rights, an exception 13 fault (general
protection exception) will occur before the ESC instruction is executed. An exception 12 fault (stack segment limit violation
or not present) will occur if the stack limit is violated by the operand’s starting address.
r. The destination of a JMP, CALL, INT, RET or IRET must be in the defined limit of a code segment or an exception 13 fault 
(general protection violation) will occur. 
6.2 INSTRUCTION ENCODING


6.2.1 Overview


All instruction encodings are subsets of the general
instruction format shown in Figure 6-1. Instructions
consist of one or two primary opcode bytes, possibly
an address specifier consisting of the “mod r/m”
byte and “scaled index” byte, a displacement if
required, and an immediate data field if required.


Within the primary opcode or opcodes, smaller
encoding fields may be defined. These fields vary
according to the class of operation. The fields define
such information as direction of the operation, size
of the displacements, register encoding, or sign
extension.


Almost all instructions referring to an operand in
memory have an addressing mode byte following
the primary opcode byte(s). This byte, the mod r/m
byte, specifies the address mode to be used. Certain
encodings of the mod r/m byte indicate that a second
addressing byte, the scale-index-base byte, follows
the mod r/m byte to fully specify the addressing
mode.


Addressing modes can include a displacement
immediately following the mod r/m byte or scaled
index byte. If a displacement is present, the possible
sizes are 8, 16 or 32 bits.


If the instruction specifies an immediate operand,
the immediate operand follows any displacement
bytes. The immediate operand, if specified, is always
the last field of the instruction.


Figure 6-1 illustrates several of the fields that can
appear in an instruction, such as the mod field and
the r/m field, but the Figure does not show all fields.
Several smaller fields also appear in certain
instructions, sometimes within the opcode bytes
themselves. Table 6-2 is a complete list of all fields
appearing in the 386 DX instruction set. Further ahead,
following Table 6-2, are detailed tables for each
field.


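TTTTTTTT | TTTTTTTT | mod TTT r/m | ss index base | d32 / 16 / 8 / none | data32 / 16 / 8 / none

Fields, in order of appearance:
opcode (one or two bytes; T represents an opcode bit)
“mod r/m” byte: mod in bits 7-6, TTT in bits 5-3, r/m in bits 2-0
“s-i-b” byte: ss in bits 7-6, index in bits 5-3, base in bits 2-0
address displacement (4, 2, 1 bytes or none)
immediate data (4, 2, 1 bytes or none)

The “mod r/m” byte and the “s-i-b” byte together form the register and address mode specifier.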

Figure 6-1. General Instruction Format 


Table 6-2. Fields within 386™ DX Instructions


Field Name | Description | Number of Bits
w | Specifies if Data is Byte or Full Size (Full Size is either 16 or 32 Bits) | 1
d | Specifies Direction of Data Operation | 1
s | Specifies if an Immediate Data Field Must be Sign-Extended | 1
reg | General Register Specifier | 3
mod r/m | Address Mode Specifier (Effective Address can be a General Register) | 2 for mod; 3 for r/m
ss | Scale Factor for Scaled Index Address Mode | 2
index | General Register to be used as Index Register | 3
base | General Register to be used as Base Register | 3
sreg2 | Segment Register Specifier for CS, SS, DS, ES | 2
sreg3 | Segment Register Specifier for CS, SS, DS, ES, FS, GS | 3
tttn | For Conditional Instructions, Specifies a Condition Asserted or a Condition Negated | 4


Note: Table 6-1 shows encoding of individual instructions.
6.2.2 32-Bit Extensions of the Instruction Set


With the 386 DX, the 8086/80186/80286 instruction
set is extended in two orthogonal directions: 32-bit
forms of all 16-bit instructions are added to support
the 32-bit data types, and 32-bit addressing modes
are made available for all instructions referencing
memory. This orthogonal instruction set extension is
accomplished by having a Default (D) bit in the code
segment descriptor, and by having 2 prefixes to the
instruction set.


Whether the instruction defaults to operations of 16
bits or 32 bits depends on the setting of the D bit in
the code segment descriptor, which gives the default
length (either 32 bits or 16 bits) for both operands
and effective addresses when executing that
code segment. In the Real Address Mode or Virtual
8086 Mode, no code segment descriptors are used,
but a D value of 0 is assumed internally by the 386
DX when operating in those modes (for 16-bit default
sizes compatible with the 8086/80186/80286).


Two prefixes, the Operand Size Prefix and the Effective
Address Size Prefix, allow the Default selection of
operand size and effective address size to be overridden
individually. These prefixes may precede any opcode
bytes and affect only the instruction they precede.
If necessary, one or both of the prefixes may
be placed before the opcode bytes. The presence of
the Operand Size Prefix or the Effective Address Size
Prefix will toggle the operand size or the effective
address size, respectively, to the value “opposite”
from the Default setting. For example, if the default
operand size is for 32-bit data operations, then
presence of the Operand Size Prefix toggles the
instruction to 16-bit data operation. As another example, if
the default effective address size is 16 bits,
presence of the Effective Address Size Prefix toggles the
instruction to use 32-bit effective address computations.


These 32-bit extensions are available in all 386 DX
modes, including the Real Address Mode or the
Virtual 8086 Mode. In these modes the default is
always 16 bits, so prefixes are needed to specify
32-bit operands or addresses. For instructions with
more than one prefix, the order of prefixes is
unimportant.


Unless specified otherwise, instructions with 8-bit
and 16-bit operands do not affect the contents of
the high-order bits of the extended registers.
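
The toggling behavior of the two size prefixes can be illustrated with a short Python sketch. The prefix byte values are those listed under PREFIX BYTES in Table 6-1 (01100110 and 01100111); the function name `effective_sizes` and its argument layout are assumptions made for illustration, not part of the 386 DX documentation.

```python
from typing import Tuple

OPERAND_SIZE_PREFIX = 0x66  # 01100110
ADDRESS_SIZE_PREFIX = 0x67  # 01100111


def effective_sizes(d_bit: int, prefixes: bytes) -> Tuple[int, int]:
    """Return (operand_size, address_size) in bits for one instruction.

    d_bit is the Default (D) bit of the code segment descriptor; it is
    taken as 0 in Real Address Mode and Virtual 8086 Mode.
    """
    default = 32 if d_bit else 16
    opposite = 16 if d_bit else 32
    # Each prefix toggles only its own attribute; prefix order does not matter.
    operand = opposite if OPERAND_SIZE_PREFIX in prefixes else default
    address = opposite if ADDRESS_SIZE_PREFIX in prefixes else default
    return operand, address


if __name__ == "__main__":
    print(effective_sizes(0, b""))            # 16-bit defaults
    print(effective_sizes(0, b"\x66"))        # 32-bit operand in a 16-bit segment
    print(effective_sizes(1, b"\x67\x66"))    # both toggled in a 32-bit segment
```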


6.2.3 Encoding of Instruction Fields


Within the instruction are several fields indicating
register selection, addressing mode and so on. The
exact encodings of these fields are defined immediately
ahead.
6.2.3.1 ENCODING OF OPERAND LENGTH (w) FIELD


For any given instruction performing a data operation,
the instruction is executing as a 32-bit operation
or a 16-bit operation. Within the constraints of the
operation size, the w field encodes the operand size
as either one byte or the full operation size, as
shown in the table below.


w Field | Operand Size During 16-Bit Data Operations | Operand Size During 32-Bit Data Operations
0 | 8 Bits | 8 Bits
1 | 16 Bits | 32 Bits


6.2.3.2 ENCODING OF THE GENERAL REGISTER (reg) FIELD


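The general register is specified by the reg field,
which may appear in the primary opcode bytes, or as
the reg field of the “mod r/m” byte, or as the r/m
field of the “mod r/m” byte.


Encoding of reg Field When w Field is not Present in Instruction

reg Field | Register Selected During 16-Bit Data Operations | Register Selected During 32-Bit Data Operations
000 | AX | EAX
001 | CX | ECX
010 | DX | EDX
011 | BX | EBX
100 | SP | ESP
101 | BP | EBP
110 | SI | ESI
111 | DI | EDI


Encoding of reg Field When w Field is Present in Instruction

Register Specified by reg Field During 16-Bit Data Operations:

reg | Function of w Field (when w = 0) | Function of w Field (when w = 1)
000 | AL | AX
001 | CL | CX
010 | DL | DX
011 | BL | BX
100 | AH | SP
101 | CH | BP
110 | DH | SI
111 | BH | DI

Register Specified by reg Field During 32-Bit Data Operations:

reg | Function of w Field (when w = 0) | Function of w Field (when w = 1)
000 | AL | EAX
001 | CL | ECX
010 | DL | EDX
011 | BL | EBX
100 | AH | ESP
101 | CH | EBP
110 | DH | ESI
111 | BH | EDI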
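
As a rough illustration of how the reg field, the w field and the operation size combine, the following Python sketch looks up a register name. It assumes the standard 386 register numbering (000 = AL/AX/EAX through 111 = BH/DI/EDI); the function name `select_register` and its parameters are hypothetical.

```python
from typing import Optional

# Register names indexed by the 3-bit reg value (standard 386 numbering, assumed).
REG8 = ("AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH")
REG16 = ("AX", "CX", "DX", "BX", "SP", "BP", "SI", "DI")
REG32 = tuple("E" + name for name in REG16)


def select_register(reg: int, operation_bits: int, w: Optional[int] = None) -> str:
    """Name the register chosen by a reg field.

    operation_bits is 16 or 32; w is None when the instruction has no w field.
    """
    if not 0 <= reg <= 7:
        raise ValueError("reg is a 3-bit field")
    if operation_bits not in (16, 32):
        raise ValueError("operation size must be 16 or 32 bits")
    if w == 0:
        return REG8[reg]  # byte operand in either operation size
    return REG16[reg] if operation_bits == 16 else REG32[reg]


if __name__ == "__main__":
    print(select_register(0b000, 16, w=0))  # AL
    print(select_register(0b000, 32, w=1))  # EAX
    print(select_register(0b100, 32))       # ESP
```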


6.2.3.3 ENCODING OF THE SEGMENT REGISTER (sreg) FIELD


The sreg field in certain instructions is a 2-bit field
allowing one of the four 80286 segment registers to
be specified. The sreg field in other instructions is a
3-bit field, allowing the 386 DX FS and GS segment
registers to be specified.


2-Bit sreg2 Field

2-Bit sreg2 Field | Segment Register Selected
00 | ES
01 | CS
10 | SS
11 | DS


3-Bit sreg3 Field

3-Bit sreg3 Field | Segment Register Selected
000 | ES
001 | CS
010 | SS
011 | DS
100 | FS
101 | GS
110 | do not use
111 | do not use
6.2.3.4 ENCODING OF ADDRESS MODE


Except for special instructions, such as PUSH or
POP, where the addressing mode is pre-determined,
the addressing mode for the current instruction is
specified by addressing bytes following the primary
opcode. The primary addressing byte is the “mod
r/m” byte, and a second byte of addressing information,
the “s-i-b” (scale-index-base) byte, can be
specified.


The s-i-b byte (scale-index-base byte) is specified
when using 32-bit addressing mode and the “mod
r/m” byte has r/m = 100 and mod = 00, 01 or 10.
When the s-i-b byte is present, the 32-bit addressing
mode is a function of the mod, ss, index, and base
fields.


The primary addressing byte, the “mod r/m” byte,
also contains three bits (shown as TTT in Figure 6-1)
sometimes used as an extension of the primary
opcode. The three bits, however, may also be used as
a register field (reg).


When calculating an effective address, either 16-bit
addressing or 32-bit addressing is used. 16-bit
addressing uses 16-bit address components to calculate
the effective address while 32-bit addressing
uses 32-bit address components to calculate the
effective address. When 16-bit addressing is used, the
“mod r/m” byte is interpreted as a 16-bit addressing
mode specifier. When 32-bit addressing is used, the
“mod r/m” byte is interpreted as a 32-bit addressing
mode specifier.


Tables on the following three pages define all
encodings of all 16-bit addressing modes and 32-bit
addressing modes.
Encoding of 16-bit Address Mode with “mod r/m” Byte


mod r/m | Effective Address
00 000 | DS:[BX + SI]
00 001 | DS:[BX + DI]
00 010 | SS:[BP + SI]
00 011 | SS:[BP + DI]
00 100 | DS:[SI]
00 101 | DS:[DI]
00 110 | DS:d16
00 111 | DS:[BX]

01 000 | DS:[BX + SI + d8]
01 001 | DS:[BX + DI + d8]
01 010 | SS:[BP + SI + d8]
01 011 | SS:[BP + DI + d8]
01 100 | DS:[SI + d8]
01 101 | DS:[DI + d8]
01 110 | SS:[BP + d8]
01 111 | DS:[BX + d8]

10 000 | DS:[BX + SI + d16]
10 001 | DS:[BX + DI + d16]
10 010 | SS:[BP + SI + d16]
10 011 | SS:[BP + DI + d16]
10 100 | DS:[SI + d16]
10 101 | DS:[DI + d16]
10 110 | SS:[BP + d16]
10 111 | DS:[BX + d16]

11 000 | register (see below)
11 001 | register (see below)
11 010 | register (see below)
11 011 | register (see below)
11 100 | register (see below)
11 101 | register (see below)
11 110 | register (see below)
11 111 | register (see below)


Register Specified by r/m During 16-Bit Data Operations

mod r/m | Function of w Field (when w = 0) | Function of w Field (when w = 1)
11 000 | AL | AX
11 001 | CL | CX
11 010 | DL | DX
11 011 | BL | BX
11 100 | AH | SP
11 101 | CH | BP
11 110 | DH | SI
11 111 | BH | DI


Register Specified by r/m During 32-Bit Data Operations

mod r/m | Function of w Field (when w = 0) | Function of w Field (when w = 1)
11 000 | AL | EAX
11 001 | CL | ECX
11 010 | DL | EDX
11 011 | BL | EBX
11 100 | AH | ESP
11 101 | CH | EBP
11 110 | DH | ESI
11 111 | BH | EDI
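
The 16-bit addressing table lends itself to a small lookup. The Python sketch below decodes the mod and r/m fields of a “mod r/m” byte into the default-segment effective-address form used in the table; the function name `decode_modrm16` is hypothetical, and the middle three bits (reg or opcode extension) are ignored.

```python
# Base/index combination selected by r/m in 16-bit addressing (closing bracket added later).
EA16 = (
    "DS:[BX+SI", "DS:[BX+DI", "SS:[BP+SI", "SS:[BP+DI",
    "DS:[SI", "DS:[DI", "SS:[BP", "DS:[BX",
)


def decode_modrm16(modrm: int) -> str:
    """Describe the 16-bit effective address selected by a mod r/m byte."""
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod == 0b11:
        return "register"      # r/m names a register, not memory
    if mod == 0b00 and rm == 0b110:
        return "DS:d16"        # direct address replaces [BP] when mod = 00
    displacement = {0b00: "", 0b01: "+d8", 0b10: "+d16"}[mod]
    return EA16[rm] + displacement + "]"


if __name__ == "__main__":
    for byte in (0b00000000, 0b00000110, 0b01000110, 0b10000111, 0b11000000):
        print(format(byte, "08b"), decode_modrm16(byte))
```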
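Encoding of 32-bit Address Mode with “mod r/m” byte (no “s-i-b” byte present):


mod r/m | Effective Address
00 000 | DS:[EAX]
00 001 | DS:[ECX]
00 010 | DS:[EDX]
00 011 | DS:[EBX]
00 100 | s-i-b is present
00 101 | DS:d32
00 110 | DS:[ESI]
00 111 | DS:[EDI]

01 000 | DS:[EAX + d8]
01 001 | DS:[ECX + d8]
01 010 | DS:[EDX + d8]
01 011 | DS:[EBX + d8]
01 100 | s-i-b is present
01 101 | SS:[EBP + d8]
01 110 | DS:[ESI + d8]
01 111 | DS:[EDI + d8]

10 000 | DS:[EAX + d32]
10 001 | DS:[ECX + d32]
10 010 | DS:[EDX + d32]
10 011 | DS:[EBX + d32]
10 100 | s-i-b is present
10 101 | SS:[EBP + d32]
10 110 | DS:[ESI + d32]
10 111 | DS:[EDI + d32]

11 000 | register (see below)
11 001 | register (see below)
11 010 | register (see below)
11 011 | register (see below)
11 100 | register (see below)
11 101 | register (see below)
11 110 | register (see below)
11 111 | register (see below)


Register Specified by reg or r/m during 16-Bit Data Operations:

mod r/m | function of w field (when w = 0) | function of w field (when w = 1)
11 000 | AL | AX
11 001 | CL | CX
11 010 | DL | DX
11 011 | BL | BX
11 100 | AH | SP
11 101 | CH | BP
11 110 | DH | SI
11 111 | BH | DI


Register Specified by reg or r/m during 32-Bit Data Operations:

mod r/m | function of w field (when w = 0) | function of w field (when w = 1)
11 000 | AL | EAX
11 001 | CL | ECX
11 010 | DL | EDX
11 011 | BL | EBX
11 100 | AH | ESP
11 101 | CH | EBP
11 110 | DH | ESI
11 111 | BH | EDI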
Encoding of 32-bit Address Mode (“mod r/m” byte and “s-i-b” byte present):


mod base | Effective Address
00 000 | DS:[EAX + (scaled index)]
00 001 | DS:[ECX + (scaled index)]
00 010 | DS:[EDX + (scaled index)]
00 011 | DS:[EBX + (scaled index)]
00 100 | SS:[ESP + (scaled index)]
00 101 | DS:[d32 + (scaled index)]
00 110 | DS:[ESI + (scaled index)]
00 111 | DS:[EDI + (scaled index)]

01 000 | DS:[EAX + (scaled index) + d8]
01 001 | DS:[ECX + (scaled index) + d8]
01 010 | DS:[EDX + (scaled index) + d8]
01 011 | DS:[EBX + (scaled index) + d8]
01 100 | SS:[ESP + (scaled index) + d8]
01 101 | SS:[EBP + (scaled index) + d8]
01 110 | DS:[ESI + (scaled index) + d8]
01 111 | DS:[EDI + (scaled index) + d8]

10 000 | DS:[EAX + (scaled index) + d32]
10 001 | DS:[ECX + (scaled index) + d32]
10 010 | DS:[EDX + (scaled index) + d32]
10 011 | DS:[EBX + (scaled index) + d32]
10 100 | SS:[ESP + (scaled index) + d32]
10 101 | SS:[EBP + (scaled index) + d32]
10 110 | DS:[ESI + (scaled index) + d32]
10 111 | DS:[EDI + (scaled index) + d32]


NOTE:
Mod field in “mod r/m” byte; ss, index, base fields in
“s-i-b” byte.


ss | Scale Factor
00 | x1
01 | x2
10 | x4
11 | x8


index | Index Register
000 | EAX
001 | ECX
010 | EDX
011 | EBX
100 | no index reg**
101 | EBP
110 | ESI
111 | EDI


**IMPORTANT NOTE:

When index field is 100, indicating “no index register,” then
ss field MUST equal 00. If index is 100 and ss does not
equal 00, the effective address is undefined.
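
The 32-bit addressing rules, including the s-i-b byte and the “no index register” restriction, can be sketched as follows. This is a minimal illustration assuming the standard 386 register numbering (000 = EAX through 111 = EDI); the function name `decode_ea32` and its string output format are hypothetical.

```python
from typing import Optional

REG32 = ("EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI")


def decode_ea32(modrm: int, sib: Optional[int] = None) -> str:
    """Describe the 32-bit effective address selected by mod r/m and optional s-i-b bytes."""
    mod = (modrm >> 6) & 0b11
    rm = modrm & 0b111
    if mod == 0b11:
        return "register"
    displacement = {0b00: "", 0b01: "+d8", 0b10: "+d32"}[mod]

    if rm != 0b100:                      # no s-i-b byte follows
        if mod == 0b00 and rm == 0b101:
            return "DS:d32"
        segment = "SS" if rm == 0b101 else "DS"
        return f"{segment}:[{REG32[rm]}{displacement}]"

    if sib is None:
        raise ValueError("r/m = 100 requires an s-i-b byte")
    ss = (sib >> 6) & 0b11
    index = (sib >> 3) & 0b111
    base = sib & 0b111
    if index == 0b100:                   # no index register
        if ss != 0b00:
            raise ValueError("effective address is undefined when index = 100 and ss != 00")
        scaled = ""
    else:
        scaled = f"+{REG32[index]}*{1 << ss}"
    if mod == 0b00 and base == 0b101:    # d32 takes the place of the base register
        return f"DS:[d32{scaled}]"
    segment = "SS" if base in (0b100, 0b101) else "DS"
    return f"{segment}:[{REG32[base]}{scaled}{displacement}]"


if __name__ == "__main__":
    print(decode_ea32(0b00000100, 0b10001011))  # EBX base, ECX index scaled by 4
    print(decode_ea32(0b01000101))              # EBP plus 8-bit displacement
```

The default segment follows the base register: ESP and EBP bases use SS, every other form uses DS, matching the tables above.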
6.2.3.5 ENCODING OF OPERATION DIRECTION (d) FIELD


In many two-operand instructions the d field is
present to indicate which operand is considered the
source and which is the destination.


d | Direction of Operation
0 | Register/Memory <-- Register. “reg” Field Indicates Source Operand; “mod r/m” or “mod ss index base” Indicates Destination Operand
1 | Register <-- Register/Memory. “reg” Field Indicates Destination Operand; “mod r/m” or “mod ss index base” Indicates Source Operand
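
The role of the d field can be shown with a short Python sketch. It assumes an opcode of the common two-operand form in which d is bit 1 and w is bit 0 of the opcode byte (for example MOV register/memory to or from register, 100010dw); the function name `operand_roles` is hypothetical.

```python
def operand_roles(opcode: int) -> dict:
    """Report source and destination for an opcode whose low bits are d (bit 1) and w (bit 0)."""
    d = (opcode >> 1) & 1
    w = opcode & 1
    if d == 0:
        # Register/Memory <-- Register
        destination, source = "mod r/m", "reg"
    else:
        # Register <-- Register/Memory
        destination, source = "reg", "mod r/m"
    return {"d": d, "w": w, "destination": destination, "source": source}


if __name__ == "__main__":
    print(operand_roles(0b10001001))  # d = 0: the reg field is the source
    print(operand_roles(0b10001011))  # d = 1: the reg field is the destination
```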


6.2.3.6 ENCODING OF SIGN-EXTEND (s) FIELD


The s field occurs primarily in instructions with
immediate data fields. The s field has an effect only if
the size of the immediate data is 8 bits and is being
placed in a 16-bit or 32-bit destination.


s | Effect on Immediate Data8 | Effect on Immediate Data 16/32
0 | None | None
1 | Sign-Extend Data8 to Fill 16-Bit or 32-Bit Destination | None
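
The effect of the s field on an 8-bit immediate amounts to ordinary two's-complement sign extension, sketched here in Python. The function name `extend_immediate8` is hypothetical, and the result is returned as the unsigned bit pattern that would fill the destination.

```python
def extend_immediate8(imm8: int, s: int, destination_bits: int) -> int:
    """Return the bit pattern an 8-bit immediate supplies to a 16- or 32-bit destination."""
    if destination_bits not in (16, 32):
        raise ValueError("destination must be 16 or 32 bits")
    imm8 &= 0xFF
    if s == 1 and imm8 & 0x80:
        # Copy the sign bit (bit 7) into every higher bit of the destination.
        high_bits = ((1 << destination_bits) - 1) & ~0xFF
        return imm8 | high_bits
    return imm8  # s = 0, or a non-negative value: upper bits stay zero


if __name__ == "__main__":
    print(hex(extend_immediate8(0x80, 1, 16)))
    print(hex(extend_immediate8(0xFF, 1, 32)))
    print(hex(extend_immediate8(0x7F, 1, 32)))
```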


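6.2.3.7 ENCODING OF CONDITIONAL TEST (tttn) FIELD


For the conditional instructions (conditional jumps
and set on condition), tttn is encoded with n indicating
whether to use the condition (n = 0) or its negation (n = 1),
and ttt giving the condition to test.


Mnemonic | Condition | tttn
O | Overflow | 0000
NO | No Overflow | 0001
B/NAE | Below/Not Above or Equal | 0010
NB/AE | Not Below/Above or Equal | 0011
E/Z | Equal/Zero | 0100
NE/NZ | Not Equal/Not Zero | 0101
BE/NA | Below or Equal/Not Above | 0110
NBE/A | Not Below or Equal/Above | 0111
S | Sign | 1000
NS | Not Sign | 1001
P/PE | Parity/Parity Even | 1010
NP/PO | Not Parity/Parity Odd | 1011
L/NGE | Less Than/Not Greater or Equal | 1100
NL/GE | Not Less Than/Greater or Equal | 1101
LE/NG | Less Than or Equal/Not Greater Than | 1110
NLE/G | Not Less Than or Equal/Greater Than | 1111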

6.2.3.8 ENCODING OF CONTROL OR DEBUG OR TEST REGISTER (eee) FIELD


The eee field is used for the loading and storing of the Control, Debug
and Test registers.


When Interpreted as Control Register Field

eee Code | Reg Name
000 | CR0
010 | CR2
011 | CR3

Do not use any other encoding


When Interpreted as Debug Register Field

eee Code | Reg Name
000 | DR0
001 | DR1
010 | DR2
011 | DR3
110 | DR6
111 | DR7

Do not use any other encoding


When Interpreted as Test Register Field

eee Code | Reg Name
110 | TR6
111 | TR7

Do not use any other encoding
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Figure 7-1. Processor Module Dimensions 


7. DESIGNING FOR ICET™-386 DX 
EMULATOR USE 


The 386 DxX in-circuit emulator products are ICE-386 
DX 25 MHz or 33 MHz (both referred to as ICE-386 
- DX emulator). The ICE-386 DX emulator probe mod- 
_ ule has-several electrical and mechanical character- 


_. istics that should be taken into consideration when 


designing the hardware. 


Capacitive loading: The ICE-386 DX emulator adds 
up to 25 pF to each line. | 


Drive requirement: The ICE-386 DX emulator adds 
one standard TTL load on the CLK2 line, up to one 
advanced low-power Schottky TTL load per control 
signal line, and one advanced low-power Schottky 
_TTL load per address, byte enable, and data line. 
- These loads are within the probe module and are 
driven by the probe’s 386 DX component, which has 
standard drive and loading capability listed in the 
A.C. and D.C. SPeeication Tables in Sections 9.4 
and 9.5. 


Power requirement: For noise immunity the ICE- 
386 DX emulator probe is powered by the user sys- 
tem. This high-speed probe circuitry draws up to 
1.5A plus the maximum Icc from the user 386 DX 
component socket. j 


386 DX location and orientation: The ICE-386 DX 
processor module, target-adaptor cable (which does 
not exist for the ICE-386 DX 33 MHz emulator), and 
the isolation board used for extra electrical buffering 
of the emulator initially, require clearance as. illustrat- 
ed in Figures 7-1 and 7-2. | 


Interface Board and CLK2 speed reduction:
When the ICE-386 DX emulator probe is first attached
to an unverified user system, the interface
board helps the ICE-386 DX emulator function in
user systems with bus faults (shorted signals, etc.).
After electrical verification it may be removed.
While the interface board is installed, the user
system CLK2 frequency must be reduced to 25
MHz maximum.


Cache coherence: The ICE-386 DX emulator loads 
user memory by performing 386 DX component 
write cycles. Note that if the user system is not de- 
signed to update or invalidate its cache (if it has a 
cache) upon processor writes to memory, the cache 
could contain stale instruction code and/or data. For 
best use of the ICE-386 DX emulator, the user 
should consider designing the cache (if any) to up- 
date itself automatically when processor writes oc- 
cur, or find another method of maintaining cache 
data coherence with main user memory. 
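
The loading, power, and CLK2 figures quoted in this section can be collected into a simple pre-layout check. The following is a minimal sketch only; the `TargetDesign` record and its field names are hypothetical and are not part of any Intel tool, and the example numbers in the usage note are illustrative rather than taken from the specification tables.

```python
from dataclasses import dataclass

# Figures quoted in Section 7 of this data sheet.
ICE_ADDED_CAPACITANCE_PF = 25.0      # emulator adds up to 25 pF to each line
ICE_PROBE_CURRENT_A = 1.5            # probe circuitry draws up to 1.5 A
ICE_MAX_CLK2_MHZ_WITH_BOARD = 25.0   # CLK2 limit while interface board is installed


@dataclass
class TargetDesign:
    """Hypothetical description of a user system (names are assumptions)."""
    line_capacitance_pf: float       # board load on a signal line, without the emulator
    cpu_max_icc_a: float             # maximum Icc of the 386 DX component
    clk2_mhz: float                  # CLK2 frequency supplied by the user system
    interface_board_installed: bool


def worst_case_line_capacitance_pf(design: TargetDesign) -> float:
    """Board load plus the emulator's worst-case added capacitance."""
    return design.line_capacitance_pf + ICE_ADDED_CAPACITANCE_PF


def socket_supply_current_a(design: TargetDesign) -> float:
    """Current the 386 DX socket must supply with the probe attached."""
    return ICE_PROBE_CURRENT_A + design.cpu_max_icc_a


def clk2_ok(design: TargetDesign) -> bool:
    """CLK2 is limited only while the interface board is installed."""
    if not design.interface_board_installed:
        return True
    return design.clk2_mhz <= ICE_MAX_CLK2_MHZ_WITH_BOARD
```

For example, a line loaded with 95 pF on the board presents 120 pF with the emulator attached, which is worth comparing against the load capacitance used for the A.C. specifications.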
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Figure 7-2. Processor Module, Target-Adapter Cable, and Isolation Board Dimensions 
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8. MECHANICAL DATA


8.1 INTRODUCTION

In this section, the physical packaging and its
connections are described in detail.


8.2 PACKAGE DIMENSIONS AND
MOUNTING

The initial 386 DX package is a 132-pin ceramic pin
grid array (PGA). Pins of this package are arranged
0.100 inch (2.54 mm) center-to-center, in a 14 x 14
matrix, three rows around.
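
The pin count follows directly from the stated geometry: a 14 x 14 matrix populated three rows deep around the perimeter leaves an empty 8 x 8 center, and 196 - 64 = 132. The sketch below works that arithmetic through and also derives the pin-row offsets from the package centerline for a 0.100 inch pitch. The function names are illustrative only.

```python
def pga_pin_count(side: int, rows: int) -> int:
    """Pins in a square grid of `side` positions populated `rows` deep
    around the perimeter (the center is left empty)."""
    inner = max(side - 2 * rows, 0)
    return side * side - inner * inner


def pin_offsets_inches(side: int, pitch: float = 0.100) -> list:
    """Offsets of each pin row from the package centerline, in inches."""
    center = (side - 1) / 2
    return [round((i - center) * pitch, 3) for i in range(side)]
```

With `side=14` and `rows=3` the count is 132, and the offsets run from -0.650 to +0.650 inch in 0.100 inch steps.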


A wide variety of available sockets allows low
insertion force or zero insertion force mountings, and a
choice of terminals such as soldertail, surface
mount, or wire wrap. Several applicable sockets are
listed in Table 8.1.


[Figure 8.1 drawing: 132-pin ceramic PGA package outline, top and
side views, with swedge pin standoffs (4 places) and braze pads.
Dimension callouts, inches (mm): 1.450 (36.802); .725 (18.401);
.650 (16.497); .550 (13.959); .450 (11.421); .350 (8.883);
.250 (6.345); .150 (3.807); .050 (1.269); .165 (4.189) TYP;
.110 (2.792); .070 (1.777) DIA TYP; .020 (0.508) MIN;
.018 (0.47) DIA TYP; .001 (0.025) R.]
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Figure 8.1. 132-Pin Ceramic PGA Package Dimensions
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Table 8.1. Several Socket Options for 132-Pin PGA 


Amp Incorporated
(Harrisburg, PA 17105 U.S.A.
Phone 717-564-0100)

* Low insertion force (LIF) soldertail
  55274-1

* Amp tests indicate 50% reduction in insertion
  force compared to machined sockets

Other socket options

* Zero insertion force (ZIF) soldertail
  55583-1

* Zero insertion force (ZIF) Burn-in version
  55573-2

Cam handle locks in low profile position when substrate is installed (handle UP for
open and DOWN for closed positions)

231630-45
courtesy Amp Incorporated


Advanced Interconnections
(5 Division Street
Warwick, RI 02818 U.S.A.
Phone 401-885-0485)

Peel-A-Way™ Mylar and Kapton Socket Terminal Carriers

* Low insertion force surface mount
  CS132-37TG

* Low insertion force soldertail
  CS132-01TG

* Low insertion force wire-wrap
  CS132-02TG (two-level)
  CS132-03TG (three-level)

* Low insertion force press-fit

Peel-A-Way Carrier No. 132:
Kapton Carrier is KS132
Mylar Carrier is MS132

Molded Plastic Body KS132 is shown below:

[Drawing: molded plastic body, 14 x 14 x 3 rows, with terminal
styles SOLDERTAIL-01, LOWPROFILE-04 and PRESSFIT-05.]

231630-46
231630-47
courtesy Advanced Interconnections
(Peel-A-Way Terminal Carriers
U.S. Patent No. 4442938)
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Table 8.1. Several Socket Options for 132-Pin PGA (Continued) 


AUGAT INC.
33 Perry Ave., P.O. Box 779 Attleboro, MA 02703
TECHNICAL INFORMATION: (508) 222-2202
CUSTOMER SERVICE: (508) 699-9800

PIN GRID ARRAY DECOUPLING SOCKETS

* Low insertion force soldertail, 0.125 length
  PGD-005-1A1
  Finish: Term/Contact Tin-Lead/Gold

* Low insertion force soldertail, 0.180 length
  PGD-005-1B1
  Finish: Term/Contact Tin-Lead/Gold

* Low insertion 3 level Wire/Wrap
  PGD-005-1C1
  Finish: Term/Contact Tin-Lead/Gold

Includes 0.10 µF & 1.0 µF Decoupling Capacitors

VisinPak Kapton Carrier
PKC Series Pin Grid Array
PGM (Plastic) or PPS (Glass Epoxy) Series

[Drawing: terminal styles A: Soldertail, B: Soldertail, C: Soldertail;
dimension callouts 1.450, 0.020, 0.100 TYP.]
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Textool Products
Electronic Products Division/3M
(1410 West Pioneer Drive
Irving, Texas 75601 U.S.A.
Phone 214-259-2676)

* Low insertion force socket soldertail
  (for production use)
  2XX-6576-00-3308 (new style)
  2XX-6003-00-3302 (older style)

* Zero insertion force soldertail
  (for test and burn-in use)
  2XX-6568-00-3302

231630-48
courtesy Textool Products/3M
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8.3 PACKAGE THERMAL
SPECIFICATION


The 386 DX is specified for operation when case
temperature is within the range of 0°C-85°C. The
case temperature may be measured in any environment,
to determine whether the 386 DX is within
specified operating range.

The PGA case temperature should be measured at
the center of the top surface opposite the pins, as in
Figure 8.2.


[Figure 8.2 drawing: 132-pin PGA, with the note "Measure PGA case
temperature at center of top surface".]

231630-36


Figure 8.2. Measuring 386™ DX PGA Case Temperature


Table 8.2. 386™ DX PGA Package Thermal Characteristics


Thermal Resistance (°C/Watt) versus Airflow, ft./min (m/sec)

Parameter

θ Junction-to-Case (case measured as Fig. 8-2)
θ Case-to-Ambient (no heatsink)
θ Case-to-Ambient (with omnidirectional heatsink)
θ Case-to-Ambient (with unidirectional heatsink)


NOTES:

1. Table 8.2 applies to 386™ DX PGA plugged into socket or soldered
   directly into board.
2. θJA = θJC + θCA.
3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)
4. TA = TC - P * θCA (ambient temperature)
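
Notes 2 and 4 of Table 8.2 give two relations: junction-to-ambient resistance is the sum of junction-to-case and case-to-ambient resistance, and ambient temperature equals case temperature minus power times case-to-ambient resistance. The sketch below applies them to find the highest ambient temperature that keeps the case at or below its 85°C limit. The power and thermal resistance values in the usage note are illustrative assumptions, not values from Table 8.2.

```python
T_CASE_MAX_C = 85.0  # upper limit of the specified case temperature range


def theta_ja(theta_jc: float, theta_ca: float) -> float:
    """Junction-to-ambient resistance (Note 2): the two resistances add."""
    return theta_jc + theta_ca


def max_ambient_c(power_w: float, theta_ca: float,
                  t_case_c: float = T_CASE_MAX_C) -> float:
    """Ambient temperature for a given case temperature (Note 4):
    TA = TC - P * theta_CA."""
    return t_case_c - power_w * theta_ca
```

As an illustration, a part dissipating 2.5 W with an assumed case-to-ambient resistance of 10°C/W must see an ambient of 60°C or less to hold the case at 85°C.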
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9.1 INTRODUCTION 


The following sections describe recommended elec- 
trical connections for the 386 DX, and its electrical 
specifications. 


9.2 POWER AND GROUNDING 


9.2.1 Power Connections 


The 386 DX is implemented in CHMOS III and
CHMOS IV technology and has modest power
requirements. However, its high clock frequency and
72 output buffers (address, data, control, and HLDA)
can cause power surges as multiple output buffers
drive new signal levels simultaneously. For clean
on-chip power distribution at high frequency, 20 Vcc
and 21 Vss pins separately feed functional units of
the 386 DX.


Power and ground connections must be made to all 
external Vcc and GND pins of the 386 DX. On the 
circuit board, all Vcc pins must be connected on a 
Vcc plane. All Vss pins must be likewise connected 
on a GND plane. 


9.2.2 Power Decoupling 
Recommendations 


Liberal decoupling capacitance should be placed
near the 386 DX. The 386 DX driving its 32-bit parallel
address and data buses at high frequencies can
cause transient power surges, particularly when
driving large capacitive loads.


Low inductance capacitors and interconnects are
recommended for best high frequency electrical
performance. Inductance can be reduced by shortening
circuit board traces between the 386 DX and
decoupling capacitors as much as possible. Capacitors
specifically for PGA packages are also commercially
available, for the lowest possible inductance.


9.2.3 Resistor Recommendations


The ERROR# and BUSY# inputs have resistor pullups
of approximately 20 KΩ built in to the 386 DX to
keep these signals negated when no 387 DX coprocessor
is present in the system (or temporarily
removed from its socket). The BS16# input also has
an internal pullup resistor of approximately 20 KΩ,
and the PEREQ input has an internal pulldown resistor
of approximately 20 KΩ.


In typical designs, the external pullup resistors
shown in Table 9-1 are recommended. However, a
particular design may have reason to adjust the
resistor values recommended here, or alter the use of
pullup resistors in other ways.


9.2.4 Other Connection 
Recommendations 


For reliable operation, always connect unused in- 
puts to an appropriate signal level. N.C. pins should 
always remain unconnected. 


Particularly when not using interrupts or bus hold
(as when first prototyping, perhaps), prevent any
chance of spurious activity by connecting these
associated inputs to GND:


Pin    Signal
B7     INTR
B8     NMI
D14    HOLD


If not using address pipelining, pull up D13 NA# to
Vcc.


If not using 16-bit bus size, pull up C14 BS16# to
Vcc.


Pullups in the range of 20 KΩ are recommended.
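
The strapping advice above can be kept as a small lookup table, and the static current through a 20 KΩ pullup to a 5V supply follows from Ohm's law (5 V / 20 KΩ = 250 µA when the line is held at ground). The sketch below is illustrative only; the dictionary layout and function name are assumptions made for this example.

```python
# Recommended treatment of inputs that a design does not use,
# transcribed from Section 9.2.4: signal -> (pin, connection).
UNUSED_INPUT_STRAPS = {
    "INTR": ("B7", "GND"),
    "NMI": ("B8", "GND"),
    "HOLD": ("D14", "GND"),
    "NA#": ("D13", "pullup to Vcc"),
    "BS16#": ("C14", "pullup to Vcc"),
}


def pullup_current_ua(vcc_v: float, resistor_ohms: float) -> float:
    """Current through a pullup, in microamps, when the line is held at 0V."""
    return vcc_v * 1e6 / resistor_ohms
```

A 20 KΩ pullup to a 5V supply therefore sources about 250 µA into a line driven low.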


Table 9-1. Recommended Resistor Pullups to Vcc


Pin and Signal    Pullup Value    Purpose

E14 ADS#          20 KΩ ±10%      Lightly Pull ADS# Negated
                                  During 386™ DX Hold
                                  Acknowledge States

C10 LOCK#         20 KΩ ±10%      Lightly Pull LOCK# Negated
                                  During 386™ DX Hold
                                  Acknowledge States
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9.3 MAXIMUM RATINGS

Table 9-2. Maximum Ratings

                                       386™ DX
Parameter                              20, 25, 33 MHz
                                       Maximum Rating

Storage Temperature                    -65°C to +150°C
Case Temperature Under Bias            -65°C to +140°C
Supply Voltage with Respect to Vss     -0.5V to +6.5V
Voltage on Other Pins                  -0.5V to Vcc + 0.5V


Table 9-2 is a stress rating only, and functional
operation at the maximums is not guaranteed. Functional
operating conditions are given in 9.4 D.C.
Specifications and 9.5 A.C. Specifications.

Extended exposure to the Maximum Ratings may
affect device reliability. Furthermore, although the 386
DX contains protective circuitry to resist damage
from static electric discharge, always take precautions
to avoid high static voltages or electric fields.


9.4 D.C. SPECIFICATIONS
Functional Operating Range: Vcc = 5V ±5%; Tcase = 0°C to 85°C


Table 9-3. 386™ DX D.C. Characteristics

386™ DX 20 MHz, 25 MHz, 33 MHz

Symbol  Parameter                               Min        Max        Unit  Test Conditions

VIL     Input Low Voltage                       -0.3       0.8        V     (Note 1)
VIH     Input High Voltage                      2.0        Vcc + 0.3  V
VILC    CLK2 Input Low Voltage                  -0.3                        (Note 1)
VIHC    CLK2 Input High Voltage
          20 MHz                                Vcc - 0.8  Vcc + 0.3
          25 MHz and 33 MHz                     3.7        Vcc + 0.3
VOL     Output Low Voltage
          IOL = 4 mA: A2-A31, D0-D31                       0.45
          IOL = 5 mA: BE0#-BE3#, W/R#,                     0.45
            D/C#, M/IO#, LOCK#, ADS#, HLDA
VOH     Output High Voltage
          IOH = 1 mA: A2-A31, D0-D31            2.4
          IOH = 0.9 mA: BE0#-BE3#, W/R#,        2.4
            D/C#, M/IO#, LOCK#, ADS#, HLDA
ILI     Input Leakage Current                              15               0V ≤ VIN ≤ Vcc
          (For All Pins except BS16#, PEREQ,
          BUSY#, and ERROR#)
IIH     Input Leakage Current                                               VIH = 2.4V (Note 2)
          (PEREQ Pin)
IIL     Input Leakage Current                                               VIL = 0.45V (Note 3)
          (BS16#, BUSY#, and ERROR# Pins)
ILO     Output Leakage Current                             15               0.45V ≤ VOUT ≤ Vcc
ICC     Supply Current                                                      Icc Typ. = 460 mA
          CLK2 = 40 MHz: with 20 MHz 386™ DX                                Icc Typ. = 500 mA
          CLK2 = 50 MHz: with 25 MHz 386™ DX                                Icc Typ. = 400 mA
          CLK2 = 66 MHz: with 33 MHz 386™ DX
CIN     Input or I/O Capacitance                                      pF    Fc = 1 MHz (Note 4)
COUT    Output Capacitance                                            pF    Fc = 1 MHz (Note 4)
CCLK    CLK2 Capacitance                                                    Fc = 1 MHz (Note 4)


NOTES:

1. The min value, -0.3, is not 100% tested.
2. PEREQ input has an internal pulldown resistor.
3. BS16#, BUSY# and ERROR# inputs each have an internal pullup resistor.
4. Not 100% tested.
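
Several of the D.C. limits above are simple thresholds that can be checked mechanically. The sketch below encodes the 5V ±5% supply window (4.75V to 5.25V), the 0.8V and 2.0V input thresholds, and the frequency-dependent CLK2 input high minimum (Vcc - 0.8V at 20 MHz, 3.7V at 25 MHz and 33 MHz). The function names are illustrative; this is a reading aid, not a substitute for the table.

```python
VCC_MIN_V = 4.75   # 5V - 5%
VCC_MAX_V = 5.25   # 5V + 5%
VIL_MAX_V = 0.8    # input low voltage, maximum
VIH_MIN_V = 2.0    # input high voltage, minimum


def vcc_in_range(vcc_v: float) -> bool:
    """True when the supply is inside the functional operating range."""
    return VCC_MIN_V <= vcc_v <= VCC_MAX_V


def classify_input(v: float) -> str:
    """Classify a steady input level against VIL max and VIH min."""
    if v <= VIL_MAX_V:
        return "low"
    if v >= VIH_MIN_V:
        return "high"
    return "undefined"


def clk2_vih_min(operating_mhz: int, vcc_v: float) -> float:
    """Minimum CLK2 input high voltage for the given speed grade."""
    if operating_mhz == 20:
        return vcc_v - 0.8
    if operating_mhz in (25, 33):
        return 3.7
    raise ValueError("speed grade not covered by Table 9-3")
```

Levels between 0.8V and 2.0V are reported as "undefined" because the table guarantees neither logic state there.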
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9.5 A.C. SPECIFICATIONS 


9.5.1 A.C. Spec Definitions

The A.C. specifications, given in Tables 9-4, 9-5, and
9-6, consist of output delays, input setup requirements
and input hold requirements. All A.C. specifications
are relative to the CLK2 rising edge crossing
the 2.0V level.


A.C. spec measurement is defined by Figure 9-1. Inputs
must be driven to the voltage levels indicated
by Figure 9-1 when A.C. specifications are measured.
386 DX output delays are specified with minimum
and maximum limits, measured as shown. The
minimum 386 DX delay times are hold times
provided to external circuitry. 386 DX input setup
and hold times are specified as minimums, defining
the smallest acceptable sampling window. Within
the sampling window, a synchronous input signal
must be stable for correct 386 DX operation.


Outputs ADS#, W/R#, D/C#, M/IO#, LOCK#,
BE0#-BE3#, A2-A31 and HLDA only change at
the beginning of phase one. D0-D31 (write cycles)
only change at the beginning of phase two. The
READY#, HOLD, BUSY#, ERROR#, PEREQ and
D0-D31 (read cycles) inputs are sampled at the
beginning of phase one. The NA#, BS16#, INTR and
NMI inputs are sampled at the beginning of phase
two.


[Figure 9-1 waveforms, referenced to CLK2:
  OUTPUTS (A2-A31, D/C#, BE0#-BE3#, ADS#, M/IO#, W/R#, LOCK#, HLDA)
  OUTPUTS (D0-D31)
  INPUTS (NA#, BS16#, INTR, NMI)
  INPUTS (READY#, HOLD, BUSY#, ERROR#, PEREQ, D0-D31)
Output measurement level: 1.5V.]

LEGEND:
  Maximum output delay spec.
  Minimum output delay spec.
  Minimum input setup spec.
  Minimum input hold spec.

231630-37


NOTES:
1. Input waveforms have tr ≤ 2.0 ns from 0.8V to 2.0V.
2. See section 9.5.8 for typical output rise time versus load capacitance.


Figure 9-1. Drive Levels and Measurement Points for A.C. Specifications
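
The sampling window described in 9.5.1 runs from one setup time before the sampling CLK2 edge to one hold time after it, and a synchronous input must not change inside that interval. The sketch below expresses that rule. The function names and the representation of an input as a list of transition times are assumptions made for this illustration, and the 12 ns and 4 ns figures in the usage note are example values only.

```python
def sampling_window(edge_ns: float, setup_ns: float, hold_ns: float):
    """Return (start, end) of the interval in which the input must be stable."""
    return (edge_ns - setup_ns, edge_ns + hold_ns)


def is_stable_through_window(transitions_ns, edge_ns: float,
                             setup_ns: float, hold_ns: float) -> bool:
    """True if no input transition falls strictly inside the sampling window."""
    start, end = sampling_window(edge_ns, setup_ns, hold_ns)
    return not any(start < t < end for t in transitions_ns)
```

With a sampling edge at 100 ns, a 12 ns setup and a 4 ns hold, the window is 88 ns to 104 ns; an input that changes at 80 ns and again at 110 ns is sampled correctly, while one that changes at 90 ns is not.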
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Functional Operating Range: Vcc = 5V +5%; Tcase = 0°C to + 85°C 


Table 9-4. 33 MHz 386™ DX A.C. Characteristics


Symbol  Parameter                               Min

        Operating Frequency
t1      CLK2 Period
t2a     CLK2 High Time
t2b     CLK2 High Time
t3a     CLK2 Low Time                           6.25
t3b     CLK2 Low Time
t4      CLK2 Fall Time
t5      CLK2 Rise Time
t6      A2-A31 Valid Delay
t7      A2-A31 Float Delay
        BE0#-BE3#, LOCK# Valid Delay
        BE0#-BE3#, LOCK# Float Delay
t10     W/R#, M/IO#, D/C# Valid Delay
t10a    ADS# Valid Delay
t11     W/R#, M/IO#, D/C#, ADS# Float Delay
t20     READY# Hold Time
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9.5.2 A.C. Specification Tables (Continued) 
Functional Operating Range: Vcc = 5V +5%; Tcase = 0°C to + 85°C 


Table 9-4. 33 MHz 386™ DX A.C. Characteristics (Continued)


33 MHz 386™ DX

Parameter

PEREQ, ERROR#, BUSY# Setup Time
PEREQ, ERROR#, BUSY# Hold Time
NMI, INTR Hold Time


NOTES:

1. Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not 100%
   tested.
2. These inputs are allowed to be asynchronous to CLK2. The setup and hold specifications are given for testing purposes,
   to assure recognition within a specific CLK2 period.
3. Rise and fall times are not tested.
4. Min. time not 100% tested.
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9.5.2 A.C. Specification Tables (Continued) 
Functional Operating Range: Vcc = 5V ±5%; Tcase = 0°C to +85°C


Table 9-5. 25 MHz 386™ DX A.C. Characteristics


25 MHz 386™ DX

Parameter

Operating Frequency
CLK2 Period
CLK2 [illegible] Time
W/R#, M/IO#, D/C#, ADS# Valid Delay
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9.5.2 A. C. Specification Tables (Continued) 
Functional Operating Range: Vcc = 5V ±5%; Tcase = 0°C to +85°C


Table 9-5. 25 MHz 386™ DX A.C. Characteristics (Continued)


25 MHz 386™ DX

Symbol  Parameter

t21     D0-D31 Read Setup Time
t22     D0-D31 Read Hold Time
t23     HOLD Setup Time
t24     HOLD Hold Time
t25     RESET Setup Time
t26     RESET Hold Time
t27     NMI, INTR Setup Time
t28     NMI, INTR Hold Time
t29     PEREQ, ERROR#, BUSY# Setup Time
t30     PEREQ, ERROR#, BUSY# Hold Time


NOTES:

1. Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not 100%
   tested.
2. These inputs are allowed to be asynchronous to CLK2. The setup and hold specifications are given for testing purposes,
   to assure recognition within a specific CLK2 period.
3.                Symbol  Parameter                         Min
   Tc = 0°C       t30     PEREQ, ERROR#, BUSY# Hold Time    4
   Tc = +85°C     t30     PEREQ, ERROR#, BUSY# Hold Time    5
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9.5.2 A.C. Specification Tables (Continued) 
Functional Operating Range: Vcc = 5V +5%; Tcase = 0°C to + 85°C 


Table 9-6. 20 MHz 386™ DX A.C. Characteristics


20 MHz 386™ DX

Symbol  Parameter                               Min    Max     Ref.      Test Conditions

        Operating Frequency                     4      20 MHz            Half of CLK2 Frequency
t1      CLK2 Period                                            Fig. 9-3
t2a     CLK2 High Time                                                   at 2V
t2b     CLK2 High Time                                                   at (Vcc - 0.8V)
t3a     CLK2 Low Time                                                    at 2V
t3b     CLK2 Low Time                                                    at 0.8V
t4      CLK2 Fall Time                                                   (Vcc - 0.8V) to 0.8V
t5      CLK2 Rise Time                                                   0.8V to (Vcc - 0.8V)
t6      A2-A31 Valid Delay                                               CL = 120 pF
t7      A2-A31 Float Delay                                               (Note 1)
t8      BE0#-BE3#, LOCK# Valid Delay                                     CL = 75 pF
t9      BE0#-BE3#, LOCK# Float Delay                                     (Note 1)
t10     W/R#, M/IO#, D/C#, ADS# Valid Delay
t11     W/R#, M/IO#, D/C#, ADS# Float Delay
t12     D0-D31 Write Data Valid Delay                          Fig. 9-5c CL = 120 pF
t13     D0-D31 Float Delay                                               (Note 1)
t14     HLDA Valid Delay                                                 CL = 75 pF
t15     NA# Setup Time
t16     NA# Hold Time
t17     BS16# Setup Time                        13
t18     BS16# Hold Time
t19     READY# Setup Time                       12
t20     READY# Hold Time                        4
t21     D0-D31 Read Setup Time
t22     D0-D31 Read Hold Time
t23     HOLD Setup Time
t24     HOLD Hold Time
t25     RESET Setup Time                        12
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9.5.2 A.C. Specification Tables (Continued) 
Functional Operating Range: Vcc = 5V +5%; Tcase = 0°C to + 85°C 


Table 9-6. 20 MHz 386™ DX A.C. Characteristics (Continued) 


20 MHz 386™ DX

Symbol  Parameter                           Ref.      Test Conditions

t26     RESET Hold Time
t27     NMI, INTR Setup Time                          (Note 2)
t28     NMI, INTR Hold Time                           (Note 2)
t29     PEREQ, ERROR#, BUSY# Setup Time     Fig. 9-4  (Note 2)
t30     PEREQ, ERROR#, BUSY# Hold Time      Fig. 9-4  (Note 2)


NOTES:

1. Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not 100%
   tested.
2. These inputs are allowed to be asynchronous to CLK2. The setup and hold specifications are given for testing purposes,
   to assure recognition within a specific CLK2 period.

~ 
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9.5.3 A.C. Test Loads 9.5.4 A.C. Timing Waveforms 


[Figure 9-2 drawing: 386 DX CPU output driving load capacitance CL.]

231630-38
CL = 120 pF on A2-A31, D0-D31
CL = 75 pF on BE0#-BE3#, W/R#, M/IO#, D/C#, ADS#,
LOCK#, HLDA
CL includes all parasitic capacitances.

Figure 9-2. A.C. Test Load


[Figure 9-3 waveform: CLK2.]

231630-39

Figure 9-3. CLK2 Timing
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Figure 9-4. Input Setup and Hold Timing 


[Figure 9-5 waveforms, referenced to CLK2: BE0#-BE3#, LOCK#;
W/R#, M/IO#, D/C#, ADS#; and A2-A31, each changing from VALID n
to VALID n+1 between the MIN and MAX delay points.]

231630-41


Figure 9-5. Output Valid Delay Timing


[Figures 9-5a and 9-5b waveforms: D0-D31 write data, VALID n.]

231630-79                                    231630-80
Figure 9-5a. Write Data Valid Delay Timing   Figure 9-5b. Write Data Hold Timing
(25 MHz, 33 MHz)                             (25 MHz, 33 MHz)


[Figure 9-5c waveform: D0-D31, VALID n to VALID n+1.]

231630-81


Figure 9-5c. Write Data Valid Delay Timing (20 MHz)
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9.5.5 Typical Output Valid Delay Versus Load Capacitance 
at Maximum Operating Temperature (CL = 120 pF)


[Graph: output valid delay (ns) versus CL (picofarads).]

231630-77


NOTE:
This graph will not be linear outside of the CL range shown.


9.5.6 Typical Output Valid Delay Versus Load Capacitance 
at Maximum Operating Temperature (CL = 75 pF)


[Graph: output valid delay (ns) versus CL (picofarads); CL axis
marked at 75, 100 and 125.]

231630-82


NOTE:
This graph will not be linear outside of the CL range shown.
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9.5.7 Typical Output Valid Delay Versus Load Capacitance 
at Maximum Operating Temperature (CL = 50 pF)

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75 100 125 
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| 231630-83 


NOTE: 
This graph will not be linear outside of the CL range shown.


9.5.8 Typical Output Rise Time Versus Load Capacitance
at Maximum Operating Temperature


[Graph: rise time (ns), 0.8V to 2.0V, versus CL (picofarads).]

231630-78


NOTE:
This graph will not be linear outside of the CL range shown.
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Ti OR T1 


[Figure 9-6 waveforms: BE0#-BE3#, LOCK#; W/R#, M/IO#, D/C#, ADS#;
and D0-D31 going to high Z between the MIN and MAX float delay
points. Also applies to data float when a write cycle is followed
by a read or idle cycle.]

231630-42


Figure 9-6. Output Float Delay and HLDA Valid Delay Timing


[Figure 9-7 waveform: RESET followed by the initialization sequence,
with CLK2 phases marked φ2 or φ1.]

231630-43


The second internal processor phase following RESET high-to-low transition (provided t25 and t26 are met) is φ2.


Figure 9-7. RESET Setup and Hold Timing, and Internal Phase
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10.0 Revision History 


This 386 DX data sheet, version -005, contains updates and improvements to previous versions. A revision 
summary is listed here for your convenience. 


The sections significantly revised since version -001 are: 


2.9.6                   Sequence of exception checking table added.
2.9.7                   Instruction restart revised.
2.11.2                  TLB testing revised.
2.12                    Debugging support revised.
3.1                     LOCK prefix restricted to certain instructions.
4.4.3.3                 I/O privilege level and I/O permission bitmap added.
Figures 4-15a, 4-15b    I/O permission bitmap added.
4.6.4                   Protection and I/O permission bitmap revised.
4.6.6                   Entering and leaving virtual 8086 mode through task switches, trap and interrupt
                        gates, and IRET explained.
5.6                     Self-test signature stored in EAX.
5.8                     Coprocessor interface description added.
5.8.1                   Software testing for coprocessor presence added.
Table 6-3               PGA package thermal characteristics added.
7.                      Designing for ICE-386 revised.
Figures 7-8, 7-9, 7-10  ICE-386 clearance requirements added.
6.2.3.4                 Encoding of 32-bit address mode with no "sib" byte corrected.


The sections significantly revised since version -002 are: 


Table 2-5               Interrupt vector assignments updated.
Figure 4-15a            Bit_map_offset must be less than or equal to DFFFH.
Figure 5-28             386 DX outputs remain in their reset state during self-test.
5.7                     Component and revision identifier history updated.
9.4                     20 MHz D.C. specifications added.
9.5                     16 MHz A.C. specifications updated. 20 MHz A.C. specifications added.
Table 6-1               Clock counts updated.


The sections significantly revised since version -003 are: 


Table 2-6b              Interrupt priorities 2 and 3 interchanged.
2.9.8                   Double page faults do not raise double fault exception.
Figure 4-5              Maximum-sized segments must have segments Base11..0 = 0.
5.4.3.4                 BS16# timing corrected.
Figures 5-16, 5-17,     BS16# timing corrected. BS16# must not be asserted once NA# has been
5-19, 5-22              sampled asserted in the current bus cycle.
9.5                     16 MHz and 20 MHz A.C. specifications revised. All timing parameters are now
                        guaranteed at 1.5V test levels. The timing parameters have been adjusted to
                        remain compatible with previous 0.8V/2.0V specifications.
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The sections significantly revised since version -004 are: 


Chapter 4               25 MHz Clock data included.
Table 2-4               Segment Register Selection Rules updated.
5.4.4                   Interrupt Acknowledge Cycles discussion corrected.
Table 5-10              Additional Stepping Information added.
Table 9-3               Icc values updated.
9.5.2                   Table for 25 MHz A.C. Characteristics added. A.C. Characteristics tables reordered.
Figure 9-5              Output Valid Delay Timing Figure reconfigured. Partial data now provided in
                        additional Figures 9-5a and 9-5b.
Table 6-1               Clock counts updated and formats corrected.


The sections significantly revised since version -005 are: 


Table of Contents       Simplified.
Chapter 1               Pin Assignment.
2.3.6                   Control Register 0.
Table 2-4               Segment override prefixes possible.
Figure 4-6              Note added.
Figure 4-7              Note added.
5.2.3                   Data bus state at end of cycle.
5.2.8.4                 Coprocessor error.
5.5.3                   Bus activity during and following reset.
Figure 5-28             ERROR#.
Chapter 6               Moved forward in datasheet.
Chapter 7               Moved forward in datasheet.
Chapter 8               Upgraded to chapter.
Table 9-3               25 MHz Icc Typ. value corrected.
Table 9-3               33 MHz D.C. Specifications added.
Table 9-4               33 MHz A.C. Specifications added.
Figure 9-5              t8a and t10a added.
Figure 9-5c             Added.
9.5.6                   Added derating for CL = 75 pF.
9.5.7                   Added derating for CL = 50 pF.
Figure 9-6              t8a and t10a added.


The sections significantly revised since version -006 are: 


2.3.4                   Alignment of maximum sized segments.
2.9.8                   Double page faults do not raise double fault exception.
5.5.3                   ERROR# and BUSY# sampling after RESET.
Figure 5-21             BS16# timing altered.
Figure 5-26             READY# timing altered.
Figure 5-28             ERROR# timing corrected.
6.2.3.1                 Corrected Encoding of Register Field Chart.
Chapter 7               Updated ICE-386 DX information.
9.5.2                   Remove preliminary stamp on 25 MHz A.C. Specifications.
9.5.2                   Remove preliminary stamp on 33 MHz A.C. Specifications.
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The sections significantly revised since version -007 are: 


Table of Contents       Page numbers revised.
Figure 5-15             BS16# timing altered.
Figure 5-22             Previous cycle, T2 changed to Idle cycle, Ti.
6.1                     Note about wait states added.
Table 6-1               Opcodes for AND, OR, and XOR instructions corrected.
Table 6-1               Bits 3, 4, and 5 of the "mod r/m" byte corrected for the LTR instruction.
Table 8-2               Reference to Figure 6-4 should be reference Figure 8-2.
Table 8-2               Note #4 added.
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MATH COPROCESSOR 

■ High Performance 80-Bit Internal Architecture

■ Implements ANSI/IEEE Standard 754-1985 for Binary Floating-Point
  Arithmetic

■ Six to Eleven Times 8087/80287 Performance

■ Expands 386™ DX CPU Data Types to Include 32-, 64-, 80-Bit Floating
  Point, 32-, 64-Bit Integers and 18-Digit BCD Operands

■ Directly Extends 386™ DX CPU Instruction Set to Include
  Trigonometric, Logarithmic, Exponential and Arithmetic Instructions
  for All Data Types

■ Upward Object-Code Compatible from 8087 and 80287

■ Full-Range Transcendental Operations for SINE, COSINE, TANGENT,
  ARCTANGENT and LOGARITHM

■ Built-In Exception Handling

■ Operates Independently of Real, Protected and Virtual-8086 Modes of
  the 386™ DX Microprocessor

■ Eight 80-Bit Numeric Registers, Usable as Individually Addressable
  General Registers or as a Register Stack

■ Available in 68-Pin PGA Package
  (See Packaging Spec: Order #231369)


The Intel 387™ DX Math CoProcessor (MCP) is an extension to the Intel 386™ microprocessor architecture.
The combination of the 387 DX with the 386™ DX Microprocessor dramatically increases the processing
speed of computer application software that uses mathematical operations. This makes an ideal computer
workstation platform for applications such as financial modeling and spreadsheets, CAD/CAM, or graphics.


The 387 DX Math CoProcessor adds over seventy mnemonics to the 386 DX Microprocessor instruction set.
Specific 387 DX math operations include logarithmic, arithmetic, exponential, and trigonometric functions.
The 387 DX supports integer, extended integer, floating point and BCD data formats, and fully conforms to the
ANSI/IEEE floating point standard.

The 387 DX Math CoProcessor is object code compatible with the 80387SX, and upward object code compatible
from the 80287 and 8087 math coprocessors. Object code for 386 DX/387 DX is also compatible with the
Intel 486™ microprocessor. The 387 DX is manufactured on 1 micron, CHMOS IV technology and packaged
in a 68-pin PGA package.


[Block diagram: BUS CONTROL LOGIC; DATA INTERFACE AND CONTROL UNIT; FLOATING POINT UNIT]
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CONTROL WORD s@iu 
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; 240448-1 
Figure 0.1. 387™ DX Math Coprocessor Block Diagram
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386T™ DX Microprocessor Registers 


[Register diagram: 386™ DX Microprocessor general registers (bits 31-0)
and segment registers (bits 15-0); 387™ DX MCP data registers, each
with Sign and Significand fields and an associated Tag Field;
Control Register, Status Register and Tag Word; Instruction Pointer
and Data Pointer (in 386™ DX CPU).]


Figure 1.1. 386™ DX Microprocessor and 387™ DX Math Coprocessor Register Set


1.0 FUNCTIONAL DESCRIPTION 


The 387™ DX Math Coprocessor provides arithme- 
tic instructions for a variety of numeric data types in 
386™ DX Microprocessor systems. It also executes 
numerous built-in transcendental functions (e.g. tan- 
gent, sine, cosine, and log functions). The 387 DX 
MCP effectively extends the register and instruction 
set of a 386 DX Microprocessor system for existing 
data types and adds several new data types as well. 
Figure 1.1 shows the model of registers visible to
programs. Essentially, the 387 DX MCP can be treat- 
ed as an additional resource or an extension to the 
386 DX Microprocessor. The 386 DX Microproces- 
sor together with a 387 DX MCP can be used as a 
single unified system. 


The 387 DX MCP works the same whether the 386 
DX Microprocessor is executing in real-address 
mode, protected mode, or virtual-8086 mode. All 
memory access is handled by the 386 DX Micro- 
processor; the 387 DX MCP merely operates on in- 
structions and values passed to it by the 386 DX 
Microprocessor. Therefore, the 387 DX MCP is not
sensitive to the processing mode of the 386 DX
Microprocessor.


In real-address mode and virtual-8086 mode, the 
386 DX Microprocessor and 387 DX MCP are com- 
pletely upward compatible with software for 
8086/8087, 80286/80287 real-address mode, and
386 DX Microprocessor and 80287 Coprocessor
real-address mode systems.


In protected mode, the 386 DX Microprocessor and 
387 DX MCP are completely upward compatible with 
software for 80286/80287 protected mode, and 386 
DX Microprocessor and 80287 Coprocessor protect- 
ed mode systems. 


The only difference of operation that may appear
when 8086/8087 programs are ported to a
protected-mode 386 DX Microprocessor and 387 DX MCP
system (not using virtual-8086 mode) is in the
format of operands for the administrative instructions
FLDENV, FSTENV, FRSTOR and FSAVE. These
instructions are normally used only by exception
handlers and operating systems, not by applications
programs.


The 387 DX MCP contains three functional units that
can operate in parallel to increase system performance.
The 386 DX Microprocessor can be transferring
commands and data to the MCP bus control
logic for the next instruction while the MCP
floating-point unit is performing the current numeric
instruction.
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2.0 PROGRAMMING INTERFACE


The MCP adds to the 386 DX Microprocessor sys- 
tem additional data types, registers, instructions, and 
interrupts specifically designed to facilitate high- 
speed numerics processing. To use the MCP re- 
quires no special programming tools, because all 
new instructions and data types are directly support- 
ed by the 386 DX CPU assembler and compilers for 
high-level languages. All 8086/8088 development 
tools that support the 8087 can also be used to de- 
velop software for the 386 DX Microprocessor and 
387 DX Math Coprocessor in real-address mode or 
virtual-8086 mode. All 80286 development tools that 
support the 80287 can also be used to develop software
for the 386 DX Microprocessor and 387 DX
Math Coprocessor.


All communication between the 386 DX Microproc- 
essor and the MCP is transparent to applications 
software. The CPU automatically controls the MCP 
whenever a numerics instruction is executed. All 
physical memory and virtual memory of the CPU are 
available for storage of the instructions and oper- 
ands of programs that use the MCP. All memory ad- 
dressing modes, including use of displacement, 
base register, index register, and scaling, are avail- 
able for addressing numerics operands. 


Section 6 at the end of this data sheet lists by class 
the instructions that the MCP adds to the instruction 
set of the 386 DX Microprocessor system. 
2.1 Data Types


Table 2.1 lists the seven data types that the 387 DX
MCP supports and presents the format for each
type. Operands are stored in memory with the least
significant digit at the lowest memory address.
Programs retrieve these values by generating the
lowest address. For maximum system performance, all
operands should start at physical-memory addresses
evenly divisible by four (doubleword boundaries);
operands may begin at any other addresses, but will
require extra memory cycles to access the entire
operand.


Internally, the 387 DX MCP holds all numbers in the 
extended-precision real format. Instructions that 
load operands from memory automatically convert 
operands represented in memory as 16-, 32-, or 64-bit
integers, 32- or 64-bit floating-point numbers, or
18-digit packed BCD numbers into extended-preci- 
sion real format. Instructions that store operands in 
memory perform the inverse type conversion. 
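The packed BCD conversion can be illustrated in software. The following is a minimal sketch, not the MCP's own algorithm; the function names are hypothetical, and the layout it assumes (18 digits in the nine low-addressed bytes, two digits per byte, sign in bit 7 of the highest-addressed byte) follows Table 2.1.

```python
def encode_packed_bcd(value: int) -> bytes:
    """Build the 10-byte packed BCD memory image of an integer (hypothetical helper)."""
    if abs(value) >= 10**18:
        raise ValueError("packed BCD holds at most 18 decimal digits")
    digits = abs(value)
    image = bytearray(10)
    # Bytes 0..8 hold digits d0..d17, least significant pair at the lowest address.
    for i in range(9):
        digits, pair = divmod(digits, 100)
        image[i] = ((pair // 10) << 4) | (pair % 10)
    # Bit 7 of the highest-addressed byte is the sign; the remaining bits are stored as zero.
    image[9] = 0x80 if value < 0 else 0x00
    return bytes(image)


def decode_packed_bcd(image: bytes) -> int:
    """Recover the integer from a 10-byte packed BCD memory image."""
    if len(image) != 10:
        raise ValueError("packed BCD operands are 10 bytes long")
    value = 0
    for byte in reversed(image[:9]):
        high, low = byte >> 4, byte & 0x0F
        if high > 9 or low > 9:
            raise ValueError("invalid BCD digit")
        value = value * 100 + high * 10 + low
    # Only the sign bit of the top byte is significant when loading.
    return -value if image[9] & 0x80 else value
```

For example, the value -1234 occupies bytes 34H, 12H at the two lowest addresses and 80H at the highest address.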


2.2 Numeric Operands 


A typical MCP instruction accepts one or two oper- 
ands and produces a single result. In two-operand 
instructions, one operand is the contents of an MCP 
register, while the other may be a memory location. 
The operands of some instructions are predefined; 
for example FSQRT always takes the square root of 
the number in the top stack element. 
Table 2.1. 387™ DX MCP Data Type Representation in Memory

(Most significant byte = highest addressed byte)

Data Format         Range       Precision  Layout (bit numbers, most significant first)
Word Integer        ±10^4       16 bits    15..0 two's complement
Short Integer       ±10^9       32 bits    31..0 two's complement
Long Integer        ±10^18      64 bits    63..0 two's complement
Packed BCD          ±10^18      18 digits  79 S; 78..72 X; 71..0 magnitude d17..d0
Single Precision    ±10^±38     24 bits    31 S; 30..23 biased exponent; 22..0 significand
Double Precision    ±10^±308    53 bits    63 S; 62..52 biased exponent; 51..0 significand
Extended Precision  ±10^±4932   64 bits    79 S; 78..64 biased exponent; 63 I; 62..0 significand

NOTES:
(1) S = Sign bit (0 = positive, 1 = negative)
(2) dn = Decimal digit (two per byte)
(3) X = Bits have no significance; 387™ DX MCP ignores when loading, zeros when storing
(4) ▲ = Position of implicit binary point (immediately to the right of the integer bit)
(5) I = Integer bit of significand; stored in temporary real, implicit in single and double precision
(6) Exponent Bias (normalized values):
    Single: 127 (7FH)
    Double: 1023 (3FFH)
    Extended Real: 16383 (3FFFH)
(7) Packed BCD: (-1)^S (D17...D0)
(8) Real: (-1)^S (2^(E-BIAS)) (F0.F1...)
NOTE:
The index i of tag(i) is not top-relative. A program typically uses the "top" field of Status Word to determine which tag(i)
field refers to logical top of stack.

TAG VALUES:
00 = Valid
01 = Zero
10 = QNaN, SNaN, Infinity, Denormal and Unsupported Formats
11 = Empty


Figure 2.1. 387™ DX MCP Tag Word


2.3 Register Set 


Figure 1.1 shows the 387 DX MCP register set. 
When an MCP is present in a system, programmers 
may use these registers in addition to the registers 
normally available on the 386 DX CPU. 


2.3.1 DATA REGISTERS 

387 DX MCP computations use the MCP's data registers.
These eight 80-bit registers provide the equivalent
capacity of twenty 32-bit registers. Each of the
eight data registers in the MCP is 80 bits wide and is
divided into "fields" corresponding to the MCP's
extended-precision real data type.


The 387 DX MCP register set can be accessed either
as a stack, with instructions operating on the
top one or two stack elements, or as a fixed register
set, with instructions operating on explicitly designated
registers. The TOP field in the status word identifies
the current top-of-stack register. A "push" operation
decrements TOP by one and loads a value into
the new top register. A "pop" operation stores the
value from the current top register and then increments
TOP by one. Like the 386 DX Microprocessor
stacks in memory, the MCP register stack grows
"down" toward lower-addressed registers.


Instructions may address the data registers either
implicitly or explicitly. Many instructions operate on
the register at the TOP of the stack. These instructions
implicitly address the register at which TOP
points. Other instructions allow the programmer to
explicitly specify which register to use. This explicit
register addressing is also relative to TOP.
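The push, pop and TOP-relative addressing rules above can be modeled in a few lines. This is a simplified sketch rather than a description of the hardware: the class and method names are hypothetical, empty registers are represented by None, and stack faults are reported as Python exceptions instead of the MCP's invalid-operation exception.

```python
class RegisterStack:
    """Toy model of the eight-register stack addressed relative to TOP."""

    def __init__(self):
        self.regs = [None] * 8   # None stands for an empty register
        self.top = 0             # assumed starting value of the TOP field

    def push(self, value):
        new_top = (self.top - 1) & 7        # push decrements TOP (modulo 8)
        if self.regs[new_top] is not None:
            raise OverflowError("stack overflow: target register is not empty")
        self.top = new_top
        self.regs[self.top] = value

    def pop(self):
        value = self.regs[self.top]
        if value is None:
            raise IndexError("stack underflow: top register is empty")
        self.regs[self.top] = None
        self.top = (self.top + 1) & 7       # pop increments TOP (modulo 8)
        return value

    def st(self, i):
        """Return ST(i), the register i positions away from the current top."""
        return self.regs[(self.top + i) & 7]
```

After two pushes starting from TOP = 0, TOP is 6, ST(0) is the value pushed last and ST(1) is the value pushed first.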


2.3.2 TAG WORD 


The tag word marks the content of each numeric
data register, as Figure 2.1 shows. Each two-bit tag
represents one of the eight numerics registers. The
principal function of the tag word is to optimize the
MCP's performance and stack handling by making it
possible to distinguish between empty and nonempty
register locations. It also enables exception handlers
to check the contents of a stack location without
the need to perform complex decoding of the
actual data.
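An exception handler might look up the tag of a stack-relative register as sketched below. The helper names are hypothetical, and the sketch assumes that tag(i) occupies bits 2i+1..2i of the tag word (tag(0) in the two low-order bits) and that TOP is in bits 13-11 of the status word.

```python
TAG_NAMES = {0b00: "valid", 0b01: "zero", 0b10: "special", 0b11: "empty"}


def tag_of_physical(tag_word: int, i: int) -> int:
    """Return the two-bit tag of physical register i (not top-relative)."""
    return (tag_word >> (2 * i)) & 0b11


def tag_of_st(tag_word: int, status_word: int, i: int) -> int:
    """Return the tag of ST(i) by translating through the TOP field."""
    top = (status_word >> 11) & 0b111
    return tag_of_physical(tag_word, (top + i) & 7)
```

With a tag word of 3FFFH (only physical register 7 non-empty) and TOP = 7, ST(0) is tagged valid and ST(1), which wraps around to physical register 0, is tagged empty.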
Bit(s)   Field   Meaning
15       B       MCP busy
14       C3      Condition code
13-11    TOP     Top of stack pointer
10       C2      Condition code
9        C1      Condition code
8        C0      Condition code
7        ES      Error summary status
6        SF      Stack flag
5        PE      Exception flag: precision
4        UE      Exception flag: underflow
3        OE      Exception flag: overflow
2        ZE      Exception flag: zero divide
1        DE      Exception flag: denormalized operand
0        IE      Exception flag: invalid operation

ES is set if any unmasked exception bit is set; cleared otherwise.

See Table 2.2 for interpretation of condition code.

TOP values:
000 = Register 0 is Top of Stack
001 = Register 1 is Top of Stack
 ...
111 = Register 7 is Top of Stack

For definitions of exceptions, refer to the section entitled
"Exception Handling".


Figure 2.2. MCP Status Word


2.3.3 STATUS WORD 


The 16-bit status word (in the status register) shown 
in Figure 2.2 reflects the overall state of the MCP. It 
may be read and inspected by CPU code. 


Bit 15, the B-bit (busy bit), is included for 8087
compatibility only. It reflects the contents of the ES bit
(bit 7 of the status word), not the status of the
BUSY# output of the 387 DX MCP.


Bits 13-11 (TOP) point to the 387 DX MCP register 
that is the current top-of-stack. 


The four numeric condition code bits (C3-C0) are
similar to the flags in a CPU; instructions that perform
arithmetic operations update these bits to reflect
the outcome. The effects of these instructions
on the condition code are summarized in Tables 2.2
through 2.5.


Bit 7 is the error summary (ES) status bit. This bit is 
set if any unmasked exception bit is set; it is clear 
otherwise. If this bit is set, the ERROR# signal is 
asserted. 


Bit 6 is the stack flag (SF). This bit is used to distinguish
invalid operations due to stack overflow or underflow
from other kinds of invalid operations. When
SF is set, bit 9 (C1) distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0).


Figure 2.2 shows the six exception flags in bits 5-0
of the status word. Bits 5-0 are set to indicate that
the MCP has detected an exception while executing
an instruction. A later section entitled "Exception
Handling" explains how they are set and used.


Note that when a new value is loaded into the status
word by the FLDENV or FRSTOR instruction, the
value of ES (bit 7) and its reflection in the B-bit (bit
15) are not derived from the values loaded from
memory but rather are dependent upon the values of
the exception flags (bits 5-0) in the status word and
their corresponding masks in the control word. If ES
is set in such a case, the ERROR# output of the
MCP is activated immediately.
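The field layout and the ES rule described in this section can be expressed as a short software model. This is an illustrative sketch only; the function names are hypothetical, and the positions of C3, C2 and C0 (bits 14, 10 and 8) are assumed from the status word figure rather than stated in the text above.

```python
def decode_status(sw: int) -> dict:
    """Split a 16-bit status word into its fields."""
    return {
        "b": (sw >> 15) & 1,      # busy bit, mirrors ES
        "c3": (sw >> 14) & 1,
        "top": (sw >> 11) & 0b111,
        "c2": (sw >> 10) & 1,
        "c1": (sw >> 9) & 1,
        "c0": (sw >> 8) & 1,
        "es": (sw >> 7) & 1,      # error summary
        "sf": (sw >> 6) & 1,      # stack flag
        "flags": sw & 0x3F,       # PE, UE, OE, ZE, DE, IE in bits 5..0
    }


def status_after_load(loaded: int, control_word: int) -> int:
    """Model FLDENV/FRSTOR: ES and B are recomputed, not taken from memory."""
    unmasked = loaded & ~control_word & 0x3F   # flags whose mask bit is clear
    sw = loaded & 0x7F7F                       # discard the loaded B and ES bits
    if unmasked:
        sw |= 0x8080                           # set ES (bit 7) and B (bit 15)
    return sw
```

Loading 8081H while every exception is masked yields 0001H (ES and B cleared); loading 0004H while zero-divide is unmasked yields 8084H.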
Table 2.2. Condition Code Interpretation

Instruction                C0              C3              C1                 C2
-------------------------  --------------  --------------  -----------------  ----------------
FPREM, FPREM1              Q2              Q1              Q0 or O/U#         Reduction:
(see Table 2.3)            (three least significant bits of quotient)         0 = complete
                                                                              1 = incomplete

FCOM, FCOMP, FCOMPP,       Result of comparison            Zero or O/U#       Operand is not
FTST, FUCOM, FUCOMP,       (see Table 2.4)                                    comparable
FUCOMPP, FICOM, FICOMP                                                        (Table 2.4)

FXAM                       Operand class                   Sign or O/U#       Operand class
                           (see Table 2.5)                                    (Table 2.5)

FCHS, FABS, FXCH,          UNDEFINED                       Zero or O/U#       UNDEFINED
FINCSTP, FDECSTP,
Constant loads,
FXTRACT, FLD, FILD,
FBLD, FSTP (ext real)

FIST, FBSTP, FRNDINT,      UNDEFINED                       Roundup or O/U#    UNDEFINED
FST, FSTP, FADD, FMUL,
FDIV, FDIVR, FSUB,
FSUBR, FSCALE, FSQRT,
FPATAN, F2XM1, FYL2X,
FYL2XP1

FPTAN, FSIN, FCOS,         UNDEFINED                       Roundup or O/U#,   Reduction:
FSINCOS                                                    undefined if       0 = complete
                                                           C2 = 1             1 = incomplete

FLDENV, FRSTOR             Each bit loaded from memory

FLDCW, FSTENV, FSTCW,      UNDEFINED
FSTSW, FCLEX, FINIT,
FSAVE

O/U#       When both IE and SF bits of status word are set, indicating a stack exception, this bit
           distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).

Reduction  If FPREM or FPREM1 produces a remainder that is less than the modulus, reduction is
           complete. When reduction is incomplete the value at the top of the stack is a partial
           remainder, which can be used as input to further reduction. For FPTAN, FSIN, FCOS, and
           FSINCOS, the reduction bit is set if the operand at the top of the stack is too large. In this
           case the original operand remains at the top of the stack.

Roundup    When the PE bit of the status word is set, this bit indicates whether the last rounding in the
           instruction was upward.

UNDEFINED  Do not rely on finding any specific value in these bits.
Table 2.3. Condition Code Interpretation after FPREM and FPREM1 Instructions

C2   C3   C1   C0    Interpretation after FPREM and FPREM1
1    X    X    X     Incomplete Reduction: further iteration required
                     for complete reduction
0    Q1   Q0   Q2    Complete Reduction: C0, C3, C1 contain three least
                     significant bits of quotient

C2   C3   C1   C0    Q MOD 8
0    0    0    0     0
0    0    1    0     1
0    1    0    0     2
0    1    1    0     3
0    0    0    1     4
0    0    1    1     5
0    1    0    1     6
0    1    1    1     7
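Software that uses FPREM for argument reduction typically recovers the low quotient bits from a stored status word. The sketch below shows one way to do that; the function name is hypothetical, and it assumes the usual bit positions (C3 = bit 14, C2 = bit 10, C1 = bit 9, C0 = bit 8) with C0 holding Q2, C3 holding Q1 and C1 holding Q0.

```python
def fprem_quotient_mod8(status_word: int):
    """Return Q MOD 8 after FPREM/FPREM1, or None if reduction is incomplete."""
    c0 = (status_word >> 8) & 1
    c1 = (status_word >> 9) & 1
    c2 = (status_word >> 10) & 1
    c3 = (status_word >> 14) & 1
    if c2:
        # Partial remainder only: the instruction must be executed again.
        return None
    # C0 = Q2, C3 = Q1, C1 = Q0
    return (c0 << 2) | (c3 << 1) | c1
```

A caller would loop on the remainder instruction until the function stops returning None, then use the result to select an octant.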


Table 2.4. Condition Code Resulting from Comparison

Order            C3   C2   C0
TOP > Operand    0    0    0
TOP < Operand    0    0    1
TOP = Operand    1    0    0
Unordered        1    1    1


Table 2.5. Condition Code Defining Operand Class

C3   C2   C1   C0    Value at TOP
0    0    0    0     + Unsupported
0    0    0    1     + NaN
0    0    1    0     - Unsupported
0    0    1    1     - NaN
0    1    0    0     + Normal
0    1    0    1     + Infinity
0    1    1    0     - Normal
0    1    1    1     - Infinity
1    0    0    0     + 0
1    0    0    1     + Empty
1    0    1    0     - 0
1    0    1    1     - Empty
1    1    0    0     + Denormal
1    1    1    0     - Denormal


2.3.4 INSTRUCTION AND DATA POINTERS 


Because the MCP operates in parallel with the CPU,
any errors detected by the MCP may be reported
after the CPU has executed the ESC instruction
which caused it. To allow identification of the failing
numeric instruction, the 386 DX Microprocessor and
387 DX Math Coprocessor contain two pointer
registers that supply the address of the failing numeric
instruction and the address of its numeric memory
operand (if appropriate).


The instruction and data pointers are provided for 
user-written error handlers. These registers are ac- 
tually located in the 386 DX CPU, but appear to be 
located in the MCP because they are accessed by 
the ESC instructions FLDENV, FSTENV, FSAVE, 
and FRSTOR. (In the 8086/8087 and 80286/80287, 
these registers are located in the MCP.) Whenever
the 386 DX CPU decodes a new ESC instruction, it
saves the address of the instruction (including any
prefixes that may be present), the address of the
operand (if present), and the opcode.


The instruction and data pointers appear in one of 
four formats depending on the operating mode of 
the 386 DX Microprocessor (protected mode or real- 
address mode) and depending on the operand-size 
attribute in effect (32-bit operand or 16-bit operand). 
When the 386 DX Microprocessor is in virtual-8086
mode, the real-address mode formats are used.
(See Figures 2.3 through 2.6.) The ESC instructions
FLDENV, FSTENV, FSAVE, and FRSTOR are used
to transfer these values between the 386 DX
Microprocessor registers and memory. Note that the value
of the data pointer is undefined if the prior ESC
instruction did not have a memory operand.


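32-BIT PROTECTED MODE FORMAT

Offset   Bits 31..16                     Bits 15..0
0H       RESERVED                        CONTROL WORD
4H       RESERVED                        STATUS WORD
8H       RESERVED                        TAG WORD
CH       IP OFFSET (bits 31..0)
10H      00000, OPCODE 10..0             CS SELECTOR
14H      DATA OPERAND OFFSET (bits 31..0)
18H      RESERVED                        OPERAND SELECTOR


Figure 2.3. Protected Mode 387™ DX MCP Instruction and
Data Pointer Image in Memory, 32-Bit Format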
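32-BIT REAL-ADDRESS MODE FORMAT

Offset   Bits 31..16                                Bits 15..0
0H       RESERVED                                   CONTROL WORD
4H       RESERVED                                   STATUS WORD
8H       RESERVED                                   TAG WORD
CH       RESERVED                                   INSTRUCTION POINTER 15..0
10H      0000 (31..28), INSTRUCTION POINTER 31..16 (27..12), 0 (11), OPCODE 10..0 (10..0)
14H      RESERVED                                   OPERAND POINTER 15..0
18H      0000 (31..28), OPERAND POINTER 31..16 (27..12), 0000 00000000 (11..0)


Figure 2.4. Real Mode 387™ DX MCP Instruction and Data Pointer Image in Memory, 32-Bit Format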


16-BIT PROTECTED MODE FORMAT

Offset   Bits 15..0
0H       CONTROL WORD
2H       STATUS WORD
4H       TAG WORD
6H       IP OFFSET
8H       CS SELECTOR
AH       OPERAND OFFSET
CH       OPERAND SELECTOR

Figure 2.5. Protected Mode 387™ DX MCP Instruction and Data Pointer
Image in Memory, 16-Bit Format


16-BIT REAL-ADDRESS MODE AND VIRTUAL-8086 MODE FORMAT

Offset   Bits 15..0
0H       CONTROL WORD
2H       STATUS WORD
4H       TAG WORD
6H       INSTRUCTION POINTER 15..0
8H       IP 19..16 (15..12), 0 (11), OPCODE 10..0 (10..0)
AH       OPERAND POINTER 15..0
CH       DP 19..16 (15..12), 0 (11), 000 0000 0000 (10..0)

Figure 2.6. Real Mode 387™ DX MCP Instruction and Data Pointer
Image in Memory, 16-Bit Format
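An error handler that has stored the environment with FSTENV can recover the pointers from the memory image. The sketch below handles only the 16-bit real-address mode format; the function name is hypothetical, and the layout it assumes (seven little-endian words, with the upper four bits of the fifth and seventh words carrying address bits 19..16 and the low eleven bits of the fifth word carrying the opcode) should be checked against Figure 2.6.

```python
import struct


def parse_env16_real(image: bytes) -> dict:
    """Decode a 14-byte 16-bit real-address mode environment image."""
    if len(image) < 14:
        raise ValueError("16-bit environment image is 14 bytes long")
    cw, sw, tw, ip_lo, ip_hi_op, dp_lo, dp_hi = struct.unpack("<7H", image[:14])
    return {
        "control_word": cw,
        "status_word": sw,
        "tag_word": tw,
        # 20-bit linear address of the failing ESC instruction
        "instruction_pointer": ((ip_hi_op >> 12) << 16) | ip_lo,
        # 11-bit opcode field of the failing instruction
        "opcode": ip_hi_op & 0x7FF,
        # 20-bit linear address of the memory operand (undefined if none)
        "operand_pointer": ((dp_hi >> 12) << 16) | dp_lo,
    }
```

The protected-mode and 32-bit formats differ in field width and in storing selectors instead of high address bits, so each would need its own decoder.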
Bit(s)   Field
15-13    RESERVED
12       RESERVED*
11-10    ROUNDING CONTROL
9-8      PRECISION CONTROL
7-6      RESERVED
5        EXCEPTION MASK: PRECISION
4        EXCEPTION MASK: UNDERFLOW
3        EXCEPTION MASK: OVERFLOW
2        EXCEPTION MASK: ZERO DIVIDE
1        EXCEPTION MASK: DENORMALIZED OPERAND
0        EXCEPTION MASK: INVALID OPERATION

* "0" AFTER RESET OR FINIT; CHANGEABLE UPON LOADING THE
  CONTROL WORD (CW). PROGRAMS MUST IGNORE THIS BIT.

Precision Control
00 = 24 bits (single precision)
01 = (reserved)
10 = 53 bits (double precision)
11 = 64 bits (extended precision)

Rounding Control
00 = Round to nearest or even
01 = Round down (toward -∞)
10 = Round up (toward +∞)
11 = Chop (truncate toward zero)


Figure 2.7. 387™ DX MCP Control Word


2.3.5 CONTROL WORD 


The MCP provides several processing options that 
are selected by loading a control word from memory 
into the control register. Figure 2.7 shows the format 
and encoding of fields in the control word. 


The low-order byte of this control word configures
the MCP error and exception masking. Bits 5-0 of
the control word contain individual masks for each of
the six exceptions that the MCP recognizes.


The high-order byte of the control word configures
the MCP operating mode, including precision and
rounding.


• Bit 12 no longer defines infinity control and is a
  reserved bit. Only affine closure is supported for
  infinity arithmetic. The bit is initialized to zero after
  RESET or FINIT and is changeable upon loading
  the CW. Programs must ignore this bit.


• The rounding control (RC) bits (bits 11-10) provide
  for directed rounding and true chop, as well
  as the unbiased round to nearest even mode
  specified in the IEEE standard. Rounding control
  affects only those instructions that perform
  rounding at the end of the operation (and thus
  can generate a precision exception); namely,
  FST, FSTP, FIST, all arithmetic instructions (except
  FPREM, FPREM1, FXTRACT, FABS, and
  FCHS), and all transcendental instructions.


• The precision control (PC) bits (bits 9-8) can be
  used to set the MCP internal operating precision
  of the significand at less than the default of 64
  bits (extended precision). This can be useful in
  providing compatibility with early generation arithmetic
  processors of smaller precision. PC affects
  only the instructions ADD, SUB, DIV, MUL, and
  SQRT. For all other instructions, either the precision
  is determined by the opcode or extended
  precision is used.
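The control word fields described above can be read and modified with ordinary bit operations before the value is handed to FLDCW. This is a minimal sketch with hypothetical helper names; the example value 037FH (all exceptions masked, extended precision, round to nearest) is an assumption chosen for illustration.

```python
ROUNDING = {0b00: "nearest", 0b01: "down", 0b10: "up", 0b11: "chop"}
PRECISION = {0b00: "single (24 bits)", 0b01: "reserved",
             0b10: "double (53 bits)", 0b11: "extended (64 bits)"}


def decode_control(cw: int) -> dict:
    """Split a control word into rounding control, precision control and masks."""
    return {
        "rc": ROUNDING[(cw >> 10) & 0b11],   # bits 11-10
        "pc": PRECISION[(cw >> 8) & 0b11],   # bits 9-8
        "masks": cw & 0x3F,                  # bits 5-0, one mask per exception
    }


def set_rounding(cw: int, rc: int) -> int:
    """Return a copy of cw with the RC field replaced; other bits are kept."""
    if rc not in ROUNDING:
        raise ValueError("RC is a two-bit field")
    return (cw & 0xF3FF) | (rc << 10)
```

Selecting chop mode on 037FH gives 0F7FH, which leaves the precision and mask fields untouched.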


2.4 Interrupt Description 


Several interrupts of the 386 DX CPU are used to
report exceptional conditions while executing numeric
programs in either real or protected mode.
Table 2.6 shows these interrupts and their causes.
Table 2.6. 386™ DX Microprocessor Interrupt Vectors Reserved for MCP

Interrupt
Number     Cause of Interrupt

7          An ESC instruction was encountered when EM or TS of the 386™ DX CPU control register
           zero (CR0) was set. EM = 1 indicates that software emulation of the instruction is
           required. When TS is set, either an ESC or WAIT instruction causes interrupt 7. This
           indicates that the current MCP context may not belong to the current task.

9          An operand of a coprocessor instruction wrapped around an addressing limit (0FFFFH for
           small segments, 0FFFFFFFFH for big segments, zero for expand-down segments) and
           spanned inaccessible addresses (a). The failing numerics instruction is not restartable. The
           address of the failing numerics instruction and data operand may be lost; an FSTENV does
           not return reliable addresses. As with the 80286/80287, the segment overrun exception
           should be handled by executing an FNINIT instruction (i.e. an FINIT without a preceding
           WAIT). The return address on the stack does not necessarily point to the failing instruction
           nor to the following instruction. The interrupt can be avoided by never allowing numeric
           data to start within 108 bytes of the end of a segment.

13         The first word or doubleword of a numeric operand is not entirely within the limit of its
           segment. The return address pushed onto the stack of the exception handler points at the
           ESC instruction that caused the exception, including any prefixes. The 387™ DX MCP has
           not executed this instruction; the instruction pointer and data pointer register refer to a
           previous, correctly executed instruction.

16         The previous numerics instruction caused an unmasked exception. The address of the
           faulty instruction and the address of its operand are stored in the instruction pointer and
           data pointer registers. Only ESC and WAIT instructions can cause this interrupt. The 386™
           DX CPU return address pushed onto the stack of the exception handler points to a WAIT
           or ESC instruction (including prefixes). This instruction can be restarted after clearing the
           exception condition in the MCP. FNINIT, FNCLEX, FNSTSW, FNSTENV, and FNSAVE
           cannot cause this interrupt.

(a) An operand may wrap around an addressing limit when the segment limit is near an addressing limit and the operand is near the largest valid
address in the segment. Because of the wrap-around, the beginning and ending addresses of such an operand will be at opposite ends of the
segment. There are two ways that such an operand may also span inaccessible addresses: 1) if the segment limit is not equal to the addressing
limit (e.g. addressing limit is FFFFH and segment limit is FFFDH) the operand will span addresses that are not within the segment (e.g. an 8-byte
operand that starts at valid offset FFFC will span addresses FFFC-FFFF and 0000-0003; however addresses FFFE and FFFF are not valid,
because they exceed the limit); 2) if the operand begins and ends in present and accessible pages but intermediate bytes of the operand fall in a
not-present page or a page to which the procedure does not have access rights.


2.5 Exception Handling 


The 387 DX MCP detects six different exception
conditions that can occur during instruction execution.
Table 2.7 lists the exception conditions in order
of precedence, showing for each the cause and the
default action taken by the MCP if the exception is
masked by its corresponding mask bit in the control
word.


Any exception that is not masked by the control
word sets the corresponding exception flag of the
status word, sets the ES bit of the status word, and
asserts the ERROR# signal. When the CPU attempts
to execute another ESC instruction or WAIT,
exception 16 occurs (see Table 2.6). The exception
condition must be resolved via an interrupt service
routine. The 386 DX Microprocessor saves the address
of the floating-point instruction that caused the
exception and the address of any memory operand
required by that instruction.


2.6 Initialization 


387 DX MCP initialization software must execute an 
FNINIT instruction (i.e. an FINIT without a preceding 
WAIT) to clear ERROR #. After a hardware RESET, 
the ERROR# output is asserted to indicate that a 
387 DX MCP is present. To accomplish this, the IE 
and ES bits of the status word are set, and the IM bit 
in the control word is reset. After FNINIT, the status
word and the control word have the same values as
in an 80287 after RESET.
2.7 8087 and 80287 Compatibility


This section summarizes the differences between 
the 387 DX MCP and the 80287. Any migration from 
the 8087 directly to the 387 DX MCP must also take 
into account the differences between the 8087 and 
the 80287 as listed in Appendix A. 


Many changes have been designed into the 387 DX
MCP to directly support the IEEE standard in hardware.
These changes result in increased performance
by eliminating the need for software that supports
the standard.


2.7.1 GENERAL DIFFERENCES 


The 387 DX MCP supports only affine closure for
infinity arithmetic, not projective closure. Bit 12 of
the Control Word (CW) no longer defines infinity
control. It is a reserved bit; but it is initialized to zero
after RESET or FINIT and is changeable upon loading
the CW. Programs must ignore this bit.
Operands for FSCALE and FPATAN are no longer
restricted in range (except for ±∞); F2XM1 and
FPTAN accept a wider range of operands.


The results of transcendental operations may be
slightly different from those computed by 80287.


In the case of FPTAN, the 387 DX MCP supplies a 
true tangent result in ST(1), and (always) a floating 
point 1 in ST. 


Rounding control is in effect for FLD constant. 


Software cannot change entries of the tag word to 
values (other than empty) that do not reflect the ac- 
tual register contents. 


After reset, FINIT, and incomplete FPREM, the 387
DX MCP resets to zero the condition code bits C3-C0
of the status word.


In conformance with the IEEE standard, the 387 DX
MCP does not support the special data formats:
pseudozero, pseudo-NaN, pseudoinfinity, and
unnormal.


Table 2.7. Exceptions

Exception       Cause                                                    Default Action
                                                                         (if exception is masked)

Invalid         Operation on a signaling NaN, unsupported format,        Result is a quiet NaN, integer
Operation       indeterminate form (0 × ∞, 0/0, (+∞) + (-∞), etc.), or   indefinite, or BCD indefinite
                stack overflow/underflow (SF is also set).

Denormalized    At least one of the operands is denormalized, i.e. it    Normal processing
Operand         has the smallest exponent but a nonzero significand.     continues

Zero Divisor    The divisor is zero while the dividend is a noninfinite, Result is ∞
                nonzero number.

Overflow        The result is too large in magnitude to fit in the       Result is largest finite value
                specified format.                                        or ∞

Underflow       The true result is nonzero but too small to be           Result is denormalized or
                represented in the specified format, and, if underflow   zero
                exception is masked, denormalization causes loss of
                accuracy.

Inexact         The true result is not exactly representable in the      Normal processing
Result          specified format (e.g. 1/3); the result is rounded       continues
(Precision)     according to the rounding mode.
2.7.2 EXCEPTIONS


A number of differences exist due to changes in the 
IEEE standard and to functional improvements to 
the architecture of the 387 DX MCP: 


1. When the overflow or underflow exception is 
masked, the 387 DX MCP differs from the 80287 
in rounding when overflow or underflow occurs. 
The 387 DX MCP produces results that are con- 
sistent with the rounding mode. 


2. When the underflow exception is masked, the 
387 DX MCP sets its underflow flag only if there 
is also a loss of accuracy during denormaliza- 
tion. 


3. Fewer invalid-operation exceptions occur due to
denormal operands, because the instructions
FSQRT, FDIV, FPREM, and conversions to BCD
or to integer normalize denormal operands before
proceeding.


4. The FSQRT, FBSTP, and FPREM instructions
may cause underflow, because they support
denormal operands.


5. The denormal exception can occur during the 
transcendental instructions and the FXTRACT 
instruction. 


6. The denormal exception no longer takes prece- 
dence over all other exceptions. 


7. When the denormal exception is masked, the 
387 DX MCP automatically normalizes denormal 
operands. The 8087/80287 performs unnormal 
arithmetic, which might produce an unnormal re- 
sult. 


8. When the operand is zero, the FXTRACT instruction
reports a zero-divide exception and
leaves -∞ in ST(1).

9. The status word has a new bit (SF) that signals 
when invalid-operation exceptions are due to 
stack underflow or overflow. 


10. FLD extended precision no longer reports denor- 
mal exceptions, because the instruction is not 
numeric. 


11. FLD single/double precision when the operand 
is denormal converts the number to extended 
precision and signals the denormalized operand 
exception. When loading a signaling NaN, FLD 
single/double precision signals an invalid-oper- 
and exception. 


12. The 387 DX MCP only generates quiet NaNs (as
on the 80287); however, the 387 DX MCP distinguishes
between quiet NaNs and signaling
NaNs. Signaling NaNs trigger exceptions when
they are used as operands; quiet NaNs do not
(except for FCOM, FIST, and FBSTP which also
raise IE for quiet NaNs).

13. When stack overflow occurs during FPTAN and
overflow is masked, both ST(0) and ST(1) contain
quiet NaNs. The 80287/8087 leaves the
original operand in ST(1) intact.
14. When the scaling factor is ±∞, the FSCALE
(ST(0), ST(1)) instruction behaves as follows
(ST(0) and ST(1) contain the scaled and scaling
operands respectively):

• FSCALE(0, ∞) generates the invalid operation
  exception.

• FSCALE(finite, -∞) generates zero with the
  same sign as the scaled operand.

• FSCALE(finite, +∞) generates ∞ with the
  same sign as the scaled operand.

The 8087/80287 returns zero in the first case
and raises the invalid-operation exception in the
other cases.


15. The 387 DX MCP returns signed infinity/zero as
the unmasked response to massive overflow/
underflow. The 8087 and 80287 support a limited
range for the scaling factor; within this range
either massive overflow/underflow do not occur
or undefined results are produced.


3.0 HARDWARE INTERFACE 


In the following description of hardware interface, 
the # symbol at the end of a signal name indicates 
that the active or asserted state occurs when the 
signal is at a low voltage. When no # is present after 
the signal name, the signal is asserted when at the 
high voltage level. 


3.1 Signal Description 


In the following signal descriptions, the 387 DX Math 
Coprocessor pins are grouped by function as fol- 
lows: 


1. Execution control: CPUCLK2, NUMCLK2, CKM,
RESETIN


2. MCP handshake: PEREQ, BUSY#, ERROR#


3. Bus interface pins: D31-D0, W/R#, ADS#,
READY#, READYO#


4. Chip/Port Select: STEN, NPS1#, NPS2,
CMD0#


5. Power supplies: Vcc, Vss


Table 3.1 lists every pin by its identifier, gives a brief
description of its function, and lists some of its
characteristics. All output signals are tristate; they leave
floating state only when STEN is active. The output
buffers of the bidirectional data pins D31-D0 are
also tristate; they leave floating state only in read
cycles when the MCP is selected (i.e. when STEN,
NPS1#, and NPS2 are all active).


Figure 3.1 and Table 3.2 together show the location
of every pin in the pin grid array.
Table 3.1. 387™ DX MCP Pin Summary

Pin Name    Function                       Referenced To

CPUCLK2     386™ DX CPU CLocK 2
NUMCLK2     387™ DX MCP CLocK 2
CKM         387™ DX MCP CLocKing Mode
RESETIN     System reset                   CPUCLK2

PEREQ       Processor Extension REQuest    CPUCLK2/STEN
BUSY#       Busy status                    CPUCLK2/STEN
ERROR#      Error status                   NUMCLK2/STEN

D31-D0      Data pins                      CPUCLK2
W/R#        Write/Read bus cycle           CPUCLK2
ADS#        ADdress Strobe                 CPUCLK2
READY#      Bus ready input                CPUCLK2
READYO#     Ready output                   CPUCLK2/STEN

STEN        STatus ENable                  CPUCLK2
NPS1#       MCP select #1                  CPUCLK2
NPS2        MCP select #2                  CPUCLK2
CMD0#       CoMmanD                        CPUCLK2

NOTE:
STEN is referenced to only when getting the output pins into or out of tristate mode.


Table 3.2. 387™ DX MCP Pin Cross-Reference 


ADS # Dis L4 
BUSY # D19 | K4 
CKM D20 3 : 
CPUCLK24 p21 | AG, AQ, B4, 
D22 : E1, F1, F10, 
D23 J2, K5, 
D24 | L7 
D25 3 
D26 Vss B2, B7, C11, 
D27 7 | E2, F2, F11, 
D28 : — (J1, J10, L5 
D29 , oe | 
D30 a NO CONNECT | K9 
D31 TIE HIGH K3, L9* 
ERROR# | 
NPS1# 
NPS2 
NUMCLK2 
PEREQ 
READY # 
READYO# 
RESETIN 


*Tie high pins may either be tied high with a pullup resistor or connected to Vcc. 
Figure 3.1. 387™ DX MCP Pin Configuration (pin grid array shown in two panels: Pin Side View (Bottom) and Top View)
3.1.1 386™ DX CPU CLOCK 2 (CPUCLK2)


This input uses the 386 DX CPU CLK2 signal to time 
the bus control logic. Several other MCP signals are 
referenced to the rising edge of this signal. When 
CKM = 1 (synchronous mode) this pin also clocks 
the data interface and control unit and the floating- 
point unit of the MCP. This pin requires MOS-level 
input. The signal on this pin is divided by two to pro- 
duce the internal clock signal CLK. 


3.1.2 387™ DX MCP CLOCK 2 (NUMCLK2)


When CKM = 0 (asynchronous mode) this pin provides
the clock for the data interface and control unit
and the floating-point unit of the MCP. In this case,
the ratio of the frequency of NUMCLK2 to the
frequency of CPUCLK2 must lie within the range 10:16
to 14:10. When CKM = 1 (synchronous mode) this
pin is ignored; CPUCLK2 is used instead for the data
interface and control unit and the floating-point unit.
This pin requires TTL-level input.
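A board designer choosing an asynchronous clock can check the frequency ratio constraint with exact arithmetic. The function below is an illustrative sketch with a hypothetical name; it takes both frequencies in hertz as integers and treats the stated limits as inclusive, which is an assumption.

```python
from fractions import Fraction


def numclk2_ratio_ok(numclk2_hz: int, cpuclk2_hz: int) -> bool:
    """Check that NUMCLK2:CPUCLK2 lies within 10:16 to 14:10."""
    ratio = Fraction(numclk2_hz, cpuclk2_hz)
    return Fraction(10, 16) <= ratio <= Fraction(14, 10)
```

For example, a 40 MHz NUMCLK2 with a 32 MHz CPUCLK2 gives a ratio of 1.25 and passes, while 50 MHz against 32 MHz (about 1.56) does not.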


3.1.3 387™ DX MCP CLOCKING MODE (CKM) 


This pin is a strapping option. When it is strapped to 
Vcc, the MCP operates in synchronous mode; when 
strapped to Vss, the MCP operates in asynchronous 
mode. These modes relate to clocking of the data 
interface and control unit and the floating-point unit 
only; the bus control logic always operates synchro- 
nously with respect to the 386 DX Microprocessor. 


Figure 3.2. Asynchronous Operation (block diagram: the 386™ DX CPU connects to the 387™ DX MCP interface, which is synchronous; the numeric core, clocked by NUMCLK2, is asynchronous)
3.1.4 SYSTEM RESET (RESETIN)


A LOW to HIGH transition on this pin causes the
MCP to terminate its present activity and to enter a
dormant state. RESETIN must remain HIGH for at
least 40 NUMCLK2 periods. The HIGH to LOW
transitions of RESETIN must be synchronous with
CPUCLK2, so that the phase of the internal clock of
the bus control logic (which is the CPUCLK2 divided
by 2) is the same as the phase of the internal clock
of the 386 DX CPU. After RESETIN goes LOW, at
least 50 NUMCLK2 periods must pass before the
first MCP instruction is written into the 387 DX MCP.
This pin should be connected to the 386 DX CPU
RESET pin. Table 3.3 shows the status of other pins
after a reset.


Table 3.3. Output Pin Status During Reset

Pin Value        Pin Name
HIGH             READYO#, BUSY#
LOW              PEREQ, ERROR#
Tri-State OFF    D31-D0


3.1.5 PROCESSOR EXTENSION REQUEST 
(PEREQ) 


When active, this pin signals to the 386 DX CPU that 
the MCP is ready for data transfer to/from its data 
FIFO. When all data is written to or read from the 
data FIFO, PEREQ is deactivated. This signal al- 
ways goes inactive before BUSY# goes inactive. 
This signal is referenced to CPUCLK2. It should be 
connected to the 386 DX CPU PEREQ input. Refer 
to Figure 3.8 for the timing relationships between 
this and the BUSY# and ERROR# pins.


3.1.6 BUSY STATUS (BUSY #) 


When active, this pin signals to the 386 DX CPU that 
the MCP is currently executing an instruction. This 
signal is referenced to CPUCLK2. It should be con- 
nected to the 386 DX CPU BUSY# pin. Refer to 
Figure 3.8 for the timing relationships between this 
and the PEREQ and ERROR# pins.
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3.1.7 ERROR STATUS (ERROR #) 


This pin reflects the ES bits of the status register. 
When active, it indicates that an unmasked excep- 
tion has occurred (except that, immediately after a 
reset, it indicates to the 386 DX Microprocessor that 
a 387 DX MCP is present in the system). This signal 
can be changed to inactive state only by the follow- 
ing instructions (without a preceding WAIT): FNINIT, 
FNCLEX, FNSTENV, and FNSAVE. This signal is 
referenced to NUMCLK2. It should be connected to 
the 386 DX CPU ERROR # pin. Refer to Figure 3.8 
for the timing relationships between this and the 
PEREQ and BUSY # pins. 


3.1.8 DATA PINS (D31-D0) 


These bidirectional pins are used to transfer data 
and opcodes between the 386 DX CPU and 387 DX 
MCP. They are normally connected directly to the 
corresponding 386 DX CPU data pins. HIGH state 
indicates a value of one. D0 is the least significant
data bit. Timings are referenced to CPUCLK2. 


3.1.9 WRITE/READ BUS CYCLE (W/R#) 


This signal indicates to the MCP whether the 386 DX 
CPU bus cycle in progress is a read or a write cycle. 
This pin should be connected directly to the 386 DX 
CPU W/R# pin. HIGH indicates a write cycle; LOW, 
a read cycle. This input is ignored if any of the sig- 
nals STEN, NPS1#, or NPS2 is inactive. Setup and 
hold times are referenced to CPUCLK2. 


3.1.10 ADDRESS STROBE (ADS#)


This input, in conjunction with the READY # input 
indicates when the MCP bus-control logic may sam- 
ple W/R# and the chip-select signals. Setup and 
hold times are referenced to CPUCLK2. This pin 
should be connected to the 386 DX CPU ADS # pin. 
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3.1.11 BUS READY INPUT (READY #) 


This input indicates to the MCP when a 386 DX CPU 
bus cycle is to be terminated. It is used by the bus- 
control logic to trace bus activities. Bus cycles can 
be extended indefinitely until terminated by 
READY #. This input should be connected to the 
same signal that drives the 386 DX CPU READY # 
input. Setup and hold times are referenced to 
CPUCLK2.


3.1.12 READY OUTPUT (READYO#) 


This pin is activated at such a time that write cycles
are terminated after two clocks (except FLDENV
and FRSTOR) and read cycles after three clocks. In
configurations where no extra wait states are
required, this pin must directly or indirectly drive the
386 DX CPU READY# input. Refer to section 3.4
“Bus Operation” for details. This pin is activated
only during bus cycles that select the MCP. This
signal is referenced to CPUCLK2.


3.1.13 STATUS ENABLE (STEN) 


This pin serves as a chip select for the MCP. When 
inactive, this pin forces BUSY #, PEREQ, ERROR#, 
and READYO# outputs into floating state. D31-—D0 
are normally floating and leave floating state only if 
STEN is active and additional conditions are met.
STEN also causes the chip to recognize its other 
chip-select inputs. STEN makes it easier to do on- 
board testing (using the overdrive method) of other 
chips in systems containing the MCP. STEN should 
be pulled up with a resistor so that it can be pulled 
down when testing. In boards that do not use on- 
board testing, STEN should be connected to Vcc. 
Setup and hold times are relative to CPUCLK2. Note 
that STEN must maintain the same setup and hold 
times as NPS1#, NPS2, and CMD0# (i.e. if STEN
changes state during a 387 DX MCP bus cycle, it
should change state during the same CLK period as
the NPS1#, NPS2, and CMD0# signals).


3.1.14 MCP SELECT #1 (NPS1#)


When active (along with STEN and NPS2) in the first
period of a 386 DX CPU bus cycle, this signal
indicates that the purpose of the bus cycle is to
communicate with the MCP. This pin should be connected
directly to the 386 DX CPU M/IO# pin, so that the
MCP is selected only when the 386 DX CPU
performs I/O cycles. Setup and hold times are
referenced to CPUCLK2.


3.1.15 MCP SELECT #2 (NPS2) 


When active (along with STEN and NPS1#) in the
first period of a 386 DX CPU bus cycle, this signal
indicates that the purpose of the bus cycle is to
communicate with the MCP. This pin should be
connected directly to the 386 DX CPU A31 pin, so that the
MCP is selected only when the 386 DX CPU uses
one of the I/O addresses reserved for the MCP
(800000F8 or 800000FC). Setup and hold times are
referenced to CPUCLK2.


3.1.16 COMMAND (CMD0#)


During a write cycle, this signal indicates whether an
opcode (CMD0# active) or data (CMD0# inactive)
is being sent to the MCP. During a read cycle, it
indicates whether the control or status register
(CMD0# active) or a data register (CMD0# inactive)
is being read. CMD0# should be connected directly
to the A2 output of the 386 DX Microprocessor.
Setup and hold times are referenced to CPUCLK2.


3.2 Processor Architecture 


As shown by the block diagram on the front page,
the MCP is internally divided into three sections: the
bus control logic (BCL), the data interface and
control unit, and the floating point unit (FPU). The FPU
(with the support of the control unit which contains
the sequencer and other support units) executes all
numerics instructions. The data interface and control
unit is responsible for the data flow to and from the
FPU and the control registers, for receiving the
instructions, decoding them, and sequencing the
microinstructions, and for handling some of the
administrative instructions. The BCL is responsible for the
386 DX CPU bus tracking and interface. The BCL is
the only unit in the 387 DX MCP that must run
synchronously with the 386 DX CPU; the rest of the
MCP can run asynchronously with respect to the
386 DX Microprocessor.
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3.2.1 BUS CONTROL LOGIC 


The BCL communicates solely with the CPU using
I/O bus cycles. The BCL appears to the CPU as a
special peripheral device. It is special in two
respects: the CPU initiates I/O automatically when it
encounters ESC instructions, and the CPU uses
reserved I/O addresses to communicate with the BCL.
The BCL does not communicate directly with memory.
The CPU performs all memory access, transferring
input operands from memory to the MCP and
transferring outputs from the MCP to memory.


3.2.2 DATA INTERFACE AND CONTROL UNIT 


The data interface and control unit latches the data
and, subject to BCL control, directs the data to the
FIFO or the instruction decoder. The instruction
decoder decodes the ESC instructions sent to it by the
CPU and generates controls that direct the data flow
in the FIFO. It also triggers the microinstruction
sequencer that controls execution of each instruction.
If the ESC instruction is FINIT, FCLEX, FSTSW,
FSTSW AX, or FSTCW, the control unit executes it
independently of the FPU and the sequencer. The data
interface and control unit is the one that generates
the BUSY#, PEREQ and ERROR# signals that
synchronize 387 DX MCP activities with the 386 DX
CPU. It also supports the FPU in all operations that it
cannot perform alone (e.g. exceptions handling,
transcendental operations, etc.).


3.2.3 FLOATING POINT UNIT 


The FPU executes all instructions that involve the 
register stack, including arithmetic, logical, transcen- 
dental, constant, and data transfer instructions. The 
data path in the FPU is 84 bits wide (68 significant 
bits, 15 exponent bits, and a sign bit) which allows 
internal operand transfers to be performed at very 
high speeds. 


3.3 System Configuration 


As an extension to the 386 DX Microprocessor, the
387 DX Math Coprocessor can be connected to the
CPU as shown by Figure 3.3. A dedicated
communication protocol makes possible high-speed transfer
of opcodes and operands between the 386 DX CPU
and 387 DX MCP. The 387 DX MCP is designed so
that no additional components are required for
interface with the 386 DX CPU. The 387 DX MCP shares
the 32-bit wide local bus of the 386 DX CPU and
most control pins of the 387 DX MCP are connected
directly to pins of the 386 DX Microprocessor.


Figure 3.3. 386™ DX Microprocessor and 387™ DX Math Coprocessor System Configuration


Table 3.4. Bus Cycles Definition


3.3.1 BUS CYCLE TRACKING 


The ADS# and READY# signals allow the MCP to
track the beginning and end of the 386 DX CPU bus
cycles, respectively. When ADS# is asserted at the
same time as the MCP chip-select inputs, the bus
cycle is intended for the MCP. To signal the end of a
bus cycle for the MCP, READY# may be asserted
directly or indirectly by the MCP or by other
bus-control logic. Refer to Table 3.4 for definition of the
types of MCP bus cycles.


3.3.2 MCP ADDRESSING 


The NPS1#, NPS2 and STEN signals allow the
MCP to identify which bus cycles are intended for
the MCP. The MCP responds only to I/O cycles
when bit 31 of the I/O address is set. In other words,
the MCP acts as an I/O device in a reserved I/O
address space.


Because A31 is used to select the MCP for data
transfers, it is not possible for a program running on
the 386 DX CPU to address the MCP with an I/O
instruction. Only ESC instructions cause the 386 DX
Microprocessor to communicate with the MCP. The
386 DX CPU BS16# input must be inactive during
I/O cycles when A31 is active.


3.3.3 FUNCTION SELECT


The CMD0# and W/R# signals identify the four
kinds of bus cycle: control or status register read,
data read, opcode write, data write.


STEN  NPS1#  NPS2  CMD0#  W/R#  Bus Cycle Type
0     x      x     x      x     MCP not selected and all outputs in floating state
1     1      x     x      x     MCP not selected
1     x      0     x      x     MCP not selected
1     0      1     0      0     CW or SW read from MCP
1     0      1     0      1     Opcode write to MCP
1     0      1     1      0     Data read from MCP
1     0      1     1      1     Data write to MCP
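

The selection rules from sections 3.1.9 and 3.1.13 through 3.1.16 can be
expressed as a small decoder. The following is a sketch only: pin levels are
modeled as 1 = HIGH and 0 = LOW, the function names are hypothetical, and the
address helper assumes the recommended wiring (NPS1# to M/IO#, NPS2 to A31,
CMD0# to A2, STEN tied HIGH).

```python
def classify_bus_cycle(sten, nps1_n, nps2, cmd0_n, w_r):
    """Classify a bus cycle from pin levels (1 = HIGH, 0 = LOW)."""
    if not sten:
        return "MCP not selected and all outputs in floating state"
    if nps1_n or not nps2:          # NPS1# is active LOW, NPS2 active HIGH
        return "MCP not selected"
    if cmd0_n == 0:                 # CMD0# active: opcode or CW/SW access
        return "Opcode write to MCP" if w_r else "CW or SW read from MCP"
    return "Data write to MCP" if w_r else "Data read from MCP"


def pins_for_io_cycle(address, is_write):
    """Pin levels for a 386 DX I/O cycle, assuming the recommended wiring."""
    return {
        "sten": 1,                       # assumed strapped to Vcc
        "nps1_n": 0,                     # M/IO# is LOW during I/O cycles
        "nps2": (address >> 31) & 1,     # A31
        "cmd0_n": (address >> 2) & 1,    # A2
        "w_r": 1 if is_write else 0,
    }


if __name__ == "__main__":
    for addr, wr in [(0x800000F8, True), (0x800000FC, False)]:
        print(hex(addr), classify_bus_cycle(**pins_for_io_cycle(addr, wr)))
```

Under these assumptions, a write to 800000F8 decodes as an opcode write and
a read from 800000FC as a data read.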


3.3.4 CPU/MCP SYNCHRONIZATION


The pin pairs BUSY#, PEREQ, and ERROR# are
used for various aspects of synchronization between
the CPU and the MCP.


BUSY# is used to synchronize instruction transfer
from the 386 DX CPU to the MCP. When the MCP
recognizes an ESC instruction, it asserts BUSY#.
For most ESC instructions, the 386 DX CPU waits
for the MCP to deassert BUSY# before sending the
new opcode.


The MCP uses the PEREQ pin of the 386 DX CPU to
signal that the MCP is ready for data transfer to or
from its data FIFO. The MCP does not directly
access memory; rather, the 386 DX Microprocessor
provides memory access services for the MCP.
Thus, memory access on behalf of the MCP always
obeys the rules applicable to the mode of the 386
DX CPU, whether the 386 DX CPU is in
real-address mode or protected mode.


Once the 386 DX CPU initiates an MCP instruction 
that has operands, the 386 DX CPU waits for 
PEREQ signals that indicate when the MCP is ready 
for operand transfer. Once all operands have been 
transferred (or if the instruction has no operands) 
the 386 DX CPU continues program execution while 
the MCP executes the ESC instruction. 


In 8086/8087 systems, WAIT instructions may be 
required to achieve synchronization of both com- 
mands and operands. In 80286/80287, 386 DX Mi- 
croprocessor and 387 DX Math Coprocessor sys- 
tems, WAIT instructions are required only for oper- 
and synchronization; namely, after MCP stores to 
memory (except FSTSW and FSTCW) or loads from 
memory. Used this way, WAIT ensures that the
value has already been written or read by the MCP
before the CPU reads or changes the value.
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Once it has started to execute a numerics instruction 
and has transferred the operands from the 386 DX 
CPU, the MCP can process the instruction in parallel 
with and independent of the host CPU. When the 
MCP detects an exception, it asserts the ERROR # 
signal, which causes a 386 DX CPU interrupt. 


3.3.5 SYNCHRONOUS OR ASYNCHRONOUS 
MODES 


The internal logic of the 387 DX MCP (the FPU) can 
either operate directly from the CPU clock (synchro- 
nous mode) or from a separate clock (asynchronous 
mode). The two configurations are distinguished by 
the CKM pin. In either case, the bus control logic 
(BCL) of the MCP is synchronized with the CPU 
clock. Use of asynchronous mode allows the 386 
DX CPU and the FPU section of the MCP to run at 
different speeds. In this case, the ratio of the fre- 
quency of NUMCLK2 to the frequency of CPUCLK2 
must lie within the range 10:16 to 14:10. Use of syn- 
chronous mode eliminates one clock generator from 
the board design. 
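

The asynchronous-mode constraint is a simple range check on the ratio of the
two clock frequencies. A minimal sketch follows; the function name is an
illustrative assumption, and exact fractions are used to avoid rounding at
the range limits.

```python
from fractions import Fraction

RATIO_MIN = Fraction(10, 16)  # lowest allowed NUMCLK2 : CPUCLK2 ratio
RATIO_MAX = Fraction(14, 10)  # highest allowed NUMCLK2 : CPUCLK2 ratio


def numclk2_ratio_ok(numclk2_hz, cpuclk2_hz):
    """True if the NUMCLK2/CPUCLK2 ratio lies within 10:16 to 14:10."""
    ratio = Fraction(numclk2_hz, cpuclk2_hz)
    return RATIO_MIN <= ratio <= RATIO_MAX


if __name__ == "__main__":
    print(numclk2_ratio_ok(40_000_000, 32_000_000))  # ratio 1.25
    print(numclk2_ratio_ok(20_000_000, 40_000_000))  # ratio 0.5
```

For example, a 40 MHz NUMCLK2 with a 32 MHz CPUCLK2 (ratio 1.25) is within
range, while 20 MHz against 40 MHz (ratio 0.5) is not.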


3.3.6 AUTOMATIC BUS CYCLE TERMINATION 


In configurations where no extra wait states are re- 
quired, READYO# can be used to drive the 386 DX 
CPU READY # input. If this pin is used, it should be 
connected to the logic that ORs all READY outputs 
from peripherals on the 386 DX CPU bus. 
READYO# is asserted by the MCP only during I/O
cycles that select the MCP. Refer to section 3.4 
“Bus Operation” for details. 


3.4 Bus Operation 


With respect to the bus interface, the 387 DX MCP is 
fully synchronous with the 386 DX Microprocessor. 
Both operate at the same rate, because each gener- 
ates its internal CLK signal by dividing CPUCLK2 by 
two.


The 386 DX CPU initiates a new bus cycle by
activating ADS#. The MCP recognizes a bus cycle, if,
during the cycle in which ADS# is activated, STEN,
NPS1#, and NPS2 are all activated. Proper
operation is achieved if NPS1# is connected to the
M/IO# output of the 386 DX CPU, and NPS2 to the
A31 output. The 386 DX CPU’s A31 output is
guaranteed to be inactive in all bus cycles that do not
address the MCP (i.e. I/O cycles to other devices,
interrupt acknowledge, and reserved types of bus
cycles). System logic must not signal a 16-bit bus
cycle via the 386 DX CPU BS16# input during I/O
cycles when A31 is active.
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During the CLK period in which ADS # is activated, 
the MCP also examines the W/R# input signal to 
determine whether the cycle is a read or a write
cycle and examines the CMD0# input to determine
whether an opcode, operand, or control/status reg- 
ister transfer is to occur. 


The 387 DX MCP supports both pipelined and non- 
pipelined bus cycles. A nonpipelined cycle is one for 
which the 386 DX CPU asserts ADS # when no oth- 
er MCP bus cycle is in progress. A pipelined bus 
cycle is one for which the 386 DX CPU asserts 
ADS# and provides valid next-address and control 
signals as soon as in the second CLK period after 
the ADS# assertion for the previous 386 DX CPU 
bus cycle. Pipelining increases the availability of the 
bus by at least one CLK period. The MCP supports 
pipelined bus cycles in order to optimize address 
pipelining by the 386 DX CPU for memory cycles. 


Bus operation is described in terms of an abstract 
state machine. Figure 3.4 illustrates the states and 
state transitions for MCP bus cycles: 


• Ti is the idle state. This is the state of the bus
logic after RESET, the state to which bus logic
returns after every nonpipelined bus cycle, and
the state to which bus logic returns after a series
of pipelined cycles.


• Trs is the READY# sensitive state. Different
types of bus cycle may require a minimum of one
or two successive Trs states. The bus logic
remains in Trs state until READY# is sensed, at
which point the bus cycle terminates. Any number
of wait states may be implemented by delaying
READY#, thereby causing additional successive
Trs states.


• Tp is the first state for every pipelined bus cycle.
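

The three states and the transitions described in sections 3.4.2 and 3.4.3
can be sketched as a transition function. This is an illustrative model
only: `ads` and `ready` are True when ADS# and READY# are asserted, chip
select is assumed to be active throughout, and the function names are
hypothetical.

```python
TI, TRS, TP = "Ti", "Trs", "Tp"


def next_state(state, ads, ready):
    """Return the next bus state after one CLK period."""
    if state == TI:
        return TRS if ads else TI      # a cycle starts when ADS# is asserted
    if state == TRS:
        if not ready:
            return TRS                 # wait state: READY# not yet sensed
        return TP if ads else TI       # pipelined if ADS# is active as well
    if state == TP:
        return TRS                     # Tp always lasts one clock period
    raise ValueError(f"unknown state: {state}")


def run(inputs, state=TI):
    """Apply a sequence of (ads, ready) samples and return the state trace."""
    trace = [state]
    for ads, ready in inputs:
        state = next_state(state, ads, ready)
        trace.append(state)
    return trace


if __name__ == "__main__":
    print(run([(True, False), (False, True)]))                  # nonpipelined
    print(run([(True, False), (True, True), (False, False), (False, True)]))
```

The first trace models a nonpipelined cycle without wait states; the second
enters a pipelined cycle through Tp and then returns to idle.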


[State diagram: states Ti, Trs and Tp. Transition labels as printed: READY * ADS; “ALWAYS”; READY * ADS#.]
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Figure 3.4. Bus State Diagram 
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The READYO # output of the 387 DX MCP indicates 
when a bus cycle for the MCP may be terminated if 
no extra wait states are required. For all write cycles 
(except those for the instructions FLDENV and 
FRSTOR), READYO# is always asserted in the first 
Trs state, regardless of the number of wait states. 
For all read cycles and write cycles for FLDENV and 
FRSTOR, READYO# is always asserted in the
second Trs state, regardless of the number of wait
states. These rules apply to both pipelined and non- 
pipelined cycles. Systems designers must use 
READYO # in one of the following ways: 


1. Connect it (directly or through logic that ORs
READY signals from other devices) to the
READY# inputs of the 386 DX CPU and 387 DX
MCP.


2. Use it as one input to a wait-state generator. 


The following sections illustrate different types of 
MCP bus cycles. 


Because different instructions have different
amounts of overhead before, between, and after
operand transfer cycles, it is not possible to represent
in a few diagrams all of the combinations of
successive operand transfer cycles. The following
bus-cycle diagrams show memory cycles between MCP
operand-transfer cycles. Note however that, during
the instructions FLDENV, FSTENV, FSAVE, and
FRSTOR, some consecutive accesses to the MCP
do not have intervening memory accesses. For the
timing relationship between operand transfer cycles
and opcode write or other overhead activities, see
Figure 3.8.


3.4.1 NONPIPELINED BUS CYCLES 


Figure 3.5 illustrates bus activity for consecutive 
nonpipelined bus cycles. 


3.4.1.1 Write Cycle 


At the second clock of the bus cycle, the 387 DX
MCP enters the Trs (READY#-sensitive) state.
During this state, the 387 DX MCP samples the
READY# input and stays in this state as long as
READY# is inactive.


In write cycles, the MCP drives the READYO#
signal for one CLK period beginning with the second
CLK of the bus cycle; therefore, the fastest write
cycle takes two CLK cycles (see cycle 2 of Figure
3.5). For the instructions FLDENV and FRSTOR,
however, the MCP forces a wait state by delaying
the activation of READYO# to the second Trs
cycle (not shown in Figure 3.5).


When READY # is asserted the MCP returns to the 
idle state, in which ADS# could be asserted again 
by the 386 DX Microprocessor for the next cycle. 


3.4.1.2 Read Cycle 


At the second clock of the bus cycle, the MCP enters
the Trs state. See Figure 3.5. In this state, the
MCP samples the READY# input and stays in this
state as long as READY# is inactive.


At the rising edge of CLK in the second clock period 
of the cycle, the MCP starts to drive the D31-D0 
outputs and continues to drive them as long as it 
stays in Trs state. 


In read cycles that address the MCP, at least one
wait state must be inserted to ensure that the 386 DX
CPU latches the correct data. Since the MCP starts
driving the system data bus only at the rising edge of
CLK in the second clock period of the bus cycle, not
enough time is left for the data signals to propagate
and be latched by the 386 DX CPU at the falling edge
of the same clock period. The MCP drives the
READYO# signal for one CLK period in the third
CLK of the bus cycle. Therefore, if the READYO#
output is used to drive the 386 DX CPU READY#
input, one wait state is inserted automatically.


Because one wait state is required for MCP reads, 
the minimum is three CLK cycles per read, as cycle 
3 of Figure 3.5 shows. 
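

The minimum cycle lengths stated for nonpipelined writes and reads follow
from which Trs state carries READYO#. A small sketch of that rule is given
below; the function names and the `extra_wait_states` parameter are
illustrative assumptions.

```python
# Writes for these instructions get READYO# one Trs state later.
SLOW_WRITE_INSTRUCTIONS = {"FLDENV", "FRSTOR"}


def readyo_trs_index(is_write, instruction=None):
    """Return 1 if READYO# is asserted in the first Trs state, else 2."""
    if is_write and instruction not in SLOW_WRITE_INSTRUCTIONS:
        return 1
    return 2


def min_cycle_clks(is_write, instruction=None, extra_wait_states=0):
    """CLK periods for a nonpipelined MCP bus cycle.

    One CLK for the period in which ADS# is asserted, plus the Trs states
    up to READYO#, plus any externally added wait states.
    """
    return 1 + readyo_trs_index(is_write, instruction) + extra_wait_states


if __name__ == "__main__":
    print(min_cycle_clks(is_write=True))                        # ordinary write
    print(min_cycle_clks(is_write=False))                       # read
    print(min_cycle_clks(is_write=True, instruction="FRSTOR"))  # delayed write
```

This reproduces the figures in the text: two CLKs for an ordinary write and
three for a read or for a FLDENV or FRSTOR write.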


When READY # is asserted the MCP returns to the 
idle state, in which ADS# could be asserted again 
by the 386 DX CPU for the next cycle. The transition 
from Trs state to idle state causes the MCP to put
the tristate D31-D0 outputs into the floating state,
allowing another device to drive the system data
bus.
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Cycles 1 & 2 represent part of the operand transfer cycle for instructions involving either 4-byte or 8-byte operand loads. 
Cycles 3 & 4 represent part of the operand transfer cycle for a store operation. 
*Cycles 1 & 2 could repeat here or Ti states for various non-operand transfer cycles and overhead.


Figure 3.5. Nonpipelined Read and Write Cycles 


3.4.2 PIPELINED BUS CYCLES 


Because all the activities of the 387 DX MCP bus 
interface occur either during the Trs state or during 
the transitions to or from that state, the only differ- 
ence between a pipelined and a nonpipelined cycle 
is the manner of changing from one state to another. 


The exact activities in each state are detailed in the
previous section “Nonpipelined Bus Cycles”.


When the 386 DX CPU asserts ADS# before the 
end of a bus cycle, both ADS# and READY# are 
active during a Trs state. This condition causes the 
MCP to change to a different state named Tp. The 
MCP activities in the transition from a Trs state to a 
Tp state are exactly the same as those in the
transition from a Trs state to a Ti state in nonpipelined
cycles.


Tp state is metastable; therefore, one clock period 
later the MCP returns to Trs state. In consecutive 
pipelined cycles, the MCP bus logic uses only Trs
and Tp states. 


Figure 3.6 shows the fastest transition into and out 
of the pipelined bus cycles. Cycle 1 in this figure 
represents a nonpipelined cycle. (Nonpipelined write 
cycles with only one Trs state (i.e. no wait states)
are always followed by another nonpipelined cycle, 
because READY # is asserted before the earliest 
possible assertion of ADS # for the next cycle.) 


Figure 3.7 shows the pipelined write and read cycles 
with one additional Trs state beyond the minimum
required. To delay the assertion of READY# re- 
quires external logic. 
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3.4.3 BUS CYCLES OF MIXED TYPE | 


When the 387 DX MCP bus logic is in the Trs state,
it distinguishes between nonpipelined and pipelined
cycles according to the behavior of ADS# and
READY#. In a nonpipelined cycle, only READY# is
activated, and the transition is from Trs to idle state.
In a pipelined cycle, both READY# and ADS# are
active and the transition is first from Trs state to Tp
state then, after one clock period, back to Trs state.


3.4.4 BUSY # AND PEREQ TIMING 
RELATIONSHIP 


Figure 3.8 shows the activation of BUSY# at the
beginning of instruction execution and its
deactivation after execution of the instruction is complete.
When possible, the 387 DX MCP may deactivate
BUSY# prior to the completion of the current
instruction allowing the CPU to transfer the next
instruction’s opcode and operands. PEREQ is
activated in this interval. If ERROR# (not shown in the
diagram) is ever asserted, it would occur at least six
CPUCLK2 periods after the deactivation of PEREQ
and at least six CPUCLK2 periods before the
deactivation of BUSY#. Figure 3.8 shows also that STEN
is activated at the beginning of a bus cycle.
[Timing diagram referenced to CPUCLK2. Cycle labels: Cycle 1, nonpipelined memory read; pipelined MCP write; Cycle 3, pipelined memory read; Cycle 4, nonpipelined MCP write.]
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Cycle 1—Cycle 4 represent the operand transfer cycle for an instruction involving a transfer of two 32-bit loads in total. 
The opcode write cycles and other overhead are not shown. 

Note that the next cycle will be a pipelined cycle if both READY# and ADS# are sampled active at the end of a Trs
state of the current cycle.


Figure 3.6. Fastest Transitions to and from Pipelined Cycles 
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| NOTE: 
1. Cycles between operand write to the MCP and storing result. 


Figure 3.7. Pipelined Cycles with Wait States 


[Timing diagram referenced to CPUCLK2: an opcode write followed by the first operand write, with intervals marked NOTE 1, NOTE 2, NOTE 3 and NOTE 1.]
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NOTES: 
1. Instruction dependent. 
2. PEREQ is an asynchronous input to the 386™ DX Microprocessor; it may not be asserted (instruction dependent).
3. More operand transfers.
4. Memory read (operand) cycle is not shown.


Figure 3.8. STEN, BUSY # and PEREQ Timing Relationship 
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4.0 ELECTRICAL DATA 


4.1 Absolute Maximum Ratings* 


Case Temperature TC
  Under Bias ................. -65°C to +110°C
Storage Temperature .......... -65°C to +150°C
Voltage on Any Pin with
  Respect to Ground ......... -0.5 to VCC +0.5V
Power Dissipation ........................ 1.5W


*WARNING: Stressing the device beyond the “Absolute
Maximum Ratings” may cause permanent damage.
These are stress ratings only. Operation beyond the
“Operating Conditions” is not recommended and
extended exposure beyond the “Operating Conditions”
may affect device reliability.


4.2 DC Characteristics


Table 4.1. DC Specifications TC = 0° to 85°C, VCC = 5V ± 5%


Symbol | Parameter | Min | Max | Units | Test Conditions
 | Input LO Voltage | | | | (Note 1)
 | Input HI Voltage | | | | (Note 1)
 | CPUCLK2 Input LO Voltage | | | |
 | CPUCLK2 Input HI Voltage | | | |
 | Output LO Voltage | | | | (Note 2)
 | Output HI Voltage | | | | (Note 3)
 | Supply Current | | | |
 |   NUMCLK2 = 32 MHz(4) | | | | ICC typ. = 95 mA
 |   NUMCLK2 = 40 MHz(4) | | | | ICC typ. = 105 mA
 |   NUMCLK2 = 50 MHz(4) | | | | ICC typ. = 125 mA
 |   NUMCLK2 = 66.6 MHz(4) | | | | ICC typ. = 150 mA
 | Input Leakage Current | | | | 0V ≤ VIN ≤ VCC
 | I/O Leakage Current | | | | 0.45V ≤ VO ≤ VCC
 | Input Capacitance | | | | fc = 1 MHz
 | I/O or Output Capacitance | | | | fc = 1 MHz
 | Clock Capacitance | | | | fc = 1 MHz


NOTES:
1. This parameter is for all inputs, including NUMCLK2 but excluding CPUCLK2.
2. This parameter is measured at IOL as follows:
   data = 4.0 mA
   READYO# = 2.5 mA
   ERROR#, BUSY#, PEREQ = 2.5 mA
3. This parameter is measured at IOH as follows:
   data = 1.0 mA
   READYO# = 0.6 mA
   ERROR#, BUSY#, PEREQ = 0.6 mA
4. ICC is measured at steady state, maximum capacitive loading on the outputs, CPUCLK2 at the same frequency as
   NUMCLK2.
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4.3 AC Characteristics 


Table 4.2a. Combinations of Bus Interface and Execution Speeds 


Functional Block 80387DX-16 | 80387DX-20 | 80387DX-25 | 80387DX-33 
Bus Interface Unit (MHz) 16 20 25 33 
Execution Unit (MHz) 16 20 25 33 

Table 4.2b. Timing Requirements of the Execution Unit 
TC = 0°C to +85°C, VCC = 5V ±5%


t1 


NUMCLK2 Period 31.25 | 125 2.0V 
NUMCLK2| t2a_ | High Time | : 2.0V 
NUMCLK2] t2b | High Time : 3.7V 
NUMCLK2| t3a_ | Low Time : 16. 2.0V 
NUMCLK2! +t3b | LowTime : 0.8V 
NUMCLK2 | t4 Fall Time 3.7V to 0.8V 
NUMCLK2 | t5 Rise Time 0.8V to 3.7V 


Table 4.2c. Timing Requirements of the Bus Interface Unit 
TC = 0°C to +85°C, VCC = 5V ±5%
(All measurements made at 1.5V and CL = 50 pF unless otherwise specified)


Period 
High Time 
High Time 
Low Time 
Low Time 
Fall Time 
Rise Time 


BUSY # (2) Float Time 
ERROR # (2) Float Time 
Float Time 


*Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not tested.
†For 25 MHz and 33 MHz, CL = 50 pF


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Table 4.2c. Timing Requirements of the Bus Interface Unit (Continued) 
TC = 0°C to +85°C, VCC = 5V ±5%
(All measurements made at 1.5V and CL = 50 pF unless otherwise specified)


Parameter Figure 
Reference 
Setup Time 

Hold Time 

Setup Time 

Hold Time 


READY# Setup Time
READY# Hold Time
CMD0# Setup Time
CMD0# Hold Time
NPS1#, NPS2 Setup Time
NPS1#, NPS2 Hold Time
STEN Setup Time
STEN Hold Time

NOTES:
1. Not tested at 25 pF.
2. Float delay is not tested. Float condition occurs when maximum output current becomes less than ILO in magnitude.


Figure 4.0a. Typical Output Valid Delay vs Load
Capacitance at Max Operating Temperature

*nom - nominal value

NOTE:
This graph will not be linear outside of the CL range
shown.


Figure 4.0b. Typical Output Rise Time vs Load
Capacitance at Max Operating Temperature

NOTE:
This graph will not be linear outside of the CL range
shown.
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Figure 4.1. CPUCLK2/NUMCLK2 Waveform and Measurement Points for 
Input/Output A.C. Specifications 
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Figure 4.2. Output Signals 
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Figure 4.3. Input and I/O Signals 


NOTE:


The second internal processor phase following RESET high to low transition is PH2. 


Figure 4.4. RESET Signal 
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Figure 4.5. Float from STEN 
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Table 4.3. Other Parameters 


Pin | Symbol | Parameter
RESETIN | t30 | Duration
RESETIN | t31 | RESETIN Inactive to 1st Opcode Write
BUSY# | t32 |
BUSY#, ERROR# | t33 |
PEREQ, ERROR# | t34 |
READY#, BUSY# | t35 |
READY# | t36 | Minimum Time from Opcode Write to Opcode/Operand Write
READY# | t37 | Minimum Time from Operand Write to Operand Write


[Timing diagram referenced to CPUCLK2 (CLK): 1st opcode write, 1st operand write, 2nd operand write (Note 1); the ERROR# signal is shown.]
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* In NUMCLK2’s 
** or last operand 


NOTE:
1. Memory read (operand) cycle is not shown. 


Figure 4.6. Other Parameters 
5.0 387™ DX MCP EXTENSIONS TO
THE 386™ DX CPU
INSTRUCTION SET


Instructions for the 387 DX MCP assume one of the
five forms shown in the following table. In all cases,
instructions are at least two bytes long and begin
with the bit pattern 11011B, which identifies the
ESCAPE class of instruction. Instructions that refer
to memory operands specify addresses using the
386 DX CPU addressing modes.
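

Because every MCP instruction begins with the bit pattern 11011B, the
ESCAPE class can be recognized by testing the upper five bits of the first
opcode byte, which corresponds to the byte values D8H through DFH. A minimal
sketch follows; the function name is an illustrative assumption.

```python
ESC_PATTERN = 0b11011  # upper five bits of every ESCAPE-class opcode byte


def is_escape_opcode(first_byte):
    """True if the first instruction byte belongs to the ESCAPE class."""
    if not 0 <= first_byte <= 0xFF:
        raise ValueError("first_byte must fit in one byte")
    return (first_byte >> 3) == ESC_PATTERN


if __name__ == "__main__":
    escape_bytes = [hex(b) for b in range(256) if is_escape_opcode(b)]
    print(escape_bytes)
```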


OP = Instruction opcode, possibly split into two
fields OPA and OPB


MF = Memory Format
00 - 32-bit real
01 - 32-bit integer
10 - 64-bit real
11 - 16-bit integer


P = Pop
0 - Do not pop stack
1 - Pop stack after operation


ESC = 11011


d = Destination
0 - Destination is ST(0)
1 - Destination is ST(i)


R XOR d = 0 - Destination (op) Source
R XOR d = 1 - Source (op) Destination
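

For instruction forms that carry an MF field, the memory format can be read
from the first opcode byte. The sketch below assumes the layout suggested by
the "ESC MF 1" encodings in the instruction table, that is, MF in the two
bits directly below the five ESC bits; the function name is hypothetical,
and the decoding does not apply to forms without an MF field.

```python
MEMORY_FORMATS = {
    0b00: "32-bit real",
    0b01: "32-bit integer",
    0b10: "64-bit real",
    0b11: "16-bit integer",
}


def memory_format(first_byte):
    """Decode MF from a first byte of the assumed form ESC MF x."""
    if (first_byte >> 3) != 0b11011:
        raise ValueError("not an ESCAPE-class opcode byte")
    mf = (first_byte >> 1) & 0b11   # assumed position: bits 2 and 1
    return MEMORY_FORMATS[mf]


if __name__ == "__main__":
    for byte in (0xD9, 0xDB, 0xDD, 0xDF):
        print(hex(byte), memory_format(byte))
```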


[Instruction format table. Columns: Instruction (First Byte | Second Byte) and Optional Fields (DISP).]


ST(i) = Register stack element i
000 = Stack top
001 = Second stack element
...
111 = Eighth stack element


MOD (Mode field) and R/M (Register/Memory
specifier) have the same interpretation as the
corresponding fields of the 386 DX Microprocessor
instructions (refer to 386™ DX Microprocessor
Programmer’s Reference Manual).


SIB (Scale Index Base) byte and DISP (displacement)
are optionally present in instructions that have
MOD and R/M fields. Their presence depends on
the values of MOD and R/M, as for 386 DX
Microprocessor instructions.


The instruction summaries that follow assume that
the instruction has been prefetched, decoded, and is
ready for execution; that bus cycles do not require
wait states; that there is no local bus HOLD request
delaying processor access to the bus; and
that no exceptions are detected during instruction
execution. If the instruction has MOD and R/M fields
that call for both base and index registers, add one
clock.
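
The field definitions above can be illustrated with a minimal Python sketch. It assumes the two-byte memory-operand layout that appears in the summary rows (first byte: ESC, MF, one opcode bit; second byte: MOD, three opcode bits, R/M). The function names and the dictionary keys `op_bit` and `op_field` are hypothetical labels chosen for this illustration; they are not taken from the data sheet.

```python
ESC_PATTERN = 0b11011  # ESCAPE class identifier (top five bits of the first byte)

# MF encodings as listed in the field definitions.
MEMORY_FORMATS = {
    0b00: "32-bit real",
    0b01: "32-bit integer",
    0b10: "64-bit real",
    0b11: "16-bit integer",
}


def is_escape(first_byte: int) -> bool:
    """Return True when the first byte starts with the bit pattern 11011B."""
    return (first_byte >> 3) == ESC_PATTERN


def decode_memory_form(first_byte: int, second_byte: int) -> dict:
    """Split an instruction of the assumed form 'ESC MF x | MOD xxx R/M' into fields."""
    if not is_escape(first_byte):
        raise ValueError("not an ESCAPE-class instruction")
    return {
        "mf": MEMORY_FORMATS[(first_byte >> 1) & 0b11],
        "op_bit": first_byte & 0b1,             # opcode bit in the first byte
        "mod": (second_byte >> 6) & 0b11,
        "op_field": (second_byte >> 3) & 0b111,  # opcode bits in the second byte
        "rm": second_byte & 0b111,
    }
```

For example, the byte pair 0xDC, 0x45 has the form `ESC MF 0 | MOD 000 R/M` with MF = 10, so the sketch reports a 64-bit real memory operand with MOD = 01 and R/M = 101.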
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387™ DX MCP Extensions to the 386™ DX CPU Instruction Set 


Instruction | Encoding | Clock Count Range
Encoding columns: Byte 0, Byte 1, Optional Bytes 2-6
Clock count columns: 32-Bit Real, 32-Bit Integer, 64-Bit Real, 16-Bit Integer


DATA TRANSFER 
FLD = Load (a)
Integer/real memory to ST(0)   ESC MF 1   MOD 000 R/M   SIB/DISP   9-18   26-42   16-23   42-53
Long integer memory to ST(0)   ESC 111   MOD 101 R/M   SIB/DISP   26-54
Extended real memory to ST(0)   ESC 011   MOD 101 R/M   SIB/DISP   12-43
BCD memory to ST(0)   45-97
ST(i) to ST(0)   7-12

FST = Store 
ST(0) to integer/real memory 25-43 57-76 32-44 
ST(0) to ST(i) | 744 


FSTP = Store and Pop
ST(0) to integer/real memory   ESC MF 1   MOD 011 R/M   SIB/DISP   25-43   57-76   32-44
ST(0) to long integer memory   ESC 111   MOD 111 R/M   SIB/DISP   60-82
ST(0) to extended real   46-52
ST(0) to BCD memory   112-190
ST(0) to ST(i)   7-11

FXCH = Exchange 
ST(i) and ST(0)

COMPARISON 

FCOM = Compare 
Integer/real memory to ST(0) 13-25 34-52 14-27 


ST(i) to ST(0)   ESC 000   11010 ST(i)
FCOMP = Compare and pop
Integer/real memory to ST(0)   ESC MF 0
ST(i) to ST(0)
FCOMPP = Compare and pop twice
ST(1) to ST(0)
FTST = Test ST(0)


13-21 


13-25 34-52 14-27 39-62 


13-21 


FXAM = Examine ST(0)   ESC 001   1110 0101


CONSTANTS 


FLDZ = Load +0.0 into ST(0) 
FLD1 = Load +1.0 into ST(0) 
FLDPI = Load pi into ST(0) 
FLDL2T = Load log2(10) into ST(0)


Shaded areas indicate instructions not available in 8087/80287. 


NOTE: 
a. When loading single- or double-precision zero from memory, add 5 clocks. 
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387T™ DX MCP Extensions to the 386™ DX CPU Instruction Set (Continued) 


Instruction | Encoding | Clock Count Range
Encoding columns: Byte 0, Byte 1, Optional Bytes 2-6
Clock count columns: 32-Bit Real, 32-Bit Integer, 64-Bit Real, 16-Bit Integer


CONSTANTS (Continued) 


FLDL2E = Load log2(e) into ST(0)   ESC 001   1110 1010
FLDLG2 = Load log10(2) into ST(0)   ESC 001   1110 1100
FLDLN2 = Load loge(2) into ST(0)   ESC 001   1110 1101


ARITHMETIC 
FADD = Add 


Integer/real memory with ST(0)   ESC MF 0   MOD 000 R/M   SIB/DISP   34-56   15-34
ST(i) and ST(0)   ESC d P 0   11000 ST(i)   12-26b


FSUB = Subtract
Integer/real memory with ST(0)   ESC MF 0   MOD 10 R R/M   SIB/DISP   12-29   34-56   15-34   38-64c
ST(i) and ST(0)   12-26d
FMUL = Multiply
Integer/real memory with ST(0)   19-32   43-71   23-53   46-74
ST(i) and ST(0)   17-50e
FDIV = Divide
Integer/real memory with ST(0)   ESC MF 0   MOD 11 R R/M   SIB/DISP   101-114f   81-91
ST(i) and ST(0)   77-80h
FSQRT (i) = Square root   97-111


FSCALE = Scale ST(0) by ST(1)   ESC 001   1111 1101   44-82
FPREM = Partial remainder   ESC 001   1111 1000   56-140
FRNDINT = Round ST(0) to integer   ESC 001   1111 1100
FXTRACT = Extract components of ST(0)   ESC 001   1111 0100
FABS = Absolute value of ST(0)   ESC 001   1110 0001
FCHS = Change sign of ST(0)   ESC 001   1110 0000


Shaded areas indicate instructions not available in 8087/80287. 


NOTES:
b. Add 3 clocks to the range when d = 1.
c. Add 1 clock to each range when R = 1.
d. Add 3 clocks to the range when d = 0.
e. typical = 52 (When d = 0, 46-54, typical = 49).
f. Add 1 clock to the range when R = 1.
g. 135-141 when R = 1.
h. Add 3 clocks to the range when d = 1.
i. -0 < ST(0) < +∞.
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387T™ DX MCP Extensions to the 386™ DX CPU Instruction Set (Continued) 


Instruction | Encoding | Clock Count Range
Encoding columns: Byte 0, Byte 1, Optional Bytes 2-6


TRANSCENDENTAL 


FPTAN (k) = Partial tangent of ST(0)   ESC 001   1111 0010   162-430j
FPATAN = Partial arctangent   ESC 001   1111 0011   250-420
F2XM1 (l) = 2^ST(0) - 1   ESC 001   1111 0000   167-410
FYL2X (m) = ST(1) * log2(ST(0))   ESC 001   1111 0001   99-436
FYL2XP1 (n) = ST(1) * log2(ST(0) + 1.0)   ESC 001   1111 1001   210-447
PROCESSOR CONTROL


FINIT = Initialize MCP   ESC 011   1110 0011   33
FSTSW AX = Store status word   ESC 111   1110 0000   13
FLDCW = Load control word   ESC 001   MOD 101 R/M   SIB/DISP   19
FSTCW = Store control word   ESC 001   MOD 111 R/M   SIB/DISP   15
FSTSW = Store status word   ESC 101   MOD 111 R/M   SIB/DISP   15
FCLEX = Clear exceptions   ESC 011   1110 0010   11
FSTENV = Store environment   103-104
FLDENV = Load environment   71
FSAVE = Save state   375-376
FRSTOR = Restore state
FINCSTP = Increment stack pointer   ESC 001   1111 0111
FDECSTP = Decrement stack pointer   ESC 001   1111 0110
FFREE = Free ST(i)   ESC 101   11000 ST(i)
FNOP = No operations   ESC 001   1101 0000


Shaded areas indicate instructions not available in 8087/80287. 


NOTES:
j. These timings hold for operands in the range |x| < π/4. For operands not in this range, up to 76 additional clocks may be
needed to reduce the operand.
k. 0 < |ST(0)| < 2^63.
l. -1.0 < ST(0) < 1.0.
m. 0 < ST(0) < ∞, -∞ < ST(1) < +∞.
n. 0 < |ST(0)| < (2 - SQRT(2))/2, -∞ < ST(1) < +∞.
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APPENDIX A 
COMPATIBILITY BETWEEN 
THE 80287 AND THE 8087 


The 80286/80287 operating in Real-Address mode 
will execute 8086/8087 programs without major 
modification. However, because of differences in the 
handling of numeric exceptions by the 80287 MCP 
and the 8087 MCP, exception-handling routines may 
need to be changed. 


This appendix summarizes the differences between 
the 80287 MCP and the 8087 MCP, and provides 
details showing how 8086/8087 programs can be 
ported to the 80286/80287.


1. The MCP signals exceptions through a dedicated
ERROR# line to the 80286. The MCP error signal
does not pass through an interrupt controller (the
8087 INT signal does). Therefore, any
interrupt-controller-oriented instructions in numeric
exception handlers for the 8086/8087 should be
deleted.


2. The 8087 instructions FENI/FNENI and FDISI/FNDISI
perform no useful function in the 80287. If
the 80287 encounters one of these opcodes in its
instruction stream, the instruction will effectively
be ignored; none of the 80287 internal states will
be updated. While 8086/8087 code containing these
instructions may be executed on the
80286/80287, it is unlikely that the exception-
handling routines containing these instructions
will be completely portable to the 80287.


3. Interrupt vector 16 must point to the numeric
exception handling routine.


4. The ESC instruction address saved in the 80287
includes any leading prefixes before the ESC
opcode. The corresponding address saved in the
8087 does not include leading prefixes.


5. In Protected-Address mode, the format of the
80287's saved instruction and address pointers is
different from that of the 8087. The instruction
opcode is not saved in Protected mode, so exception
handlers will have to retrieve the opcode from
memory if needed.


6. Interrupt 7 will occur in the 80286 when executing
ESC instructions with either TS (task switched) or
EM (emulation) of the 80286 MSW set (TS = 1 or
EM = 1). If TS is set, then a WAIT instruction will
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also cause interrupt 7. An exception handler 
should be included in 80286/80287 code to handle
these situations.


7. Interrupt 9 will occur if the second or subsequent
words of a floating-point operand fall outside a
segment's size. Interrupt 13 will occur if the
starting address of a numeric operand falls outside a
segment's size. An exception handler should be
included in 80286/80287 code to report these
programming errors.


8. Except for the processor control instructions, all
of the 80287 numeric instructions are automatically
synchronized by the 80286 CPU: the 80286
automatically tests the BUSY# line from the
80287 to ensure that the 80287 has completed its
previous instruction before executing the next
ESC instruction. No explicit WAIT instructions are
required to assure this synchronization. For the
8087 used with 8086 and 8088 processors, explicit
WAITs are required before each numeric instruction
to ensure synchronization. Although
8086/8087 programs having explicit WAIT
instructions will execute perfectly on the
80286/80287 without reassembly, these WAIT
instructions are unnecessary.


9. Since the 80287 does not require WAIT instructions
before each numeric instruction, the
ASM286 assembler does not automatically generate
these WAIT instructions. The ASM86 assembler,
however, automatically precedes every ESC
instruction with a WAIT instruction. Although
numeric routines generated using the ASM86
assembler will generally execute correctly on the
80286/80287, reassembly using ASM286 may
result in a more compact code image.


The processor control instructions for the 80287
may be coded using either a WAIT or No-WAIT
form of mnemonic. The WAIT forms of these
instructions cause ASM286 to precede the ESC
instruction with a CPU WAIT instruction, in the
same manner as ASM86 does.
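
The interrupt assignments listed in items 3, 6 and 7 can be summarized in a short Python sketch, which may help when planning which handlers a ported program needs. The event names and the function name are hypothetical labels invented for this illustration; only the vector numbers and the TS/EM conditions come from the list above.

```python
from typing import Optional


def interrupt_vector(event: str, ts: bool = False, em: bool = False) -> Optional[int]:
    """Return the 80286 interrupt vector associated with an event, or None.

    event is one of the hypothetical labels:
      "numeric_exception"      - 80287 exception signalled on ERROR# (item 3)
      "esc"                    - an ESC instruction is executed (item 6)
      "wait"                   - a WAIT instruction is executed (item 6)
      "operand_overrun"        - second or later operand word outside the segment (item 7)
      "operand_start_outside"  - operand starting address outside the segment (item 7)
    ts and em model the TS and EM bits of the 80286 MSW.
    """
    if event == "numeric_exception":
        return 16
    if event == "esc" and (ts or em):
        return 7
    if event == "wait" and ts:
        return 7
    if event == "operand_overrun":
        return 9
    if event == "operand_start_outside":
        return 13
    return None
```

A port of an 8086/8087 program would, under this reading, need handlers installed for vectors 7, 9, 13 and 16.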


Intel 387™ DX MATH COPROCESSOR


DATA SHEET REVISION REVIEW 


The following list represents the key differences be- 
tween this and the -002 versions of the 387™ Math 
Coprocessor Data Sheet. Please review this summa- 
ry carefully. 


1. Updated Icc max and one specs to reflect 
CHMOS IV process. 


2. Updated instruction clock counts. 
3. Changed pins K3, L9 back to tie high instead of Vcc.
4. Corrected typographical errors in A.C. Character- 
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istics table. Affected pins were NUMCLK2 Rise 
Time test conditions, READYO# Min Out Delay at 
16 MHz, and Max Data Out Delay at 25 MHz. 
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The 82395DX is completely software transparent, protecting the integrity of system software. High perform- 
ance, low cost and board space saving are achieved due to the high integration and new write buffer architec- 
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0.0 DESIGNER SUMMARY 
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Figure 0.1 - 82385DX 196 Lead PQFP Package Pin Orientation 
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Pin Signal aie Pin 83 Signal 


Signal 


50 VCC 148 VCC
51 SA24 149 SD18
52 SA25 150 SD17
53 SA26 151 SD16
54 SA27 152 SD15
55 SA28 153 SD14
56 SA29 154 SD13
57 SA30 155 SD12
58 SA31 156 SD11
59 SBE0# 157 SD10
60 SBE1# 158 SD9
61 SBE2# 159 VSS
62 SBE3# 160 VCC
63 SLOCK# 161 SD8
64 VCC 162 SD7
65 VSS 163 SD6
66 SBLAST# 164 SD5
67 SBREQ 165 SD4
68 SHLDA 166 SD3
69 SM/IO# 167 SD2
70 SNENE# 168 SD1
71 SD/C# 169 SD0
72 SW/R# 170 READYO#
73 SFHOLD#
74 BE0#
75 BE1#
76 BE2#
77 BE3#
78 LOCK#
79 M/IO#
80 W/R#
81 D/C#
82 SKEN#
83 NPI#
84 LBA#
85 SWP#
86 SNA#
87 SBRDY#
88 SRDY#
89 SAHOLD
90 SHOLD
91 READYI#
92 SEADS#
93 FLUSH#
94 CONF#
95 A31
96 A30
97 A29


VSS 


Table 0.1 - 82395DX 196-Pin PQFP Pin Description 
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0.2 Quick Pin Reference 


What follows is a brief pin description. For more details refer to chapter 3. 


CLK2 This signal provides the fundamental timing for the 82395DX. All external timing
parameters are specified with respect to the rising edge of CLK2.
Local Address Bus 


A2-31 A2-31 are the Local Bus address lines. These signals, along with the byte enable
signals, define the physical area of memory or input/output space accessed.


BE0-3# The byte enable signals are used to determine which bytes are accessed in partial
cache write cycles. These signals are ignored for Cache Read Hit cycles. For all
System Bus memory read cycles (except the last three cycles of a Line Fill), these
signals are mirrored by the SBE0-3# signals.


Local Bus Cycle Definition


W/R#, D/C#, M/IO# The write/read, data/code and memory/input-output signals are the primary bus
definition signals directly connected to the 386 DX Microprocessor. They become
valid as the ADS# signal is sampled active. The bus definition signals are not driven
by the 386 DX Microprocessor during bus hold and follow the timing of the address
bus.


LOCK# The Local Bus LOCK# signal indicates that the current bus cycle is LOCK#ed.
LOCK#ed cycles are treated as non-cacheable cycles, except that LOCK#ed write
hit cycles update the cache.


Local Bus Control


ADS# The address status pin, an output of the 386 DX Microprocessor, indicates that new
and valid information is currently available on the Local Bus. The signals that are valid
when ADS# is activated are:

A2-31, BE0-3#, W/R#, D/C#, M/IO#, LOCK#, NPI# and LBA#


READYI# This is the READY input signal seen by the Local Bus master. Typically it is a logical
OR between the 82395DX generated READYO# and READY# signals generated by
other Local Bus masters (optional). It is used by the 82395DX, along with the ADS#
signal, to keep track of the 386 DX Microprocessor bus state.


READYO# This is the Local Bus READY output that is used to terminate all types of 386 DX
Microprocessor bus cycles, except for 386 DX Microprocessor Local Bus cycles which
must be terminated by the Local Bus device being accessed. This signal is wired-OR
with parallel 82395DX READYO# signals in a multi-82395DX system.

The READYO# pin may serve as READY# for the 387 DX Math Coprocessor.


RESET

RESET The RESET signal forces the 82395DX to begin execution at a known state. The
RESET falling edge is used by the 82395DX to set the phase of its internal clock
identical to the 386 DX Microprocessor's internal clock. RESET falling edge must
satisfy the appropriate setup and hold times (T14, T15b) for proper chip operation.
RESET must remain active for at least 1 ms after the power supply and CLK2 input
have reached their proper DC and AC specifications.

Configuration 

CONF# The activity on the CONF# input during and after RESET allows the 82395DX to
configure itself to operate in the specified address range. Refer to chapter 4 for
operation with 1, 2 or 4 82395DXs.
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0.2 Quick Pin Reference (Continued) 


Local Data Bus 


D0-31 I/O These are the Local Bus data lines of the 82395DX. They must be connected to the
D0-31 pins of the 386 DX Microprocessor.


Local Bus Decode Pins


LBA# This is the Local Bus Access indication. It instructs the 82395DX that the cycle currently
in progress is targeted to a Local Bus device. This results in the cycle being ignored by
the 82395DX. The 387 DX Math Coprocessor is considered a Local Bus device, but
LBA# need not be generated for it. If LBA# is asserted at the falling edge of RESET,
accesses to the Weitek 3167 Floating-Point Coprocessor address space are decoded as
Local Bus cycles. Note that LBA# cycles have priority over all other cycle types.


NPI# The No Post Input signal instructs the 82395DX that the write cycle currently in progress
must not be posted in the write buffer. NPI# is sampled at the falling edge of CLK at the
end of T1 (see figure 5.1).


Address Mask

A20M# Address bit 20 Mask, when active, forces the A20 input as seen by the 82395DX to logic
"0", regardless of the actual value on the A20 input pin. A20M# emulates the address
wraparound at 1 MByte which occurs on the 8086. This pin is asynchronous but must
meet setup and hold times (t47 and t48) to guarantee recognition in a specific clock. It
must be asserted two clock cycles before ADS# is sampled active (see figure 5.3). It
must be stable throughout Local Bus memory cycles.

System Address Bus 

SA2-3 (O), SA4-31 (I/O) These are the System Bus address lines of the 82395DX. When driven by the 82395DX,
these signals, along with the System Bus byte enables, define the physical area of
memory or input/output space being accessed.
During bus HOLD or address HOLD, the I/O signals serve as inputs for the cache
invalidation cycle.


SBE0-3# These are the Byte Enable signals for the System Bus. The 82395DX drives these pins
identically to BE0-3# in all System Bus cycles except Line Fills. In Line Fills these
signals are driven identically to BE0-3# for the first read cycle of the Line Fill. They are
all driven active in the remaining cycles of the Line Fill.


System Bus Cycle Definition


SW/R#, SD/C#, SM/IO# The System Bus write/read, data/code and memory/input-output signals are the
System Bus cycle definition pins. When the 82395DX is the System Bus master, it drives
these signals identically to the 386 DX Microprocessor cycle definition encoding.


SLOCK# The System Bus LOCK# signal indicates that the current cycle is LOCK#ed. The
82395DX has exclusive access to the System Bus across bus cycle boundaries until this
signal is negated. The 82395DX does not acknowledge a bus HOLD request while this
signal is asserted. The 82395DX asserts SLOCK# when the System Bus is available
and a LOCK#ed cycle was started on the Local Bus that requires System Bus service.
SLOCK# is negated only after completion of all LOCK#ed System Bus cycles and
negation of the LOCK# signal.
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0.2 Quick Pin Reference (Continued) 


Symbol | Type | Function


System Bus Control


SADS# The System Bus ADdress Status signal is used to indicate that new and valid information
is currently being driven onto the System Bus. The signals that are valid when SADS# is
driven low are:

SA2-31, SBE0-3#, SW/R#, SD/C#, SM/IO# and SLOCK#


SRDY# The System Bus ReaDY# signal indicates that the current System Bus cycle is complete.
When SRDY# is sampled asserted it indicates one of two things. In response to a read
request it indicates that the external system has presented valid data on the system data
bus. In response to a write request it indicates that the external system has accepted the
82395DX's data. This signal is ignored when the System Bus is in STi, STH, ST1 or ST1P
states.

At the first read cycle of a Line Fill, SRDY#, SBRDY# and SNA# determine if the Line Fill
will proceed as a burst/non-burst, pipelined/non-pipelined Line Fill.

Once a burst Line Fill has started, if SRDY# is returned in the 2nd or 3rd DW, the burst
Line Fill will be interrupted and the cache will not be updated. The 1st DW will already
have been transferred to the CPU. In the 4th DW of a Line Fill both SRDY# and
SBRDY# have the same effect. They indicate the end of the Line Fill.


SNA# The System Bus Next Address signal, when active, indicates that a pipelined address
cycle will be executed. It is sampled by the 82395DX at the rising edge of CLK in ST2 and
ST1P cycles. If this signal is sampled active then burst Line Fills are disabled. This signal
is ignored once a burst Line Fill begins.


Bus Arbitration


SBREQ The System Bus REQuest signal is the internal cycle pending signal. This indicates to the
outside world that internally the 82395DX has generated a bus request (due to a CPU
request that requires access to the System Bus). It is generated whether the 82395DX
owns the bus or not and can be used to arbitrate among the various masters on the
System Bus. If the bus is available and the cycle starts immediately, this signal will not be
activated for cache read miss cycles.


SHOLD The System Bus HOLD request indicates that another master must have complete
control of the entire System Bus. When SHOLD is sampled asserted, the 82395DX
completes the current System Bus cycle or sequence of LOCK#ed cycles before driving
SHLDA active. In the same clock that SHLDA goes active, all the System Bus outputs and
I/O pins are floated (with the exception of SHLDA and SBREQ). The 82395DX stays in
this state until SHOLD is negated. SHOLD is recognized during RESET.


SHLDA The System Bus HOLD Acknowledge signal is driven active by the 82395DX in response
to a hold request. It indicates that the 82395DX has given the bus to another System Bus
master. It is driven active in the same clock that the 82395DX floats its System Bus.
When leaving a bus HOLD, SHLDA is driven inactive and the 82395DX resumes driving
the bus in the same clock. The 82395DX is able to support CPU Local Bus activities
during System Bus HOLD.
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0.2 Quick Pin Reference (Continued) 


Symbol | Type | Function


Bus Arbitration (Continued)


SFHOLD# The System Bus Fast HOLD Request signal indicates that another master needs
immediate access to the System Bus. In response to SFHOLD# being sampled active,
the 82395DX stops driving (in the next clock) the System Bus output and I/O pins
(except SHLDA and SBREQ). Because the 82395DX always stops driving the System
Bus in response to SFHOLD# active, no acknowledge is required. The System Bus
output and I/O pins remain in the high impedance state until SFHOLD# is negated.

It is the responsibility of the system designer to guarantee that bus cycles that are in
progress when SFHOLD# is asserted are terminated correctly. This pin is recognized
during RESET.


Burst Control


SBRDY# The System Bus Burst ReaDY signal performs the same function during a burst cycle
that SRDY# does in a non-burst cycle. SBRDY# asserted indicates that the external
system has presented valid data on the data pins in response to a burst Line Fill cycle.
This signal is ignored when the System Bus is in STi, STH, ST1 or ST1P states.

Note that in the fourth bus cycle of a Line Fill, SBRDY# and SRDY# have the same
effect on the 82395DX. They indicate the end of the Line Fill. For all cycles other than
burst Line Fills, SBRDY# and SRDY# have the same effect on the 82395DX.


SBLAST# The System Bus Burst LAST cycle indicator signal indicates that the next time
SBRDY# is returned the burst cycle is complete. It indicates to the external system
that the next SBRDY# returned is treated as a normal SRDY# by the 82395DX.
Another set of addresses will be driven with SADS# or the System Bus will go idle.
SBLAST# is normally active. In a cache read miss cycle, which may proceed as a Line
Fill, SBLAST# starts active. After determining whether or not the cycle is cacheable
via SKEN#, SBLAST# is driven inactive. If it is a cacheable cycle and SBRDY#
terminates the first DW of the Line Fill (a burst Line Fill), SBLAST# will be driven active
when the data is valid for the fourth DW of the Line Fill. If SRDY# terminates the first
DW of the Line Fill (a non-burst Line Fill), SBLAST# is driven active in the cycle where
SRDY# was sampled active.


Cache Invalidation


SAHOLD The System Bus Address HOLD request allows another bus master access to the
address bus of the 82395DX. This is to indicate the address of an external cycle for
performing an internal cache directory lookup and invalidation cycle. In response to
this signal the 82395DX stops driving the System Bus address pins in the next cycle.
No HOLD Acknowledge is required. Other System Bus signals can remain active
during address hold. The 82395DX does not initiate another bus cycle during address
hold. This pin is recognized during RESET.


SEADS# The System Bus External ADdress Strobe signal indicates that a valid external address
has been driven onto the 82395DX System Bus address pins. This address will be
used to perform an internal cache invalidation cycle. The maximum invalidation cycle
rate is one every two clock cycles.
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0.2 Quick Pin Reference (Continued) 


Symbol | Type | Function


Cache Control


FLUSH# The FLUSH# pin, when sampled active for four clock cycles or more, causes the
82395DX to invalidate its entire TAG array. In addition, it is used to configure the
82395DX to enter various test modes. For details refer to chapter 7. This signal is
asynchronous but must meet setup and hold times to guarantee recognition in any
specific clock.


System Data Bus


SD0-31 The System Bus Data lines of the 82395DX must be driven with appropriate setup and
hold times for proper operation. These signals are driven by the 82395DX only during
write cycles.


System Bus Decode Pins


SKEN# The System Cacheability ENable signal is used to determine if the current cycle running
on the System Bus is cacheable or not. When the 82395DX generates a read cycle,
SKEN# is sampled one clock before the first SBRDY# or SRDY# or one cycle before
the first SNA# is sampled active (see chapter 6). If SKEN# is sampled active the cycle
will be transformed into a Line Fill. Otherwise, the cache and cache directory will be
unaffected. Note that SKEN# is ignored after the first cycle in a Line Fill. SKEN# is
ignored for all System Bus cycles except for cache read miss cycles.


SWP# The System Write Protect indicator signal is used to determine whether the current
System Bus Line Fill cycle is write protected or not. In non-pipelined cycles, SWP# is
sampled with the first SRDY# or SBRDY# of the Line Fill. In pipelined cycles, SWP# is
sampled one clock phase after the first SNA# is sampled active (see figures 6.9-10).
The Write Protect bit is sampled together with the TAG of each line in the 82395DX
Cache Directory. In every cacheable write cycle the Write Protect bit is read. If active,
the cycle will be a write protected cycle, which is treated like a cacheable write miss
cycle. It is buffered and it does not update the cache even if the addressed location is
present in the cache.


Design Aids


SNENE# The System NExt NEar indicator signal indicates that the current System Bus memory
cycle is to the same 2048 byte area as the previous memory cycle. Address lines A11-31
of the current System Bus memory cycle are identical to address lines A11-31 of the
previous memory cycle.

SNENE# can be used in an external DRAM system to run CAS# only cycles, thereby
increasing the throughput of the memory system. SNENE# is valid for all memory
cycles, and indicates that the current memory cycle is to the same 2048 byte area, even
if there were idle or non-memory bus cycles since the last System Bus memory cycle.
For the first cycle after the 82395DX has exited the HOLD state, or after SAHOLD was
deactivated, this pin will be inactive.
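
Two of the address relationships described in the pin reference above, the A20M# mask and the SNENE# "same 2048 byte area" condition, reduce to simple bit operations. The Python sketch below is an illustration only: the function names are hypothetical, addresses are treated as 32-bit integers, and the rule that SNENE# is inactive for the first cycle after HOLD or SAHOLD is not modelled.

```python
A20_BIT = 1 << 20        # address bit 20
ADDRESS_MASK = 0xFFFFFFFF  # 32-bit physical address
AREA_SHIFT = 11          # 2048-byte area: bits A11-31 must be identical


def apply_a20m(address: int, a20m_active: bool) -> int:
    """Return the address as seen by the cache when A20M# is (in)active."""
    address &= ADDRESS_MASK
    if a20m_active:
        # A20 is forced to logic 0, giving the 1 MByte wraparound.
        address &= ~A20_BIT
    return address


def same_2k_area(previous: int, current: int) -> bool:
    """True when two memory cycles share address lines A11-31."""
    return (previous >> AREA_SHIFT) == (current >> AREA_SHIFT)
```

With A20M# active, an access to 0x100000 is seen as 0x000000, which is the wraparound behaviour the pin description attributes to the 8086.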
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1.0 82395DX FUNCTIONAL 
OVERVIEW


1.1 Introduction 


The primary function of a cache is to provide local 
storage for frequently accessed memory locations. 
The cache intercepts memory references and
handles them directly without transferring the request to
the System Bus. This results in lower traffic on the 
System Bus and decreases latency on the local bus. 
This leads to improved performance for a processor 
on the Local Bus. By providing fast access to fre- 
quently used code and data, the cache is able to 
reduce the average memory access time of the 386 
DX Microprocessor based system. 


The 82395DX is a single chip cache subsystem
specifically designed for use with the 386 DX
Microprocessor. The 82395DX integrates 16KB cache, the
Cache Directory and the Cache Control logic onto
one chip.


The 82395DX is expandable such that larger cache
sizes are supported by cascading 82395DXs. In a
single 82395DX system, the 82395DX can map 4
Gigabytes of main memory into a 16KB cache. In
the maximum configuration of a four 82395DX
system, the 4 Gigabytes of main memory are mapped
into a 64KB cache. The cache is unified for code
and data and is transparent to application software.
The 82395DX provides a cache consistency
mechanism which guarantees that the cache has the most
recently updated version of the main memory.
Consistency support has no performance impact on the
386 DX Microprocessor. Section 1.2 covers all the
82395DX features.


The 82395DX cache architecture is similar to the
i486 Microprocessor's on-chip cache. The cache is
four Way set associative with a Pseudo LRU
replacement algorithm. The line size is 16B and a full line is
retrieved from the memory on every cache miss. A TAG
is associated with every 16B line.
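
The figures quoted above (16KB cache, four ways, 16B lines) fix the number of lines and sets for a single 82395DX. The Python sketch below works through that arithmetic and then splits a 32-bit address into tag, set index and byte offset. The split assumes conventional set-associative indexing on the low-order address bits; that is an assumption made for illustration, not a statement of how the 82395DX Cache Directory is actually organized (chapter 2 covers the real organization).

```python
CACHE_BYTES = 16 * 1024  # single 82395DX
LINE_BYTES = 16
WAYS = 4
ADDRESS_BITS = 32

LINES = CACHE_BYTES // LINE_BYTES                   # total cache lines
SETS = LINES // WAYS                                # lines per way
OFFSET_BITS = LINE_BYTES.bit_length() - 1           # bits selecting a byte in a line
INDEX_BITS = SETS.bit_length() - 1                  # bits selecting a set
TAG_BITS = ADDRESS_BITS - OFFSET_BITS - INDEX_BITS  # remaining high-order bits


def split_address(address: int) -> tuple:
    """Return (tag, set_index, byte_offset) under the assumed indexing."""
    offset = address & (LINE_BYTES - 1)
    index = (address >> OFFSET_BITS) & (SETS - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    return tag, index, offset
```

Under these assumptions a single device holds 1024 lines in 256 sets, leaving a 20-bit tag per line.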


The 82395DX architecture allows for cache read hit
cycles to run on the Local Bus even when the
System Bus is not available. The 82395DX incorporates a
new write buffer cache architecture, which allows
the 386 DX Microprocessor to continue operation
without waiting for write cycles to actually update the
main memory.


A detailed description of the cache operation and
parameters is included in chapter 2.


The 82395DX has an interface to two electrically
isolated busses. The interface to the 386 DX
Microprocessor bus is referred to as the Local Bus (LB)
interface. The interface to the main memory and
other system devices is referred to as the 82395DX
System Bus (SB) interface. The SB interface
emulates the 386 DX Microprocessor. The SB interface,
as does the 386 DX Microprocessor, can be
pipelined.




Figure 1.1 - System Block Diagram 
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In addition, it is enhanced by an optional burst mode 
for Line Fills. The burst mode provides faster line fills 
by allowing consecutive read cycles to be executed 
at a rate of up to one DW per clock cycle. Several 
bus masters (or several 82395DXs) can share the 
same System Bus and the arbitration is done via the 
SHOLD/SHLDA/SBREQ mechanism (similar to the 
i486 Microprocessor) along with SFHOLD#. Using 
these arbitration mechanisms, the 82395DX is able 
to support a multiprocessor system (multi 386 DX 
Microprocessor/82395DX systems sharing the
same memory). 


Cache consistency is maintained by the SAHOLD/ 
SEADS# snooping mechanism, similar to the i486 
microprocessor. The 82395DX is able to run a zero
wait state 386 DX Microprocessor non-pipelined 
read cycle if the data exists in the cache. Memory 
write cycles can run with zero wait states if the write 
buffer is not full. 


The 82395DX cache organization provides a higher 
hit rate than other standard configurations. The 
82395DX, featuring the new high performance write 
buffer cache architecture, provides full concurrency 
between the electrically isolated Local Bus and Sys- 
tem Bus. This allows the 82395DX to service read 
hit cycles on the Local Bus while running line fills or 
buffered write cycles on the System Bus. Moreover, 
the user has the option to expand the cache system
up to 64KB. 


1.2 Features 


1.2.1 82385-LIKE FEATURES 


• The 82395DX maps the entire physical address
range of the 386 DX Microprocessor (4GB) into
16KB, 32KB, or 64KB cache (with one, two, or
four 82395DXs respectively).


• Unified code and data cache.


• Cache attributes are handled by hardware. Thus
the 82395DX is transparent to application
software. This preserves the integrity of system
software and protects the user's software investment.


• Double Word, Word and Byte writes, Double
Word reads.


• Zero wait states in read hits and in buffered write
cycles. All 386 DX Microprocessor cycles are
non-pipelined. (Note: The 386 DX Microprocessor
must never be pipelined when used with the
82395DX; NA# must be tied to Vcc.)


• A hardware cache FLUSH# option. The
82395DX will invalidate all the Tag Valid bits in
the Cache Directory and clear the System Bus
line buffer when FLUSH# is activated for a
minimum of four CLKs.
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e The 82395DX supports non-cacheable accesses. 
The 82395DX internally decodes the 387 DX 
Math Coprocessor accesses as Local Bus cy- 
cles. 


e The system bus interface emulates a 386 DX Mi- 
croprocessor interface. 


e The 82395DX supports pipelined and non-pipe- 
lined system interface. 


e Provides cache consistency (snooping): The 
82395DX monitors the System Bus address via 
SEADS# and invalidates the cache address if 
the System Bus address matches a cached loca- 
tion. 


1.2.2 NEW FEATURES 


• 16KB on chip cache arranged in four banks, one
bank for each way. In Read hit cycles, one DW is
read. In a write hit cycle, any byte within the DW
can be written. In cache fill cycle, the whole line
(16B) is written. This large line size increases the
hit rate over smaller line size caches.


• Cache architecture similar to the i486
Microprocessor cache: Four Way SET associative with
Pseudo LRU replacement algorithm. Line size is
16B and a full line is retrieved from memory for
every cache miss. Tag, Tag Valid Bit and Write
Protect Bit are associated with every Line.


• New write buffer architecture with four DW deep
write buffer provides zero wait state memory write
cycles. I/O, Halt/Shutdown and LOCK#ed writes
are not buffered.


• Concurrent Line Buffer Cacheing: The 82395DX
has a line buffer that is used as additional
memory. Before data gets written to the cache memory
at the completion of a Line Fill it is stored in this
buffer. Cache hit cycles to the line buffer can
occur before the line is written to the cache.


• Expandable: two 82395DXs support 32KB cache
memory, four 82395DXs support 64KB cache
memory. This gives the user the option of
configuring a system to meet their own performance
requirements.


• In 387 DX Math Coprocessor accesses, the
82395DX drives the READYO# in one wait state
if the READYI# was not driven in the previous
clock.


Note that the timing of the 82395DX's READYO#
generation for 387 DX Math Coprocessor cycles
is incompatible with 80287 timing.


• The 82395DX optionally decodes CPU accesses
to Weitek 3167 Floating-Point Coprocessor
address space (C0000000H-C1FFFFFFH) as
Local Bus cycles. This option is enabled or disabled
according to the LBA# pin value at the falling
edge of RESET.


• An enhanced System Bus interface:


a) Burst option is supported in line-fills similar to
the i486 Microprocessor. SBRDY# (System
Burst READY) is provided in addition to
SRDY#. A burst is always a 16 byte cache
update which is equivalent to four DW cycles.
The i486 Microprocessor burst order is
supported.


b) System cacheability attribute is provided
(SKEN#). SKEN# is used to determine
whether the current cycle is cacheable. It is
used to qualify Line Fill requests.


c) SHOLD/SHLDA/SBREQ system bus
arbitration mechanism is supported, the same as in
the i486 Microprocessor. A Multi 386 DX/
82395DX cluster can share the same System
Bus via this mechanism.


d) SNENE# output (Next Near) is provided to
simplify the interface to DRAM controllers.
DRAM page size of 2K is supported.


e) Fast HOLD function (SFHOLD#) is provided.
This function allows for multiprocessor
support.


f) Cache invalidation cycles supported via
SEADS#. This is the mechanism used to
provide cache coherency.


• Full Local Bus/System Bus concurrency is
attained by:


a) Servicing cache read hit cycles on the Local
Bus while completing a Line Fill on the System
Bus. The data requested by the 386 DX
Microprocessor was provided over the local bus as
the first part of the Line Fill.


b) Servicing cache read hit cycles on the Local
Bus while executing buffered write cycles on
the system bus.


c) Servicing cache read hit cycles on the Local
Bus while another bus master is running (DMA,
other 386 DX Microprocessor, 82395DX, i486
Microprocessor, etc.) on the System Bus.


d) Buffering write cycles on the Local Bus while
the system bus is executing other cycles.
• Write protected areas are supported by the
SWP# input. This enables caching of ROM
space or shadowed ROM space.


• No Post Input (NPI#) provided for disabling of
write buffers per cycle. This option supports
memory mapped I/O designs.


• A20M# input provided for emulation of 8086
address wrap-around.


• SRAM test mode, in which the TAGRAM and the
cache RAM are treated as standard SRAM, is
provided. A Tristate Output test mode is also
provided for system debugging. In this mode the
82395DX is isolated from the other devices in the
board by floating all its outputs.


• Single chip, 196 lead PQFP package, 1 micron
CHMOS-IV technology.
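

The burst option listed under the enhanced System Bus interface states that the i486 Microprocessor burst order is supported. As a rough illustration, the sketch below generates the four DW addresses of a 16 byte line fill in the commonly documented i486 order, in which the requested DW comes first and the remaining offsets follow by XOR with 4, 8 and 0Ch. The function name is hypothetical, and the sequence should be checked against the i486 data sheet before relying on it.

```python
def burst_order(first_address):
    """Return the four DW addresses of one line fill, requested DW first.

    Assumes the i486-style order: the offset within the 16 byte line is
    XORed with 0, 4, 8 and 0xC in turn (an assumption, not quoted from
    this data sheet).
    """
    line_base = first_address & ~0xF          # start of the 16 byte line
    first_offset = first_address & 0xC        # DW-aligned offset (A2-A3)
    return [line_base | (first_offset ^ step) for step in (0x0, 0x4, 0x8, 0xC)]


if __name__ == "__main__":
    for start in (0x1000, 0x1004, 0x1008, 0x100C):
        print(hex(start), [hex(a) for a in burst_order(start)])
```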


2.0 82395DX CACHE SYSTEM
DESCRIPTION


2.1 82395DX Cache Organization 


The on chip cache memory is a unified code and
data cache. The cache organization is 4 Way SET
Associative and each Line is 16 bytes wide (see
Figure 2.1). The 16K bytes of cache memory are
logically organized as four 4KB banks, one bank for each
Way. Each bank contains 256 16B lines, one line
for each SET.


The Cache Directory is used to determine whether
the data in the cache memory is valid for the
address being accessed. The Cache Directory
contains 256 TAG's (each TAG is 22 bits wide) for each
Way, for a total of 1K TAG's (see Figure 2.2). With
each 20 bit TAG Address there is a TAG Valid Bit
and a Write Protect bit. The Cache Directory also
contains the LRU bits. The LRU bits are used to
determine which Way to replace whenever the
cache needs to be updated with a new line and all
four ways contain data.
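

The sizes quoted above follow from simple arithmetic on the cache geometry. The short sketch below recomputes them; the constant names are illustrative and are not taken from the data sheet.

```python
# Cache geometry as described in section 2.1 (names are illustrative).
ADDRESS_BITS = 32      # 386 DX physical address width
WAYS = 4               # 4 Way SET Associative
SETS_PER_WAY = 256     # one line per SET in each Way
LINE_BYTES = 16        # four DWs per line

cache_bytes = WAYS * SETS_PER_WAY * LINE_BYTES          # 16KB
offset_bits = (LINE_BYTES - 1).bit_length()             # A0-A3
set_bits = (SETS_PER_WAY - 1).bit_length()              # A4-A11
tag_address_bits = ADDRESS_BITS - set_bits - offset_bits  # A12-A31
tag_entry_bits = tag_address_bits + 2   # plus TAG Valid and Write Protect
total_tags = WAYS * SETS_PER_WAY        # 1K TAGs

if __name__ == "__main__":
    print(cache_bytes, tag_address_bits, tag_entry_bits, total_tags)
```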


Table 2.1 lists the 82395DX cache organization.


Table 2.1 - 82395DX Cache Organization

Cache Element     | Size/Qty | Comments
TAG               | 1K       | Total number of TAG's
SET               | 256      | Cache Directory offset
LRU               | 256      | 3 bits per SET address
Way               | 4        | 4 TAG's per SET address
Line Size         | 16B      | 4 DW's
Sector Size       | 16B      | 4 DW's, one line per sector
Cache Size        | 16KB     | Expandable to 64KB
Cache Directory   |          | TAG address, TAG Valid Bit, and Write Protect Bit for each Way for each SET address (256 SET's x 4 Ways), and LRU bits.
TAG Valid Bit     | 1K       | 1 for each TAG in the cache directory, indicates valid data is in the cache memory.
Write Protect Bit | 1K       | 1 for each TAG in the cache directory, indicates that the address is write protected.


Figure 2.1 - 82395DX Cache Organization
[Address bits A31-A12 form the TAG address, A11-A4 the SET address, and A3-A2 the DW select; each SET (through SET 255) holds a 16 byte line of four DWs in the data SRAM.]


Figure 2.2 - 82395DX Cache Directory Organization
[For each SET (through SET 255), each Way holds a 20 bit TAG, a 1 bit TAG Valid and a 1 bit Write Protect; LRU bits are kept per SET.]


2.1.1 82395DX CACHE STRUCTURE AND
TERMINOLOGY


The 82395DX cache parameters are defined
in detail here.


A Line is the basic unit of data transferred between
the cache and main memory. In the 82395DX each
Line is 16B. A Line is also known as a transfer block.
The decision of a cache "hit or miss" is determined
on a per Line basis. A cache hit results when the
TAG address of the current address being accessed
matches the TAG address in the Cache Directory
(see Figure 2.3) and the TAG Valid bit is set. The
82395DX has 1K Lines.


A TAG is a storage element of the Cache Directory
with which the hit/miss decision is made. The TAG
consists of the TAG address (A31-A12), the TAG
Valid bit and the Write Protect bit. Since many
addresses map to a single line, the TAG is used to
determine whether the data associated with the
current address is present in the cache memory (a
cache hit). This is done through a comparison of the
TAG address bits of the current address and the
contents of the Cache Directory, along with the TAG
Valid bit. Each line in the cache memory has a TAG
associated with it.


Figure 2.3 - 82395DX Cache Hit Logic
[The 386 DX Microprocessor address A2-A31 is split into the DW Select (A2-A3) and the SET Address (A4-A11); the SET Address indexes the TAG Directory and the SRAM Data Array.]


A TAG Valid Bit is associated with each TAG ad- 
dress in the Cache Directory. It determines if the 
data held in the cache memory for the particular 
TAG address is valid. It is used to determine whether 
the data in the cache is a match to data in main 
memory. 


A Write Protect Bit is also associated with each 
TAG address in the Cache Directory. This field de- 
termines if the cache memory can be written to. It is 
set by the SWP# pin during Line Fill cycles (see 
chapter 6). 


A SET address is a decoded portion of the Local
Bus address that maps to 1 TAG address per Way in
the Cache Directory. All the TAG's associated with a
particular SET are simultaneously compared with
the TAG field of the bus address to make the
hit/miss decision. The 82395DX provides 256 SET
addresses; each SET maps to four lines in the cache
memory.


The term Way as in 4 Way SET Associative de- 
scribes the degree of associativity of the cache sys- 
tem. Each Way provides TAG Address, TAG Valid 
bit, and Write Protect bit storage, 1 entry for each 
SET address. A simultaneous comparison of one 
TAG address from each Way with the bus address is 
done in order to make the hit/miss decision. The 
82395DX is 4 Way SET Associative.


Other key 82395DX features include:


Cache Size - The 82395DX contains 16KB of cache 
memory. This can be expanded by connecting two 
or four 82395DX’s in parallel to get up to 64KB of 
cache memory. Expanding the cache in this way re- 
sults in an increased number of Tags with a constant 
number of lines per Tag. The cache is organized as 
four banks of 4KB. Each of the four banks corre- 
sponds to a particular Way. 


Update Policy - The update policy deals with how 
main memory is updated when a cacheable write 
cycle is issued on the Local Bus. The 82395DX sup- 
ports the write buffer policy, similar to the write 
through policy, which means that main memory is 
always updated in every write cycle. However, the 
cache is updated only when the write cycle hits the 
cache. Also, the 82395DX is able to cache write
protected areas, e.g. ROMs, by preventing the cache
update if the write cycle hits a write protected line. A 
write cycle to main memory is buffered as explained 
in chapter 6. 
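

The update policy can be summarized in a few lines of code. The sketch below is a simplified model, not the device's implementation: `write_actions` and `WriteBuffer` are hypothetical names, the four entry depth comes from the feature list, and bus timing is ignored.

```python
from collections import deque

WRITE_BUFFER_DEPTH = 4  # four DW deep write buffer, per the feature list


def write_actions(cache_hit, line_write_protected):
    """Decide what a cacheable memory write updates.

    Main memory is always updated; the cache is updated only on a hit
    to a line that is not write protected.
    """
    return {
        "update_main_memory": True,
        "update_cache": cache_hit and not line_write_protected,
    }


class WriteBuffer:
    """Toy model of a posted-write FIFO (illustrative only)."""

    def __init__(self, depth=WRITE_BUFFER_DEPTH):
        self.depth = depth
        self.entries = deque()

    def post(self, address, data):
        """Return True if the write was buffered (zero wait states),
        False if the buffer is full and the CPU would have to wait."""
        if len(self.entries) >= self.depth:
            return False
        self.entries.append((address, data))
        return True

    def drain_one(self):
        """Complete the oldest buffered write on the System Bus."""
        return self.entries.popleft() if self.entries else None
```

Note that I/O, Halt/Shutdown and LOCK#ed writes are not buffered, so a fuller model would bypass `post` for those cycle types.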


Replacement - When a new line is needed to
update the cache, the Tag Valid bits are checked to
see if any of the four ways are available. If they are
all valid it is necessary to replace an old line that is
already in the cache. In the 82395DX, the Pseudo
LRU (least recently used) algorithm is adopted. The
Pseudo LRU algorithm targets the least recently
used line associated with the SET for replacement.
(Pseudo LRU is described in section 2.2.)


Consistency - The 82395DX implements hooks for
a consistency mechanism. This is to guarantee that
in systems with multiple caches (and/or with
multiple bus masters) all processor requests result in
returning correct and consistent data. Whenever a
system bus master performs memory accesses to
data which also exists in the cache, the System Bus
master can invalidate that entry in the 82395DX.
This invalidation is done by using SEADS#
(description in chapter 6).


The invalidation is performed by marking the TAG as
invalid (the TAG Valid bit is cleared). Thus, the next
time a Local Bus request is made to that location,
the 82395DX accesses the main memory to get the
most recent copy of the data.


2.2 Pseudo LRU Algorithm 


When a line needs to be placed in the internal cache
the 82395DX first checks to see if there is a
non-valid line in the SET that can be replaced. The
validity is checked by looking at the TAG Valid bit. The
order that is used for this check is Way 0, Way 1,
Way 2, and Way 3. If all four lines associated with
the SET are valid, a pseudo Least Recently Used
algorithm is used to determine which line will be
replaced. If a non-valid line is found, that line is
marked for replacement. All the TAG Valid bits are
cleared when the 82395DX is RESET or when the
cache is FLUSH#ed. Three bits, B0, B1, and B2, are
defined for each of the 256 SETs. These bits are
called the LRU bits and are stored in the cache
directory. The LRU bits are updated for every access
to the cache.


If the most recent access to the cache was to Way 0
or Way 1 then B0 is set to 1. B0 is set to 0 if the
most recent access was to Way 2 or Way 3. If the
most recent access to Way 0 or Way 1 was to Way 0,
B1 is set to 1. Else B1 is set to 0. If the most recent
access to Way 2 or Way 3 was to Way 2, B2 is set
to 1. Else B2 is set to 0. See Table 2.2.


Figure 2.4 - Pseudo LRU Decision Tree
[Are all four lines in the set valid? If not, replace a non-valid line. Otherwise, if Way 0 or Way 1 was least recently used, replace Way 0 or Way 1; if Way 2 or Way 3 was least recently used, replace Way 2 or Way 3.]


Figure 2.5 - Four Way Set Associative Cache Organization


The Pseudo LRU algorithm works in the following 
manner. When a line must be replaced, the cache 
will first select which of Way 0 and Way 1 or Way 2 
and Way 3 was least recently used. Then the cache 
will select which of the two lines was least recently 
used and mark it for replacement. The decision tree 
is shown in Figure 2.4. When the 82395DX is RESET 
or the cache is FLUSH#ed all the LRU bits are 
cleared along with the TAG Valid bits. 
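

The B0/B1/B2 rules and the replacement decision described in this section can be sketched as follows. This is a minimal model of one SET under the bit definitions given in the text; the class and method names are hypothetical.

```python
class PseudoLRU:
    """Pseudo LRU state for one SET, using the B0/B1/B2 rules of section 2.2."""

    def __init__(self):
        # All LRU bits are cleared on RESET or FLUSH#.
        self.b0 = self.b1 = self.b2 = 0

    def touch(self, way):
        """Record an access to the given Way (0-3)."""
        if way in (0, 1):
            self.b0 = 1                      # most recent access was to Way 0/1
            self.b1 = 1 if way == 0 else 0   # which of Way 0/1 was most recent
        else:
            self.b0 = 0                      # most recent access was to Way 2/3
            self.b2 = 1 if way == 2 else 0   # which of Way 2/3 was most recent

    def victim(self, valid):
        """Pick the Way to replace; `valid` holds the four TAG Valid bits."""
        # A non-valid line is used first, checked in order Way 0..Way 3.
        for way in range(4):
            if not valid[way]:
                return way
        if self.b0 == 1:
            # Way 0/1 was used most recently, so replace within Way 2/3.
            return 3 if self.b2 == 1 else 2
        # Way 2/3 was used most recently, so replace within Way 0/1.
        return 1 if self.b1 == 1 else 0
```

With only three bits per SET the result approximates true LRU: after accesses to Way 0, 1, 2, 3 and then Way 0 again, this model selects Way 2 rather than Way 1.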


2.3 Four Way Set Associative Cache 
Organization 


The 82395DX is a four Way SET Associative cache.
Figure 2.5 shows the 82395DX's cache organization.
For each of the 256 SET's there are four
TAG's, one for each Way. The address currently
being accessed is decoded into the SET and TAG
addresses. If the access was to address 00555014h
(SET=001, TAG=00555h), the four TAG's in the
Cache Directory associated with SET 001 are
simultaneously compared with the TAG of the address
being accessed. The TAG Valid bits are also
checked. If the TAG's match and the TAG Valid bit is
set, the access is a hit to the Way where the hit was
detected; in this example the hit occurred in Way 1.
The data would be retrieved from Way 1 of the
cache memory. If the next access was to address
0AAA4017h (SET=001, TAG=0AAA4h), the
comparison would be done and a TAG match would be
found in Way 2. However, in this case the TAG Valid
bit is cleared so the access is a miss and the data
will be retrieved from main memory. The cache
memory will also be updated. It is helpful to notice
that the main memory is broken into pages by the
TAG size. In this case with a 20-bit TAG address
there are 2^20 pages. The smaller the TAG size the
fewer pages main memory is broken into. The SET
breaks down these memory pages. The larger the
SET size the more lines per page.
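

The address decode can be illustrated with a few shifts and masks, using the field boundaries given in section 2.1.1 (TAG = A31-A12, SET = A11-A4, DW Select = A3-A2). The function name and the example addresses are illustrative; the addresses are chosen so that bits A4-A11 equal 01h.

```python
def decode_address(address):
    """Split a 32 bit physical address into (tag, set_index, dw_select)."""
    tag = (address >> 12) & 0xFFFFF    # A31-A12, 20 bit TAG address
    set_index = (address >> 4) & 0xFF  # A11-A4, one of 256 SETs
    dw_select = (address >> 2) & 0x3   # A3-A2, DW within the 16 byte line
    return tag, set_index, dw_select


# A 20 bit TAG address divides main memory into 2**20 pages of 4KB each.
PAGES = 2 ** 20

if __name__ == "__main__":
    for addr in (0x00555014, 0x0AAA4017):
        tag, set_index, dw = decode_address(addr)
        print(f"{addr:08X}h -> TAG={tag:05X}h SET={set_index:02X}h DW={dw}")
```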


The following is a description of the interaction
between the 386 DX Microprocessor, the 82395DX's
cache and Cache Directory.


2.3.1 CACHE READ HITS 


When the 386 DX Microprocessor initiates a memory
read cycle, the 82395DX uses the 8 bit SET
address to select 1 of the 256 SET's in the Cache
Directory. The four TAG's of this SET are
simultaneously compared with address bits A12-A31. The
four TAG Valid bits are checked. If any comparison
produces a hit the corresponding bank of internal
SRAM supplies the 32 bits of data to the 386 DX
Microprocessor data bus based on the DW Select
bits A2 and A3. The LRU bits are then updated
according to the Pseudo LRU algorithm.


2.3.2 CACHE READ MISSES 


Like the cache read hit, the 82395DX uses the 8 bit
SET address to select the 4 TAG's for comparison.
If none of these match or if the TAG Valid bit
associated with a matching TAG address is cleared the
cycle is a miss and the 82395DX retrieves the
requested data from main memory. A Line Fill is
simultaneously started to read the line of data from
system memory and write the line of data into the cache
in the Way designated by the LRU bits.


2.3.3 OTHER OPERATIONS THAT AFFECT THE 
CACHE AND CACHE DIRECTORY 


Other operations that affect the cache and Cache
Directory include write hits, snoop hits, cache
FLUSH#es and 82395DX RESETs. In write hits, the
cache is updated along with main memory. The bank
that detected the hit is the one that data is written to.
The LRU bits are then adjusted according to the
Pseudo LRU algorithm. When a cache invalidation
cycle occurs (Snoop hit) the tag valid bit is cleared.
RESETs and cache FLUSH#es clear all the TAG
Valid bits.
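

The hit/miss decision, snoop invalidation and flush described in sections 2.3.1 to 2.3.3 can be modelled with a small directory structure. The sketch below is a simplified software model with hypothetical names; it leaves out the LRU bits and all timing, and the caller chooses the Way to fill.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    tag: int = 0
    valid: bool = False
    write_protect: bool = False


class Directory:
    """Toy model of the Cache Directory: 256 SETs x 4 Ways."""

    def __init__(self):
        self.sets = [[Entry() for _ in range(4)] for _ in range(256)]

    @staticmethod
    def split(address):
        """Return (tag, set_index) for a 32 bit address."""
        return (address >> 12) & 0xFFFFF, (address >> 4) & 0xFF

    def lookup(self, address):
        """Return the Way that hits, or None on a miss."""
        tag, set_index = self.split(address)
        for way, entry in enumerate(self.sets[set_index]):
            if entry.valid and entry.tag == tag:
                return way
        return None

    def fill(self, address, way, write_protect=False):
        """Record a completed Line Fill in the given Way."""
        tag, set_index = self.split(address)
        self.sets[set_index][way] = Entry(tag, True, write_protect)

    def snoop(self, address):
        """Invalidate the line on a snoop hit; return True if it hit."""
        way = self.lookup(address)
        if way is None:
            return False
        _, set_index = self.split(address)
        self.sets[set_index][way].valid = False
        return True

    def flush(self):
        """Clear every TAG Valid bit, as RESET or FLUSH# does."""
        for ways in self.sets:
            for entry in ways:
                entry.valid = False
```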


2.4 Concurrent Line Buffer Cacheing 


This feature of the 82395DX can be broken into two 
components, Concurrent Line Buffer and Line Buffer 
Cacheing. 


A Concurrent Line Buffer indicates that the DW
requested is returned to the 386 DX Microprocessor in
the first cycle of a Line Fill. The Local Bus is then 
free to execute other cycles while the Line Fill is 
being completed on the System Bus. 


Line Buffer Cacheing indicates that the 82395DX
serves 386 DX Microprocessor cycles before it
updates its Cache Directory. If the 386 DX
Microprocessor cycle is to a line which resides in the cache
memory, the 82395DX will serve that cycle as a
regular cache hit cycle. The cache memory and cache
directory are not updated until after the Line Fill is
complete (see sections 2.8 and 2.9). The 82395DX
keeps the address and data of the retrieved line in
an internal buffer, the System Bus line buffer. Any
386 DX Microprocessor read cycle to the same line
will be serviced from the line buffer. Until the cache
memory and cache directory are updated, any
386 DX Microprocessor read cycle to a Doubleword
which has already been retrieved will be
serviced from the System Bus line buffer. On the other
hand, any 386 DX Microprocessor write cycle to the
same line will be done to the cache memory after
updating the line in the cache. In this case, the write
cycle is buffered and the READYO# is activated
after updating the line in the cache. However, if the
line is Write Protected, the write cycle will be
handled as if it is a miss cycle.


A snooping cycle to a line which has not been
updated in the cache will invalidate the SB Line Buffer and
will prevent the cache update. Also, cache FLUSH
will invalidate the buffer. More details about
invalidation cycles can be found in chapter 6.
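

The behaviour of the System Bus line buffer for reads and snoops can be sketched as below. This is an illustrative model with hypothetical names: it tracks which DWs of the line have arrived and serves reads only from those, and it ignores write handling and timing.

```python
class LineBuffer:
    """Toy model of the System Bus line buffer during a Line Fill."""

    def __init__(self):
        self.line_address = None   # 16 byte aligned address, None if invalid
        self.dwords = {}           # DW index (0-3) -> data already retrieved

    def start_fill(self, address):
        """Begin a Line Fill for the line containing `address`."""
        self.line_address = address & ~0xF
        self.dwords = {}

    def receive(self, dw_index, data):
        """Store one DW as it arrives from the System Bus."""
        self.dwords[dw_index] = data

    def read(self, address):
        """Return the DW if this line is buffered and the DW has arrived."""
        if self.line_address is None or (address & ~0xF) != self.line_address:
            return None
        return self.dwords.get((address >> 2) & 0x3)

    def invalidate(self):
        """Drop the buffered line (FLUSH#, or a snoop hit)."""
        self.line_address = None
        self.dwords = {}

    def snoop(self, address):
        """Invalidate the buffer if the snooped address is in this line."""
        if self.line_address is not None and (address & ~0xF) == self.line_address:
            self.invalidate()
            return True
        return False
```

A read that returns `None` here would fall back to the normal Cache Directory lookup in a fuller model.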


2.5 Cache Control


The cache can be controlled via the SWP# pin. By
asserting this pin during the first DW in a Line Fill the
82395DX sets the write protect bit in the Cache
Directory, making the entry protected from writes.


2.6 Cache Invalidation 


Cache invalidation cycles are activated using the
SEADS# pin. SAHOLD or SHLDA asserted
conditions the 82395DX's system address bus
(SA4-SA31) to accept an input. The 82395DX floats its
system address bus in the clock immediately after
SAHOLD was asserted, or in the clock SHLDA is
activated. No address hold acknowledge is required
for SAHOLD. SEADS# asserted and the rising edge
of CLK2 indicate that the address on the System
Bus is valid. SEADS# is not conditioned by
SAHOLD or SHLDA being asserted. The 82395DX
will read the address and perform an internal cache
invalidation cycle to the address indicated. The
internal cache invalidation cycle is serviced 1 cycle after
SEADS# was sampled active (or 2 cycles after
SEADS# was sampled active if there is contention
between the Cache Directory Snoop (CDS) cycle
and a Cache Directory Lookup (CDL) cycle, see 2.8
and Figure 2.6). To actually invalidate the address
the 82395DX clears the tag valid bit.


2.7 Cache Flushing 


The user has an option of clearing the cache by
activating the FLUSH# input. When sampling the
FLUSH# input low for four clocks, the 82395DX
resets all the tag valid bits and the LRU bits of the
Cache Directory. Thus, all the banks of the cache
are invalidated. Also, the SB Line Buffer is
invalidated. The FLUSH# input must have at least eight CLK
periods in order to be recognized. If FLUSH# is
activated for longer than four CLKs, the 82395DX will
handle all accesses as misses and it will not update
the Cache Directory (the Cache Directory will be
FLUSH#ed as long as the FLUSH# input is low).
The cache is also FLUSH#ed during RESET.


2.8 Cache Directory Accesses and 
Arbitration 


There are five types of accesses to the cache direc- 
tory. Each access is a one clock cycle: 


1) Cache Directory Look-Up
2) Cache Directory Update
3) Cache Directory Snoop
4) Testability Accesses
5) Cache Directory FLUSH


A description of each of these accesses follows: 


1) Cache Directory Look-up cycle (CDL): A 386 
DX Microprocessor access in which the hit/miss 
decision is made. The Cache Directory is ac- 
cessed by the 386 DX Microprocessor address 
bus directly from the pins. CDL is executed when- 
ever ADS# is activated, in both read and write 
cycles. The LRU bits are updated in every CDL hit 
cycle so the accessed “Way” becomes the most
recently used. The LRU bits are read in every 
CDL miss cycle to indicate the “Way” to be up- 
dated in the Cache Directory Update cycle. Also, 
the WP bit is read. 


2) Cache Directory Update cycle (CDU): A write 
cycle to the cache directory due to a previous 
miss. The CDU cycle can be caused by a TAG 
mismatch (either a Tag Address mismatch or a 
cleared TAG Valid bit). In both cases, the new 
TAG is written to the “Way” indicated by the LRU 
bits read by the previous CDL miss cycle. Also, 
the TAG Valid bit is turned on and the LRU algo- 
rithm is updated so the accessed “Way” be- 
comes the most recently used. The WP bit is writ- 
ten according to the sampled SWP# input. The 
Cache Directory is accessed by the internally 
latched 386 DX Microprocessor address bus. 
Simultaneously with the CDU cycle, the cache 
memory is updated. 


3) Cache Directory Snooping cycle (CDS): A 
Cache Directory look-up cycle initiated by the 
System Bus, in response to an access to a mem- 
ory location that is shared with another system 
master, followed by a conditional invalidation of 
the TAG Valid bit. If the look-up cycle results in a 
hit, the corresponding TAG Valid bit in the Way 
which detected the HIT will be cleared. CDS cy- 
cles do not affect the LRU bits. The Cache Direc- 
tory is accessed by the internally latched System 
Bus address. 


4) Testability accesses (CDT): Cache Directory
read and write cycles performed in SRAM test
mode. During the TEST accesses, 25 bits of each
entry (20 for the TAG, one for the TAG Valid bit,
one for the WP bit and 3 for the LRU bits) are
read or written. No comparison is done. CDT
cycles are used for debugging purposes so CDT
cycles do not contend with other cycles.


5) Cache Directory FLUSH cycle (CDF): During
RESET or as a result of a FLUSH# request
generated by activating the FLUSH# input, all the
TAG Valid bits and the LRU bits are cleared as
well as the Line Buffer. CDF is a one clock cycle if
FLUSH# is active for four clocks. If FLUSH# is
activated longer, the CDF cycle is N-3 clocks,
where N is the number of clocks FLUSH# is
activated for. The actual clearing of the valid bits
occurs seven clocks after the activation of
FLUSH#. Two clocks are for internal
synchronization and four for recognizing FLUSH#
asserted. It has higher priority than all other cycles. CDF
cycle may occur simultaneously with any other
cycle but the result is always a FLUSH#ed Cache
Directory.
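

The CDF duration rule in item 5 reduces to one line of arithmetic. In the sketch below the function name is hypothetical, and returning 0 for a pulse shorter than four clocks is an assumption based on the stated four clock minimum.

```python
def cdf_clocks(flush_active_clocks):
    """Length of the CDF cycle in clocks for a FLUSH# pulse of N clocks.

    Four clocks of FLUSH# give a one clock CDF cycle; longer pulses
    give N - 3 clocks. A pulse shorter than four clocks is treated as
    not recognized (assumption).
    """
    if flush_active_clocks < 4:
        return 0
    return flush_active_clocks - 3
```

For example, a FLUSH# pulse of ten clocks keeps the Cache Directory in the CDF cycle for seven clocks.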


The 82395DX performs the CDL cycle in T1 state. 
The CDU cycle, in general, is performed in the clock 
after the last SRDY# or SBRDY# of the Line Fill 
cycle and the CDS cycle one clock after sampling 
the SEADS# active (see more details on snooping
cycles in chapter 6). Supporting concurrent activities 
on local and system busses causes CDL cycles to 
be requested in any clock during the execution with 
a maximum rate of a CDL cycle every other clock. 


The following arbitration mechanism guarantees res- 
olution of any possible contention between CDL, 
CDU and CDS cycles:


1. The priority order is CDL, CDS and CDU. CDL has
the highest priority, CDU has the lowest.


2. In case of simultaneous CDL and CDS cycles, the
CDS will be delayed by one clock. So, the
maximum latency in executing the invalidation cycle is
two clocks after sampling the SEADS# active.
Since the maximum rate of each of the CDL and
the CDS cycles is one every other clock, the
82395DX is able to interpose the CDL and CDS
cycles such that both are serviced. Figure 2.6
clarifies the interposing in the Cache Directory
between the 386 DX Microprocessor and the
System Bus.


3. CDU cycle is executed in any clock after the last 
SRDY# or SBRDY# in which neither CDL nor 
CDS cycles are requested. The worst case is the 
case where immediately after the read miss, the 
386 DX Microprocessor runs consecutive read 
hits while the System Bus is running invalidation 
cycles every other clock. In this case, the CDU 
cycle is postponed until a free clock is inserted, 
which may occur due to slower look-up rate (in 
case of read miss, non-cacheable read, etc...), or 
due to slower SEADS # rate. 


Since every CDU cycle is synchronized with the 
cache update (CU - writing the retrieved line into 
the cache), a possible contention on the cache 
can occur between a cache update cycle and a 
cache write cycle (CW - cache is written due to a 
write hit cycle). In this case, the CW cycle is exe- 
cuted; and the CDU and CU cycles are delayed. 
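

The three arbitration rules can be sketched as a clock-by-clock priority arbiter. The model below is a simplification with hypothetical names: it tracks at most one pending request of each type (which matches the stated maximum rate of one CDL or CDS every other clock) and ignores the CW/CU contention described in the last paragraph.

```python
PRIORITY = ("CDL", "CDS", "CDU")  # highest to lowest priority


def arbitrate(requests_per_clock):
    """Return the cycle granted in each clock, or None for an idle clock.

    `requests_per_clock` is a list with one collection of newly
    requested cycle names per clock. A request that loses arbitration
    stays pending until it is granted.
    """
    pending = set()
    grants = []
    for requests in requests_per_clock:
        pending |= set(requests)
        granted = next((c for c in PRIORITY if c in pending), None)
        if granted is not None:
            pending.discard(granted)
        grants.append(granted)
    return grants


if __name__ == "__main__":
    # CDL and CDS requested together: CDS is delayed by one clock and
    # CDU waits for the first clock with neither a CDL nor a CDS.
    print(arbitrate([["CDL", "CDS", "CDU"], [], ["CDL"], []]))
```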


Figure 2.6 - Interposing in the Cache Directory


2.9 Cache Memory Description


The 82395DX cache memory is constructed of four
banks. Each bank is 1K double words (4KB) and
represents a “Way”. For example, if the read cycle is to
Way 0, bank 0 will be read. The basic cache element
is a Line. The cache is able to write a full line or any
byte within the line. Reads are done by DW only.


There are four types of accesses to the cache data 
memory. Each access is a one clock cycle: 


1) Cache Read cycle 
2) Cache Write cycle 
3) Cache Update cycle
4) Testability Access 


A description of each type of access follows: 


1) Cache Read cycle (CR): CR cycle occurs simul- 
taneously with Cache Directory look-up (CDL) cy- 
cle if the cycle is a read. In case of a hit, the 
cache bank in which the hit was detected is read. 
In CR cycle, the A2-3 address lines select the 
requested DW within the line. 


2) Cache Write cycle (CW): CW cycle occurs one
clock after the Cache Directory look-up cycle
(CDL) if the cycle is a write hit and the WP bit is
not set. The cache bank in which the hit was
detected is updated. In CW cycle, the A2-3 address
lines and the four BE# lines select the required
bytes within the line to be written. For all write hit
cycles, READYO# is returned simultaneously
with the CW cycle unless the write buffer is full.
When the write buffer is full the first cycle
buffered must be completed on the system bus
before READYO# can be asserted.


3) Cache Update cycle (CU): CU cycle occurs 
simultaneously with every Cache Directory update 
cycle (CDU). The full line is written. 


4) Testability accesses (CT): cache read and write 
cycles performed by the 82395DX TEST ma- 
chine. During the TEST accesses, the cache 
memory acts as a standard RAM. CT cycles are 
used for debugging purposes so CT cycles do not 
contend with other cycles. 


The Cache Directory arbitration rules guarantee that
contention will not occur in the cache accesses.
This is because CR is synchronized with the CDL cycle,
CU is synchronized with the CDU cycle, CW cannot
occur simultaneously with CR cycles (ADS# is not
activated while READYO# is returned, since the 386 DX
Microprocessor is not pipelined) and finally the
possible contention of CW and CU is resolved. See Figure
2.7 for an example of Cache Directory and cache
memory accesses during a typical cycle execution.


Figure 2.7 - Cache Directory and Cache Accesses


3.0 PIN DESCRIPTION


The 82395DX pins may be divided into 4 groups:
1. Local Bus interface pins
2. System Bus interface pins
3. Local Bus decode pins
4. System Bus decode pins


Some notes regarding these groups of pins follow: 


1. All Pins - All input and I/O pins (when used as
inputs) must be synchronous to CLK2, to
guarantee proper operation. Exceptions are the RESET
pin, where only the falling edge needs to be
synchronous to CLK2, and the A20M# and FLUSH# pins,
which are asynchronous.


2. Local Bus Interface Pins - All Local Bus interface
pins that have a corresponding 386 DX
Microprocessor signal (A2-31, W/R#, D/C#, M/IO#,
LOCK#, and D0-31) must be connected directly
to the corresponding 386 DX Microprocessor
pins.

3. System Bus Interface Pins - In multi-82395DX
mode, all System Bus output and I/O pins are
driven by the primary 82395DX, with the
exception of SADS#. See chapter 4 for more details.


4. Local / System Bus Decode Pins - These signals
are generated by proper decoding of the Local
and System Bus addresses. The decoding for the
Local Bus decode pins, LBA# and NPI#, must be
static. The decoding for the System Bus decode
pins, SKEN# and SWP#, must be static over the
line boundary. They must not change during a
Line Fill. If a change in the decoding of these
signals is made, the 82395DX must be FLUSH#ed
or RESET.


3.1 Local Bus Interface Pins 


3.1.1 386 DX MICROPROCESSOR/82395DX
CLOCK (CLK2 I)


This signal provides the fundamental timing for the 
82395DX. The 82395DX, like the 386 DX Microprocessor,
divides CLK2 by two to generate the internal
clock. The phase of the internal 82395DX clock is 
synchronized to the internal CPU clock phase by the 
RESET signal. All external timing parameters are 
specified with respect to CLK2. 


3.1.2 LOCAL ADDRESS BUS 


3.1.2.1 Local Bus Address Lines (A2-A31 I)


These signals, along with the byte enable signals,
define the physical area of memory or I/O accessed.


3.1.2.2 Local Bus Byte Enables (BE3#-BE0# I)


These pins are used to determine which bytes are
accessed in partial write cycles. On read hit cycles
these lines are ignored by the 82395DX. On write hit
cycles they determine which bytes in the internal
Cache SRAM must be updated, and they are passed to the
System Bus along with the System Bus write cycle.
In all System Bus cycles (non-cacheable reads, read
misses and all writes) these signals are mirrored by
the SBE0-3# signals. These signals are active
LOW.
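

The byte-enable behavior on a write hit can be illustrated with a small Python sketch that merges only the enabled bytes of a doubleword into a cached copy. The function name `merge_write` and the packing of BE3#-BE0# into a 4-bit integer are assumptions made for this example; they are not part of the 82395DX interface.

```python
def merge_write(old_dword: int, new_dword: int, be_n: int) -> int:
    """Return the cached doubleword after a write hit.

    be_n holds BE3#..BE0# in bits 3..0 (hypothetical packing).
    The signals are active LOW, so a 0 bit selects that byte.
    """
    result = old_dword
    for byte in range(4):
        if not (be_n >> byte) & 1:          # byte enable asserted (low)
            mask = 0xFF << (8 * byte)
            result = (result & ~mask) | (new_dword & mask)
    return result & 0xFFFFFFFF


# BE1# and BE0# asserted: only the two low-order bytes are replaced.
updated = merge_write(0x11223344, 0xAABBCCDD, 0b1100)
```

With all four byte enables inactive the cached doubleword is returned unchanged, which matches the idea that only selected bytes are updated.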


3.1.3 LOCAL BUS CYCLE DEFINITION 


3.1.3.1 Local Bus Cycle Definition Signals
(W/R#, D/C#, M/IO# I)


The memory/input-output, data/code, write/read
lines are the primary bus definition signals directly
connected to the 386 DX Microprocessor. These
signals become valid as the ADS# signal is sampled
asserted. The bus cycle type encoding is identical to
that of the 386 DX Microprocessor. The 386 DX
Microprocessor encoding is shown in Table 5.1. The
bus definition signals are not driven by the 386 DX
Microprocessor during bus hold and follow the timing
of the address bus.


3.1.3.2 Local Bus Lock (LOCK# I)


This signal indicates a LOCK#ed cycle. LOCK#ed
cycles are treated as non-cacheable cycles, except
that LOCK#ed write hit cycles update the cache as
well. LOCK#ed write cycles are not buffered.


The 82395DX asserts SLOCK# when the first
LOCK#ed cycle is initiated on the System Bus.
SLOCK# is deactivated only after all LOCK#ed
System Bus cycles have been executed and LOCK# has
been deactivated.


3.1.4 LOCAL BUS CONTROL 


3.1.4.1 Address Status (ADS# I)


The address status pin, an output of the 386 DX
Microprocessor, indicates that new, valid address
and cycle definition information is currently available
on the Local Bus. The signals that are valid when
ADS# is activated are:


A(2-31), BE(0-3)#, W/R#, D/C#, M/IO#, LOCK#,
NPI# and LBA#


3.1.4.2 Local Bus Ready (READYI# I)


This is the ready input signal seen by the local bus
master. Typically it is a logical OR between the
82395DX generated READYO# signal and other
(optional) READY# signals generated by other
Local Bus masters. It is used by the 82395DX, along
with the ADS# signal, to keep track of the 386 DX
Microprocessor bus state.


3.1.4.3 Local Bus Ready Output (READYO# I/O)


This output is returned to the 386 DX Microprocessor
to terminate all types of 386 DX Microprocessor
bus cycles, except for Local Bus cycles. This signal
is wire-ORed with parallel 82395DX READYO# signals
(if more than one 82395DX is used on a 386 DX
Microprocessor bus). For more details on READYO#
functionality in a multi-82395DX system, refer
to Chapter 4.


READYO# may serve as the READY# signal for
the 387 DX Math Coprocessor. For details, refer to
Chapter 5.


This pin is used during the self-configuration
sequence, after RESET. For details, refer to Chapter 4.


3.1.5 RESET (RESET I)


This signal forces the 82395DX to begin execution
at a known state. The RESET falling edge is used by the
82395DX to set the phase of its internal clock identical
to the 386 DX Microprocessor internal clock. The
RESET falling edge must satisfy the appropriate setup
and hold times for proper chip operation. RESET
must remain active for at least 1 ms after the power
supply and CLK2 input have reached their proper DC
and AC specifications.


The RESET input is used for three purposes. First, it
RESETs the 82395DX and brings it to a known
state. Second, it is used to synchronize the internal
82395DX clock phase to that of the 386 DX
Microprocessor. Third, it initiates a self-configuration
sequence in which the 82395DX determines the number
of parallel 82395DX devices in the system and
its own configuration (Primary / Secondary and
address space).


On power up, RESET must be active for at least 1
millisecond after power has stabilized to a voltage
within spec, and after the CLK2 input has stabilized to
a voltage and frequency within spec. This allows
the internal circuitry to stabilize. Otherwise, RESET
must be active for at least 10 clock cycles.


No access to the 82395DX is allowed for 128 clock
cycles after the RESET falling edge. During RESET,
all other input pins are ignored, except SHOLD,
SAHOLD and SFHOLD#. Unlike the 386 DX Microprocessor,
the 82395DX can respond to a System
Bus HOLD request by floating its bus and asserting
SHLDA even while RESET is asserted. Also, the
82395DX can respond to a System Bus address
HOLD request by floating its address bus. The
status of the 82395DX outputs during RESET is
shown in Table 3.2.


The 82395DX samples the LBA# pin during RESET
and enables the decoding of Weitek 3167 Floating-
Point Coprocessor address space if it is sampled
low (active).


The user must make sure SAHOLD and FLUSH#
are not asserted at the falling edge of RESET. If
they are, the Tristate Test Mode will be entered.
The user must also ensure that FLUSH# does not
get asserted for one clock cycle while SAHOLD
is negated for the same CLK cycle prior to
RESET falling. If this condition exists, a reserved
mode will be entered.


3.1.6 CONFIGURATION (CONF# I)

The activity on this input during and after RESET
allows the 82395DX to configure itself to operate in
the specified address range.

Refer to Chapter 4 for more details. This pin is active
LOW.


3.1.7 LOCAL DATA BUS 


3.1.7.1 Local Bus Data Lines (D0-D31 I/O)


These are the Local Bus data lines of the 82395DX
and must be connected to the D0-D31 signals of
the Local Bus.


3.1.8 LOCAL BUS DECODE PINS 


These signals are generated by proper decoding of
the local bus address. The decoding of these signals
must be static; it must not change during
normal operation of the 82395DX. If the decoding
of these signals is changed, the 82395DX
must be FLUSH#ed or RESET. These signals must
be stable throughout the local bus cycle (refer to
Figure 5.1).


3.1.8.1 Local Bus Access Indication (LBA# I)


This signal instructs the 82395DX that the cycle
currently in progress is targeted to a Local Bus device,
and must therefore be ignored by the 82395DX. The
387 DX Math Coprocessor is considered a Local
Bus Device, but LBA# need not be generated for
387 DX Math Coprocessor accesses. Weitek 3167
Floating-Point Coprocessor address space may also
be decoded internally as Local Bus cycles. Note that
LBA# has priority over all other types of cycles. This
signal is active LOW.


3.1.8.2 No Post Input (NPI# I)


This signal instructs the 82395DX that the write
cycle currently in progress must not be posted
(buffered) in the write buffer. NPI# is sampled on the
falling edge of CLK following the address change
(see Figure 5.1). NPI# is ignored during read cycles.
This signal is active LOW.


Figure 3.1 - CLK2 and Internal Clock


Figure 3.3 - Sampling LBA# During RESET


3.1.9 ADDRESS MASK


3.1.9.1 Address Bit 20 Mask (A20M# I)


This pin, when active (low), forces the A20 input as
seen by the 82395DX to logic '0', regardless of the
actual value on the A20 input pin. It must be asserted
two clock cycles before ADS# for proper operation.
A20M# emulates the address wraparound at 1 MByte
which occurs on the 8086. This pin is asynchronous
but must meet setup and hold times to guarantee
recognition in a specific clock. It must be stable
throughout Local Bus memory cycles.
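

To make the 1 MByte wraparound concrete, here is a minimal Python sketch of the address as seen by the cache when A20M# is asserted. The function name `effective_address` and the use of an integer `a20m_n` for the pin level are assumptions made for illustration only.

```python
A20_BIT = 1 << 20


def effective_address(addr: int, a20m_n: int) -> int:
    """Return the 32-bit address seen by the cache.

    a20m_n is the level on A20M# (hypothetical parameter):
    0 means asserted (active LOW), 1 means not asserted.
    """
    if a20m_n == 0:
        addr &= ~A20_BIT        # force A20 to logic 0
    return addr & 0xFFFFFFFF


# An access just above 1 MByte wraps to the bottom of memory.
wrapped = effective_address(0x0010FFEF, 0)
```

Only bit 20 is affected; all other address bits pass through unchanged.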


3.2 System Bus Interface Pins


3.2.1 SYSTEM ADDRESS BUS


3.2.1.1 System Bus Address Lines
(SA2-SA31 I/O *)


* SA2-3 are outputs only.


These are the System Bus address lines of the
82395DX. When driven by the 82395DX, these signals,
along with the System Bus byte enables, define
the physical area of memory or I/O accessed.


Activation of SEADS# conditions these signals to
serve as inputs for the snooping cycle.


3.2.1.2 System Bus Byte Enables
(SBE0#-SBE3# O)


These are the byte enable signals for the System
Bus. The 82395DX drives these pins identically to
BE0#-BE3# in all System Bus cycles except Line
Fills. In Line Fills these signals are driven identically
to BE0#-BE3# in the first read cycle of the Line
Fill. They are all driven active in the remaining cycles
of the Line Fill.


The system memory must ignore these pins during
Line Fill, and return all four bytes. These signals are
active low.


3.2.2 SYSTEM BUS CYCLE DEFINITION 


3.2.2.1 System Bus Cycle Definition 
(SW/R#,SD/C#,SM/IO# O) 


These are the System Bus cycle definition pins.
When the 82395DX is the System Bus master, it
drives these signals identically to the 386 DX
Microprocessor encoding.


3.2.2.2 System Bus Lock (SLOCK# O)


The System Bus LOCK pin is one of the bus cycle
definition pins. It indicates that the current bus cycle
is LOCK#ed: that the 82395DX (on behalf of the
CPU) must be allowed exclusive access to the System
Bus across bus cycle boundaries until this signal
is de-asserted. The 82395DX does not acknowledge
a bus hold request when this signal is asserted. The
82395DX asserts SLOCK# when the first LOCK#ed
cycle is initiated on the System Bus; SLOCK# is
deactivated only after all LOCK#ed System Bus cycles
have been executed and LOCK# has been deactivated.
SLOCK# is active LOW.


3.2.3 SYSTEM BUS CONTROL 


3.2.3.1 System Bus Address Status (SADS# O) 


The address status pin is used to indicate that new,
valid address and cycle definition information is
currently being driven onto the address, byte enables
and cycle definition lines of the System Bus. SADS#
can be used as an indication of a new cycle start.
SADS# is driven active in the same clock as the
addresses are driven. SADS# is not valid until a
specified setup time before the CLK falling edge,
and must be sampled by CLK falling edge before it is
used by the system. This signal is active LOW.


3.2.3.2 System Bus Ready (SRDY# I)


The SRDY# signal indicates that the current bus
cycle is complete. When SRDY# is sampled asserted
it indicates that the external system has presented
valid data on the data pins in response to a read
cycle or that the external system has accepted the
82395DX data in response to a write request. This
signal is ignored when the System Bus is at STi,
STH, ST1 or ST1P states.


At the first read cycle of a Line Fill, if SBRDY# is
returned active and both SRDY# and SNA# are
returned inactive, a burst Line Fill will be executed. If
SRDY# is returned active and SNA# is returned
inactive, a non-burst non-pipelined Line Fill will be
executed. If SNA# is returned active and SRDY# is
inactive, a non-burst pipelined Line Fill will be
executed.
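

The three cases above can be summarized as a small decision function. This is a sketch only: the function name `line_fill_type`, the boolean arguments (True meaning the active-LOW signal was sampled asserted) and the returned strings are assumptions for illustration, and combinations the text does not describe are reported as `None` rather than guessed.

```python
from typing import Optional


def line_fill_type(sbrdy: bool, srdy: bool, sna: bool) -> Optional[str]:
    """Classify a Line Fill from the signals sampled in its first read cycle.

    Each argument is True when the corresponding signal
    (SBRDY#, SRDY#, SNA#) is sampled active.
    """
    if sbrdy and not srdy and not sna:
        return "burst"
    if srdy and not sna:
        return "non-burst, non-pipelined"
    if sna and not srdy:
        return "non-burst, pipelined"
    return None  # not covered by the description above


kind = line_fill_type(sbrdy=True, srdy=False, sna=False)
```

Returning `None` for the remaining combinations keeps the sketch from asserting behavior that this section does not specify.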


Once a burst Line Fill has started, if SRDY# is
returned in the second or third DW of the transfer, the
burst Line Fill will be interrupted and the cache will
not be updated. The first DW will already have been
transferred to the CPU. Note that in the last (fourth)
bus cycle in a Line Fill, SBRDY# and SRDY# have
the same effect on the 82395DX. They indicate the
end of the Line Fill. This signal is active LOW.


3.2.3.3 System Bus Next Address (SNA# I)


This input, when active, indicates that a pipelined 
address cycle can be executed. It is sampled by the 
82395DX in the same timing as the 386 DX Micro- 
processor samples NA#. If this signal is sampled 
active, then SBRDY# is treated as SRDY#, i.e. 
burst Line Fill is disabled. This signal is ignored once 
a burst Line Fill has started, as well as during the 
fourth DW of a Line Fill. 


3.2.4 BUS ARBITRATION 


3.2.4.1 System Bus Request (SBREQ O) 


SBREQ is the internal cycle pending signal. It
indicates to the outside world that the 82395DX has
internally generated a bus request (due to a
CPU request that requires access to the System
Bus). It is generated whether the 82395DX owns the
bus or not and can be used to arbitrate among the
various masters on the System Bus. In read misses,
if the bus is available and the cycle starts immediately,
this signal will not be activated at all. This signal is
active HIGH.


3.2.4.2 System Bus Hold Request (SHOLD I)


This signal allows another bus master complete
control of the entire System Bus. In response to this pin,
the 82395DX floats all its System Bus interface output
and input/output pins (with the exception of
SHLDA and SBREQ) and asserts SHLDA after
completing its current bus cycle or sequence of
LOCK#ed cycles. The 82395DX maintains its bus in
this state until SHOLD is deasserted. SHOLD is
active HIGH. SHOLD is recognized during RESET.


3.2.4.3 System Bus Hold Acknowledge 
(SHLDA O) 


This signal goes active in response to a hold request
presented on the SHOLD pin and indicates that the
82395DX has given the bus to another System Bus
master. It is driven active in the same clock that the
82395DX floats its bus. When leaving a bus hold,
SHLDA is driven inactive in one clock and the
82395DX resumes driving the bus. Depending on
internal requests, the 82395DX may or may not begin
a System Bus cycle in the clock where SHLDA is
driven inactive. The 82395DX is able to support CPU
Local Bus activities during System Bus hold, since
the internal cache is able to satisfy the majority of
those requests. This signal is active HIGH.


3.2.4.4 System Bus Fast Hold Request
(SFHOLD# I)


This input allows another bus master immediate
access to the System Bus. In response to this signal,
the 82395DX stops driving the System Bus output
and input/output pins (with the exception of SHLDA
and SBREQ) in the next CLK cycle. Note that the
same signals are tristated in response to a SHOLD
request. Because the 82395DX always stops driving
the System Bus in response to SFHOLD# active, no
acknowledge is needed.


The bus remains in the high impedance state until
SFHOLD# is negated.


Note that SRDY# is internally inactivated during
SFHOLD# cycles. The only effect of SFHOLD# being
asserted is forcing the System Bus output and
I/O buffers into their high impedance state. It is the
responsibility of the system designer to guarantee
that bus cycles which are in progress when
SFHOLD# is asserted are terminated correctly.


This pin is recognized during RESET and is active
low.


3.2.5 BURST CONTROL 


3.2.5.1 System Bus Burst Ready (SBRDY# I)


This signal performs the same function during a
burst cycle that SRDY# does in a non-burst cycle.
SBRDY# asserted indicates that the external system
has presented valid data on the data pins in
response to a burst Line Fill cycle. This signal is
ignored when the System Bus is at STi, STH, ST1 or
ST1P states.


Note that in the last (fourth) bus cycle in a Line Fill,
SBRDY# and SRDY# have the same effect on the
82395DX. They indicate the end of the Line Fill. For
all cycles that cannot run in burst, e.g. non-cacheable
cycles, non Line Fill cycles (or pipelined Line Fill),
SBRDY# has the same effect on the 82395DX as
the normal SRDY# pin. This signal is active LOW.


3.2.5.2 System Bus Burst Last Cycle Indicator 
(SBLAST# O) 


The system burst last cycle signal indicates that the
next time SBRDY# is returned the burst transfer is
complete. In other words, it indicates to the external
system that the next SBRDY# returned is treated as
a normal SRDY# by the 82395DX, i.e., another set
of addresses will be driven with SADS# or the System
Bus will go idle. SBLAST# is normally active.
In a cache read miss cycle, which may proceed as a
Line Fill, SBLAST# starts active and later follows
SKEN# by one clock. SBLAST# is active during
non-burst Line Fill cycles. Refer to Chapter 6 for
more details. This signal is active LOW.


3.2.6 CACHE INVALIDATION 


3.2.6.1 System Bus Address Hold (SAHOLD I)


This is the Address Hold request. It allows another
bus master access to the address bus of the
82395DX in order to indicate the address of an
external cycle for performing an internal Cache
Directory lookup and invalidation cycle. In response to
this signal, the 82395DX immediately (in the next
cycle) stops driving the entire system address bus
(SA2-SA31). Because the 82395DX always stops
driving the address bus in response to a System Bus
address hold request, no hold acknowledge is
required. Only the address bus will be floated during
address hold; other signals can remain active. For
example, data can be returned for a previously
specified bus cycle during address hold. The 82395DX
does not initiate another bus cycle during address
hold.


This pin is recognized during RESET. However,
since the entire cache is invalidated by reset, any
invalidation cycles run will be spurious. This signal
is active high.


3.2.6.2 System Bus External Address Strobe
(SEADS# I)


This signal indicates that a valid external address
has been driven onto the 82395DX pins and that this
address must be used to perform an internal cache
invalidation cycle. Maximum allowed invalidation
cycle rate is one every two clock cycles. This signal is
active low.


3.2.7 CACHE CONTROL


3.2.7.1 Flush (FLUSH# I)


This pin, when sampled active for four clock cycles
or more, causes the 82395DX to invalidate its entire
Tag Array. In addition, it is used to configure the
82395DX to enter various test modes. For details
refer to Chapter 7. This pin is asynchronous but
must meet setup and hold times to guarantee
recognition in any specific clock. This signal is active
LOW.


3.2.8 SYSTEM DATA BUS


3.2.8.1 System Bus Data Lines (SD0-SD31 I/O)


These are the System Bus data lines of the
82395DX. The lines must be driven with appropriate
setup and hold times for proper operation. These
signals are driven by the 82395DX only during write
cycles.


3.2.9 SYSTEM BUS DECODE PINS 


3.2.9.1 System Cacheability Enable (SKEN# I)


This is the cache enable pin. It is used to determine
whether the current cycle running on the System
Bus is cacheable or not. When the 82395DX generates
a read cycle that may be cached, this pin is
sampled 1 CLK before the first SBRDY#, SRDY#
or SNA# is sampled active (for a detailed timing
description, refer to Chapter 6). If sampled active, the
cycle will be transformed into a Line Fill. Otherwise,
the Cache and Cache Directory will be unaffected.
Note that SKEN# is ignored after the first cycle in a
Line Fill. SKEN# is ignored during all System Bus
cycles except for cacheable read miss cycles. This
signal is active LOW.


3.2.9.2 System Write Protect Indication
(SWP# I)


This is the write protect indicator pin. It is used to
determine whether the address of the current System
Bus Line Fill cycle is write protected or not.


In non-pipelined cycles, SWP# is sampled with
the first SRDY# or SBRDY# of a system Line Fill
cycle. In pipelined cycles, SWP# is sampled at the
last ST2 stage, or at ST1P; in other words, one clock
phase after SNA# is sampled active.


The write protect indicator is sampled together with
the TAG address of each line in the 82395DX Cache
Directory. In every cacheable write cycle, the write
protect indicator is read. If active, the cycle will be a
Write Protected cycle which is treated like a
cacheable write miss cycle. It is buffered and it does not
update the cache even if the addressed location is
present in the cache. The signal is active LOW.


3.2.10 DESIGN AIDES 


3.2.10.1 System Next Near Indication
(SNENE# O)


This signal indicates that the current System Bus
memory cycle is to the same 2048 Byte area as the
previous memory cycle. Address lines A11-A31 of
the current System Bus memory cycle are identical
to the address lines A11-A31 of the previous
memory cycle.
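

As an illustration, the "same 2048 Byte area" test amounts to comparing address bits A11-A31 of consecutive memory cycles. The Python sketch below models this with a hypothetical `NextNearTracker` class; the class, its method names and the returned pin levels (0 for active, 1 for inactive) are assumptions for this example. It also reflects the rule stated in this section that the pin is inactive for the first memory cycle after HOLD is exited or SAHOLD is deactivated.

```python
class NextNearTracker:
    """Tracks the previous System Bus memory address (illustrative model)."""

    def __init__(self) -> None:
        self.prev_addr = None   # no previous memory cycle yet

    def memory_cycle(self, addr: int) -> int:
        """Return the SNENE# level for this memory cycle (0 = active)."""
        near = (self.prev_addr is not None
                and (self.prev_addr >> 11) == (addr >> 11))  # compare A11-A31
        self.prev_addr = addr
        return 0 if near else 1

    def hold_exited(self) -> None:
        """Call after HOLD is exited or SAHOLD is deactivated."""
        self.prev_addr = None


tracker = NextNearTracker()
levels = [tracker.memory_cycle(a) for a in (0x1000, 0x17FC, 0x1800)]
```

Idle or non-memory cycles simply do not call `memory_cycle`, so they leave the remembered address untouched.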


This signal can be used in an external DRAM system
to run CAS# only cycles, therefore increasing the
throughput of the memory system. SNENE# is valid
for all memory cycles, and indicates that the current
memory cycle is to the same 2048 Byte area, even if
there were idle or non-memory bus cycles since the
last System Bus memory cycle.


For the first memory cycle after the 82395DX has
exited the HOLD state, or after SAHOLD was
deactivated, this pin will be inactive. This signal is active
low.


3.3 Pinout Summary Tables


Table 3.1 - Input Pins


Symbol     Function                              Synchronous/Asynchronous

CLK2       Clock
RESET      Reset                                 Asynchronous*
BE0-3#     Local Bus Byte Enables                Synchronous
A2-31      Local Bus Address Lines               Synchronous
W/R#       Local Bus Write/Read                  Synchronous
D/C#       Local Bus Data/Control                Synchronous
M/IO#      Local Bus Memory/Input-Output         Synchronous
LOCK#      Local Bus LOCK                        Synchronous
ADS#       Local Bus Address Strobe              Synchronous
READYI#    Local Bus READY                       Synchronous
LBA#       Local Bus Access Indication           Synchronous
NPI#       No Post Input                         Synchronous
FLUSH#     FLUSH the 82395DX Cache               Asynchronous
A20M#      Address Bit 20 Mask                   Asynchronous
CONF#      Configuration                         Synchronous
SHOLD      System Bus Hold Request               Synchronous
SRDY#      System Bus READY                      Synchronous
SNA#       System Bus Next Address Indication    Synchronous
SBRDY#     System Bus Burst Ready                Synchronous
SKEN#      System Cacheability Indication        Synchronous
SWP#       System Write Protect Indication       Synchronous
SAHOLD     System Bus Address HOLD               Synchronous
SEADS#     System Bus External Address Strobe    Synchronous
SFHOLD#    System Bus Fast HOLD Request          Synchronous


* The falling edge of RESET needs to be synchronous to CLK2 but the rising edge is asynchronous.


Table 3.2 - Output Pins


Symbol     Function                                    When Floated

SBE0-3#    System Bus Byte Enables                     SHLDA/SFHOLD#
SADS#      System Bus Address Strobe (1)               SHLDA/SFHOLD#
SD/C#      System Bus Data/Control                     SHLDA/SFHOLD#
SM/IO#     System Bus Memory/Input-Output              SHLDA/SFHOLD#
SW/R#      System Bus Write/Read                       SHLDA/SFHOLD#
SHLDA      System Bus HOLD Acknowledge
SBREQ      System Bus Request
SLOCK#     System Bus LOCK                             SHLDA/SFHOLD#
SBLAST#    System Bus Burst Last Cycle Indication      SHLDA/SFHOLD#
SA2-3      System Bus Address (2 lowest order bits)    SHLDA/SAHOLD/SFHOLD#
SNENE#     System Bus Next Near Indication             SHLDA/SFHOLD#


NOTES:
1. SADS# is driven active in ST1/ST2P and inactive for one phase in the first ST2/ST1P following the activation. SADS# is
driven high before it is floated.
2. Unless SHOLD is asserted

Table 3.3 - Input-Output Pins


Symbol     Function                                               When Floated

D0-31      Local Data Bus (2)                                     Always Except READs
SD0-31     System Data Bus                                        Always Except WRITEs
SA4-31     System Bus Address (except the 2 lowest order bits)    SHLDA/SAHOLD/SFHOLD#
READYO#    Local Bus READY                                        See Sec 4.6


(1) Provided SHOLD, SAHOLD, and SFHOLD# are inactive
(2) Local Data is driven only in T2.


4.0 BASIC FUNCTIONAL DESCRIPTION


The 82395DX has an interface to the 386 DX
Microprocessor (Local Bus) and to the System Bus. The
System Bus interface emulates the 386 DX
Microprocessor bus such that the system will view the
82395DX as the front end of a 386 DX Microprocessor.
Some optional enhancements, like burst support,
are provided to maximize the performance.


When ADS# is sampled active, the 82395DX
decodes the 386 DX Microprocessor cycle definition
signals (M/IO#, D/C#, W/R# and LOCK#), as
well as two Local Bus decode signals (LBA# and
NPI#), to determine how to respond. LBA#
indicates that the current cycle is addressed to a Local
Bus device; NPI# indicates that the current memory
write cycle must not be buffered. In addition, the
82395DX internally decodes the 386 DX Microprocessor
accesses to the 387 DX Math Coprocessor /
Weitek 3167 Floating-Point Coprocessor as Local
Bus accesses. The result of the address, cycle
definition and cycle qualification decoding is two
categories of accesses: the Local Bus accesses (LBA#
active or 387 DX Math Coprocessor / Weitek 3167
Floating-Point Coprocessor accesses) and 82395DX
accesses. In 387 DX Math Coprocessor accesses,
the 82395DX drives the READYO# signal active
after one wait state, if READYI# was not sampled
active. Local Bus accesses are ignored by the
82395DX.


Any 82395DX access can be either to a cacheable
address or to a non-cacheable address. Non-cacheable
addresses are all I/O and system accesses
with SKEN# returned inactive. Non-cacheable
cycles are all cycles to non-cacheable addresses,
LOCK#ed read cycles and Halt/Shutdown cycles.
All other cycles are cacheable. For more details
about non-cacheable cycles, refer to section 4.2.
Non-cacheable cycles pass through the cache. They
are always forwarded to the System Bus.
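

The classification rules in the paragraphs above can be sketched as a single function. This is a simplified model, not the device's decoding logic: the function name `classify_cycle`, its boolean parameters and the returned strings are assumptions, and SKEN# is treated as a known input even though the device only samples it on cacheable read misses.

```python
def classify_cycle(local_bus: bool, is_io: bool, sken_active: bool,
                   locked: bool, is_read: bool,
                   halt_or_shutdown: bool) -> str:
    """Classify a 386 DX cycle as seen by the cache (illustrative only)."""
    if local_bus:                       # LBA# active or coprocessor access
        return "local bus"
    if is_io or not sken_active:        # non-cacheable address
        return "non-cacheable"
    if (locked and is_read) or halt_or_shutdown:
        return "non-cacheable"
    return "cacheable"


kind = classify_cycle(local_bus=False, is_io=False, sken_active=True,
                      locked=False, is_read=True, halt_or_shutdown=False)
```

A LOCK#ed write to a cacheable address is classified as cacheable here because only LOCK#ed reads appear in the list of non-cacheable cycles above.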


Cacheable read cycles can be either hit or miss.
Cacheable read hit cycles are serviced by the internal
cache and they don't require System Bus service.
A cacheable read miss cycle generates a series
of four System Bus read cycles, called a Line Fill. Of
the four cycles, the first cycle is for reading the
requested data while all four are for filling the cache
line. The System Bus has the ability to provide the
system cacheability attribute to the 82395DX Line
Fill request, via the SKEN# input, and the system
write protection indicator, via the SWP# input. Refer
to Chapter 6 for more information about Line Fill
cycles.


Cacheable write cycles, like any write cycles, are
forwarded to the System Bus. The write buffer algorithm
terminates the write cycle on the Local Bus, allowing
the 386 DX Microprocessor to continue processing
in 0 wait states, while the 82395DX executes the
write cycles on the System Bus. All cacheable write
hit cycles, except protected writes, update the cache
on a byte basis, i.e. only the selected bytes are
updated. Cacheable write misses do not update the cache
(the 82395DX does not allocate on writes). All
cacheable write cycles, except LOCK#ed writes,
are buffered (unless the NPI# pin is sampled active).
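

The write handling described above can be condensed into a small decision table. The sketch below is illustrative only; `handle_cacheable_write` and its dictionary keys are hypothetical names, and the function assumes the cycle has already been classified as a cacheable write.

```python
def handle_cacheable_write(hit: bool, write_protected: bool,
                           locked: bool, npi_active: bool) -> dict:
    """Summarize what happens to a cacheable write (illustrative model)."""
    return {
        # Every write is forwarded to the System Bus.
        "forward_to_system_bus": True,
        # Only unprotected hits update the cache; misses never allocate.
        "update_cache": hit and not write_protected,
        # LOCK#ed writes and writes with NPI# active are not buffered.
        "buffered": not locked and not npi_active,
    }


result = handle_cacheable_write(hit=True, write_protected=False,
                                locked=False, npi_active=False)
```

Keeping the three outcomes separate mirrors the text: forwarding, cache update and buffering are decided independently of one another.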


Cache consistency is provided by the SAHOLD,
SEADS# mechanism. If any bus master performs a
memory cycle which disturbs the data consistency,
the address of this cycle must be provided to the
82395DX using the SAHOLD, SEADS# mechanism.
Then, the 82395DX checks if that memory location
resides in the cache. If it does, the 82395DX
invalidates that line in the cache by marking it as invalid in
the Cache Directory. The 82395DX interposes the
Cache Directory between the 386 DX Microprocessor
and the System Bus such that the 386 DX
Microprocessor is not forced to wait due to snooping and
none of the snooping cycles are missed due to 386
DX Microprocessor accesses (see Figure 2.6).


Cacheability is resolved on the system side using
the SKEN# input. SKEN# is sampled one clock
before the first SRDY#/SBRDY# in non-pipelined Line
Fill cycles. In pipelined Line Fill cycles, SKEN# is
sampled one clock phase before sampling SNA#
active. SKEN# is always sampled at PHI1.


Note that the 82395DX does not support pipelining
of the 386 DX Microprocessor Local Bus. The NA#
input on the 386 DX Microprocessor must be tied to
Vcc.


4.1 Cacheable Accesses 


In a cacheable access, the 82395DX performs a
cache directory look-up cycle. This is to determine if
the requested data exists in the cache and to read
the write protection bit. In parallel, the 82395DX
performs a cache read cycle if the access is a read, or
prepares the cache for a write cycle if the access is
a write.


4.1.1 CACHEABLE READ HIT ACCESSES


If the Cache Directory look-up for a cacheable read
access results in a hit (the requested data exists in
the cache), the 82395DX drives the local data bus
with the data provided from the internal cache. It also
drives the 386 DX Microprocessor READY# (by
activating the 82395DX READYO#), so that the 386 DX
Microprocessor gets the required data directly from
the cache without any wait states.


The 82395DX is a four Way SET associative cache,
so only one of the four ways (four banks) is selected
to supply data to the 386 DX Microprocessor. The
Way in which the hit occurred will provide the data.
Also, the replacement algorithm (LRU) is updated
such that the Way in which the hit occurred is
marked as the most recently used.
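

A read hit in a four-way set associative directory can be modeled roughly as follows. This is a simplified sketch under stated assumptions: the number of sets (`NUM_SETS = 256`) is assumed rather than taken from this section, a plain recency list stands in for the device's pseudo-LRU bits, and the names `CacheSet`, `split_address` and `read_lookup` are hypothetical.

```python
LINE_BYTES = 16      # 16-byte cache line, per the text
NUM_WAYS = 4         # four Way SET associative
NUM_SETS = 256       # assumption for illustration only


class CacheSet:
    def __init__(self) -> None:
        self.tags = [None] * NUM_WAYS          # None marks an invalid Way
        self.order = list(range(NUM_WAYS))     # least to most recently used


def split_address(addr: int):
    """Return (tag, set index) for a byte address."""
    line = addr // LINE_BYTES
    return line // NUM_SETS, line % NUM_SETS


def read_lookup(sets, addr: int):
    """Return the Way that hits, or None on a miss; mark the hit Way as MRU."""
    tag, index = split_address(addr)
    entry = sets[index]
    for way in range(NUM_WAYS):
        if entry.tags[way] == tag:
            entry.order.remove(way)
            entry.order.append(way)            # now most recently used
            return way
    return None


sets = [CacheSet() for _ in range(NUM_SETS)]
tag, index = split_address(0x1234)
sets[index].tags[2] = tag                      # pretend Way 2 holds the line
hit_way = read_lookup(sets, 0x1238)            # same 16-byte line
```

The recency list is a full LRU ordering, which is easier to read than a pseudo-LRU tree but is not claimed to match the device bit for bit.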


4.1.2 CACHEABLE READ MISS ACCESSES 


If the Cache Directory look-up results in a miss, the
82395DX transfers the request to the System Bus in
order to read the data from the main memory and to
update the cache. A full line is updated in each cache
update cycle. As a result of a cache miss, the
82395DX performs four System Bus accesses to
read four DWs from the DRAM, and writes the four
DWs to the cache. This is called a Line Fill cycle.
The first DW accessed in a Line Fill cycle is the
DW which the 386 DX Microprocessor requested;
the 82395DX provides the data and drives
READYO# one clock after it gets the first DW from
the System Bus.


The 82395DX provides the option of supporting 
burst bus in order to minimize the latency of a line 
fill. Also, the 82395DX provides the SKEN# input, 
which, if inactive, converts a Line Fill cycle to a non- 
cacheable cycle. Write protection is also provided. 
The write protection indicator is stored together with 
the TAG Valid bit and the TAG field of every line in 
the Cache Directory. For more details refer to chap- 
ter 6. 


The 82395DX features Line Buffer caching. In a
Line Fill the data for the four DWs is stored in a
buffer, the Line Buffer, as it is accumulated. After
filling the Line Buffer, the 82395DX performs the
Cache Update and the Cache Directory Update. If all
the Ways are valid, the updated Way is the least
recently used Way flagged by the Pseudo LRU
algorithm during the Cache Directory Lookup cycle. If
there is a non-valid Way, it will be updated.


SRDY# (System Bus READY#) active indicates
the completion of the System Bus cycle and
SBRDY# (System Bus Burst READY#) active
indicates the completion of a burst System Bus cycle. In
a 386 DX Microprocessor-like system, the 82395DX
drives the 386 DX Microprocessor READY# one
clock after the first SRDY# and, in a burst system,
one clock after the first SBRDY#. This frees up the
Local Bus, allowing the 386 DX Microprocessor to
execute the next instruction, while filling the cache.


So, during Line Fills, there is no advantage in driving
the 386 DX Microprocessor into the pipelined mode.
Therefore, the 82395DX does not drive the 386
DX Microprocessor's NA# at all. NA# must be
tied to VCC.


4.1.2.1 Burst Bus 


The 82395DX offers an option to minimize the latency
of Line Fills. This option is the burst bus, and it is
only applicable to Line Fill cycles. With a burst bus
compatible DRAM controller, one which generates
SBRDY# and SBLAST# to take advantage of the
82395DX’s burst feature, the number of cycles
required to complete a Line Fill is significantly
reduced. Details of burst Line Fills can be found in
chapter 6. The burst feature uses the i486
Microprocessor burst order to fill the 16 byte cache
line (see Table 6.1).
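As an illustration of the burst order, the sketch below generates the
four DW addresses of a Line Fill. It assumes that Table 6.1 matches
the standard i486 Microprocessor sequence, in which each later address
is the first address with bits A2 and A3 toggled in turn. The function
name burst_order is hypothetical.

```python
def burst_order(first_addr):
    """Return the four DW addresses of a burst Line Fill, in order.

    Assumes the standard i486 burst sequence: the requested DW first,
    then the remaining DWs of the 16 byte line, obtained by XORing
    the starting offset with 4, 8 and 0xC.
    """
    base = first_addr & ~0xF      # start of the 16 byte line
    start = first_addr & 0xC      # offset of the requested DW (A3, A2)
    return [base | (start ^ step) for step in (0x0, 0x4, 0x8, 0xC)]
```

For example, a miss on the DW at offset 8 of a line would be filled in
the order 8, C, 0, 4 under this assumption.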


4.1.3 CACHE WRITE ACCESSES 


The 82395DX supports the write buffer policy, which
means that main memory is always updated in any
write cycle. However, the cache is updated only
when the write cycle hits the cache and the accessed
address is not write protected. In cache write
misses, the cache is not updated (allocation on
writes is not supported).


The 82395DX has a write buffer of four DWs. Only
cacheable write cycles, except LOCKed writes, are
buffered. If the write buffer is not full, the
82395DX buffers the cycle: the data, address and
cycle definition signals are written into one entry
of the write buffer and the 82395DX drives
READYO# in the first T2, so all buffered write
cycles run without wait states. If the write buffer
is full, the 82395DX delays READYO# until the
first buffered write cycle has completed. The
execution of the buffered write cycles depends on
the availability of the System Bus. In non-buffered
write cycles, e.g. I/O writes, the 386 DX
Microprocessor is forced to wait until all the
buffered writes and the non-buffered write have
been executed (READYO# is driven one clock after
the SRDY# of the non-buffered write). More details
about the write buffer can be found in chapter 6.
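The buffering rule can be summarized with a minimal behavioral sketch.
The class and method names are hypothetical, and the model covers only
the four entry depth and the full/not-full decision described above;
LOCKed and non-cacheable writes, and all timing, are left out.

```python
from collections import deque

class WriteBuffer:
    """Hypothetical model of the four DW write buffer."""
    DEPTH = 4

    def __init__(self):
        self.entries = deque()

    def post(self, addr, data, byte_enables):
        """Try to buffer a write.

        Returns True if the write is accepted (zero wait states on the
        Local Bus) and False if the buffer is full, in which case the
        CPU must wait until an entry has been executed.
        """
        if len(self.entries) >= self.DEPTH:
            return False
        self.entries.append((addr, data, byte_enables))
        return True

    def drain_one(self):
        """Execute the oldest buffered write on the System Bus."""
        return self.entries.popleft() if self.entries else None
```

Writes leave the buffer in the order in which they were posted, which
is why a first-in first-out queue is used in the sketch.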


In cacheable non-write protected write hit cycles, 
only the appropriate bytes within the line are updat- 
ed. The updated bytes are selected by decoding the 
A2, A3 and the four BE# lines. The LRU is updated 
so that the hit Way is the most recently used, as in 
cache read hit cycles. 
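The byte selection can be sketched as follows. The function name is
hypothetical, and the byte enables are assumed to be passed as a four
bit value in which bit i carries the level of BEi# (active low, as on
the 386 DX Microprocessor bus).

```python
def updated_offsets(a3, a2, be_n):
    """Return the byte offsets (0..15) within the 16 byte line that a
    write hit updates.

    a3, a2 : levels of address lines A3 and A2 (select one of four DWs)
    be_n   : 4 bit value, bit i = level of BEi# (0 means byte enabled)
    """
    dw = (a3 << 1) | a2
    return [dw * 4 + i for i in range(4) if not (be_n >> i) & 1]
```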


All cacheable writes, whether hits or misses, are ex- 
ecuted on the system bus. The System Bus write 
cycle address, data and cycle definition signals are 
the same as the 386 DX Microprocessor signals. All 
buffered writes run with zero wait states if the write 
buffer is not full.


4.2 Noncacheable System Bus 
Accesses 


Non-cacheable cycles are any of the following
82395DX cycles:

1) All I/O cycles.

2) All LOCKed read cycles.

3) Halt/Shutdown cycles.

4) SRAM mode cycles not addressing the internal
cache or Tagram.


All the above cycles are defined as non-cacheable
by the Local Bus interface controller. In addition,
Line Fill cycles in which the SKEN# signal was
returned inactive are aborted. They are called
Aborted Line Fills (ALF).


Table 4.1 - 386 DX Microprocessor Bus Cycle Definition with Cacheability

M/IO#  D/C#  W/R#  386 DX Microprocessor    Cacheable/       Writes
                   Cycle Definition         Non-cacheable    Posted
  0     0     0    Interrupt Acknowledge    Non-cacheable    -
  0     0     1    Undefined                -                -
  0     1     0    I/O Read                 Non-cacheable    -
  0     1     1    I/O Write                Non-cacheable    No
  1     0     0    Memory Code Read         Cacheable        -
  1     0     1    Halt/Shutdown            Non-cacheable    -
  1     1     0    Memory Data Read         Cacheable        -


Non-cacheable cycles are never serviced from the
cache and they do not update the cache. They are
always referred to the System Bus. In non-cacheable
cycles, the 82395DX transfers the exact 386 DX
Microprocessor bus cycle to the System Bus.
Non-cacheable write cycles are never buffered.
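The cycle definition decoding can be sketched as a lookup on the
M/IO#, D/C# and W/R# levels, using the standard 386 DX Microprocessor
encoding. The names BUS_CYCLES and decode_cycle are hypothetical, and
the sketch ignores LOCK#, LBA# and SKEN#, each of which can make an
otherwise cacheable read non-cacheable.

```python
# (M/IO#, D/C#, W/R#) -> 386 DX Microprocessor cycle definition
BUS_CYCLES = {
    (0, 0, 0): "Interrupt Acknowledge",
    (0, 0, 1): "Undefined",
    (0, 1, 0): "I/O Read",
    (0, 1, 1): "I/O Write",
    (1, 0, 0): "Memory Code Read",
    (1, 0, 1): "Halt/Shutdown",
    (1, 1, 0): "Memory Data Read",
    (1, 1, 1): "Memory Data Write",
}

# Read cycles that may be serviced from the cache.
CACHEABLE_READS = {"Memory Code Read", "Memory Data Read"}

def decode_cycle(m_io, d_c, w_r):
    """Return (cycle name, True if it is a cacheable read)."""
    name = BUS_CYCLES[(m_io, d_c, w_r)]
    return name, name in CACHEABLE_READS
```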


Description of LOCKed cycles can be found in chap- 
ter 5. 


4.3 Local and System Bus 
Concurrency 


Concurrency between local and System Busses is 
supported in several cases: 


1. Read hit cycles can run while executing a Line Fill 
on the System Bus. Refer to timing diagram 4.1. 


2. Read hit cycles can run while executing buffered 
write cycles on System Bus. Refer to timing dia- 
gram 4.2. 


3. Write cycles are buffered while the System Bus is 
running other cycles, including other buffered 
writes. They are also buffered when another bus 
master is using the System Bus (e.g. DMA, other 
CPU). Refer to timing diagram 4.3. 


4. Read hit cycles can run while another System Bus
master is using the System Bus.


The first case is established by providing the data
which the 386 DX Microprocessor requested first.
The 82395DX then continues filling its line while
it is servicing new cache read hit cycles. The
82395DX updates its cache and cache directory
after completing the System Bus Line Fill cycle.
Meanwhile, any 386 DX Microprocessor read cycles
will be serviced from the cache if they hit the cache.
If consecutive 386 DX Microprocessor read cycles
request a double-word which belongs to the same
line currently being retrieved by the System Bus
Line Fill cycle, and the requested DW has already
been retrieved, the 82395DX provides the requested
DW in zero wait states (a Line Buffer hit). If the
requested DW has not yet been retrieved, it will be
read after completing the Line Fill.
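The Line Buffer behavior during a Line Fill can be sketched as below.
This is a hypothetical model, not a description of the internal
implementation: it only classifies a read that arrives while a line is
being filled, and the class name and return strings are made up for
illustration.

```python
class LineBuffer:
    """Hypothetical model of the four DW Line Buffer during a Line Fill."""

    def __init__(self, miss_addr):
        self.line_addr = miss_addr & ~0xF   # 16 byte line being filled
        self.dws = [None] * 4               # None = DW not yet retrieved

    def fill(self, addr, value):
        """Record one DW arriving from the System Bus."""
        self.dws[(addr >> 2) & 0x3] = value

    def read(self, addr):
        """Classify a CPU read issued while the Line Fill is in progress."""
        if (addr & ~0xF) != self.line_addr:
            return ("other line", None)       # normal cache lookup applies
        value = self.dws[(addr >> 2) & 0x3]
        if value is None:
            return ("wait for fill", None)    # DW not retrieved yet
        return ("line buffer hit", value)     # zero wait states
```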


The second and third cases are attained by having
the four DW write buffer which is described in
chapter 6. The READYO# signal is driven active
after latching the write cycle, so all buffered cycles
run without wait states. This releases the 386 DX
Microprocessor to issue a new cycle, which can also
run without wait states if it does not require System
Bus service. Two examples are a cache read hit
cycle and another buffered write cycle, neither of
which requires immediate System Bus service. In
the case of a write cycle to the same line currently
being retrieved, the write cycle will wait until the
Line Fill is complete and then the selected bytes
within the line are written in the cache. READYO#
is returned after the cache is written.


The second and third reads are to a different line and are serviced from the cache, while the fourth and fifth reads are to
the same line and are serviced from the line buffer.


Figure 4.1 - Read Hit Cycles During a Line Fill


Whenever the System Bus is released to any bus
master, the 82395DX activates the snooping
function. The maximum rate of snooping cycles is a
cycle every other clock. Although the snooping
support requires accessing the 82395DX cache
directory, the 82395DX is able to interpose the
cache directory accesses between the 386 DX
Microprocessor cycles and the snooping device such
that zero wait state cache read hit cycles are
supported. All the snooping cycles are also serviced.
This is how the fourth case is provided. For more
details, refer to chapter 6.


Figure 4.2 - Cache Read Hit Cycles while Executing a Buffered Write on the System Bus


Figure 4.3 - Buffered Write Cycles During a Line Fill


(A) - SWP# and SKEN# in Non-Pipelined Cycles

(B) - SWP# and SKEN# in Pipelined Cycles

Figure 4.4 - SWP# and SKEN# Timing


4.4 Disabling the 82395DX


Cacheability is resolved by the SKEN# input from 
the system side. In order to disable the cache it is 
recommended to deactivate SKEN# and FLUSH 
the cache. This would cause all memory reads to be 
detected as misses and to be transferred to the Sys- 
tem Bus. In order to disable the write buffer, NPI# 
must be asserted. 


4.5 System Description and Device 
Selection 


The expandability feature provides the following 
three configurations: 


1) 16KB cache with one 82395DX device.
2) 32KB cache with two 82395DX devices.
3) 64KB cache with four 82395DX devices.


In multi 82395DX configurations, the total Cache
Directory and cache are partitioned between the
various 82395DXs. For example, in the second
configuration, the first 82395DX includes the first
16KB cache and the first 1K tags while the second
82395DX includes the second 16KB cache and the
second 1K tags. Every 82395DX is programmed to
handle a portion of the cache and the Cache
Directory. The 82395DX selection is based on
decoding the address of the cacheable cycle.
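The device selection can be sketched from the address decoding listed
in Table 4.2 (A12 for two devices, A13 and A12 for four). The function
names are hypothetical, and the set index calculation assumes a 16
byte line with 256 SETs per device, as in that table.

```python
def select_device(addr, n_devices):
    """Return the number (1..4) of the 82395DX that handles `addr`.

    One device: no decoding. Two devices: A12. Four devices: A13, A12.
    """
    if n_devices not in (1, 2, 4):
        raise ValueError("supported configurations are 1, 2 or 4 devices")
    return ((addr >> 12) & (n_devices - 1)) + 1

def set_index(addr, n_devices):
    """Return the Cache Directory SET (0..256*n-1) for `addr`,
    assuming A4 and up index the SETs of a 16 byte line cache."""
    return (addr >> 4) & (256 * n_devices - 1)
```

With four devices, an address with A13 = 1 and A12 = 0 is handled by
device #3, which holds SETs 512-767.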


In a multi 82395DX system, one device must be
programmed as the Primary 82395DX to drive the
System Bus in System Bus cycles (non-cacheable
cycles, write cycles and also Line Fills). All other
82395DXs must be programmed as Secondary
82395DXs. They drive only the SADS# signal in
Line Fill cycles. All other System Bus signals are
driven by the Primary 82395DX. System diagram 4.6
describes the 64KB cache system. In the Local Bus,
each 82395DX gets the 386 DX Microprocessor
address, control and data signals. In cacheable
reads, hits or misses, the selected 82395DX drives
READYO# and the local data bus. In all other
cycles, the Primary drives these signals. The
READYO#s of all the 82395DXs are wire-ORed
together and they can be logically ORed with the
READYO#s of local bus devices. An external
pull-up must exist on the 82395DX READYO# to
sustain it high. The selected 82395DX drives
READYO# low and keeps it low while READYI# is
not sampled active. As soon as READYI# is
sampled active, the selected 82395DX drives
READYO# high for one phase and floats it in the
next phase. Therefore, zero wait state cycles are
supported.


In the System Bus, the Primary 82395DX drives all
the System Bus outputs except SADS#. SADS# is a
wire-ORed signal which is driven by the Primary
82395DX in non-cacheable reads and in write
cycles. In Line Fill cycles, SADS# is driven by the
selected 82395DX which requires the Line Fill. A
pull-up is required to sustain SADS# high while it
is not driven.


4.6 Auto Configuration 


The 82395DX configures itself automatically during
the first ten clocks after the falling edge of RESET.
Information on the system configuration is passed to
the 82395DXs through their configuration pin
(CONF#), by connecting them as follows:


1. The configuration pin of the first 82395DX
(primary) must be connected to GND.


2. The configuration pin of the second 82395DX
(optional) must be connected to the RESET signal.


3. The configuration pin of the third 82395DX
(optional) must be connected to the READYO#
signal.


4. The configuration pin of the fourth 82395DX
(optional) must be connected to VCC.


The auto configuration process works as follows:


1. If the 82395DX senses the configuration pin low
during RESET, the device is configured as device
#1 (primary).


2. Otherwise, if the 82395DX senses the
configuration pin low one clock cycle after RESET,
the device is configured as device #2, and issues a
READYO# pulse for one clock period.


3. Otherwise, if the 82395DX senses the
configuration pin low three clock cycles after RESET
is sensed low, the device is configured as device #3.


4. Otherwise, the device is configured as device #4,
and issues a READYO# pulse for one clock period.


All the 82395DXs in the system monitor the number
of pulses on READYO# during the first 4 clocks
after RESET, to determine how many 82395DXs are
present.


1. If no pulse is sensed, there is only one
82395DX.


2. If one pulse is sensed, there are two 82395DXs in
the system.


3. If two pulses are sensed, there are four 82395DXs
in the system.
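The two decisions made during self configuration can be sketched as
follows. The function names and the boolean arguments are hypothetical;
each argument stands for one of the CONF# samples described above, and
no clock-level timing is modeled.

```python
def device_number(low_during_reset, low_one_clock_after,
                  low_three_clocks_after):
    """Return the device number (1..4) from three CONF# samples.

    Each argument is True if CONF# was sensed low at that point.
    """
    if low_during_reset:
        return 1          # CONF# tied to GND: primary
    if low_one_clock_after:
        return 2          # CONF# tied to RESET
    if low_three_clocks_after:
        return 3          # CONF# tied to READYO#
    return 4              # CONF# tied to VCC

def devices_present(readyo_pulses):
    """Return the number of 82395DXs from the READYO# pulse count.

    Only devices #2 and #4 issue a pulse, so the count is 0, 1 or 2.
    """
    return {0: 1, 1: 2, 2: 4}[readyo_pulses]
```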
Figure 4.5 - Self Configuration of Four 82395DXs


4.7 Address Mapping


Table 4.2 shows the cache address configurations
for 16K, 32K, and 64K cache sizes.


4.8 Multi 82395DX Operation 
Description 


The following is a description of each cycle in a
multi-82395DX environment:


LOCAL BUS CYCLES: Cycles to any local bus device
(e.g. 387 DX Math Coprocessor). The Primary
82395DX drives READYO# in 387 DX Math
Coprocessor accesses after one wait state, unless
READYI# was sampled active one clock earlier. All
the Secondary 82395DXs are idle.


CACHEABLE READ HIT: This is the only 82395DX
cycle which does not require System Bus service. In
this cycle, the selected 82395DX drives the local
data bus and READYO# in T2. It also updates
its LRU bits.


CACHEABLE READ MISS: As soon as the System
Bus is available, the selected 82395DX, which
detected the miss, drives SADS#. In parallel, the
Primary 82395DX drives the System Bus address
and control signals. After receiving the first SRDY#
or SBRDY# and after sampling SKEN# active,
the selected 82395DX samples the system data and
one clock later provides it to the 386 DX
Microprocessor and drives READYO# active. Then
it continues filling the line and, after collecting the
four DWs, it updates its cache and Cache Directory.


CACHEABLE WRITE HITS: The selected 82395DX
updates its cache, except for write protected cycles.
The Primary 82395DX, however, executes the write
cycle on the System Bus. Notice that both the
Primary and Secondary 82395DXs have the same
write buffer and both handle the cycle in the same
way, but the Primary 82395DX is the one which
drives the System Bus signals, including SADS#,
and READYO#. All other cycles, i.e. cacheable
write misses and non-cacheable cycles, are handled
only by the Primary 82395DX.


4.9 Signal Driving in Multi 82395DX 
Environment 


4.9.1 Local Bus Signals 


In the Local Bus, the data bus and the READYO#
signals are the only signals driven by more than one
82395DX.


1. READYO#: normally not driven (floated), and
must be sustained by an external pull-up. In
cacheable reads, the selected 82395DX drives
READYO# active until READYI# is sensed
active, then it drives READYO# inactive for one
clock phase and then floats it. In other cycles, the
Primary 82395DX drives READYO# in the same
manner.


2. Data Bus: The selected (or Primary) 82395DX
drives the data bus in the T2 state of read cycles,
which ensures no contention with the 386 DX
Microprocessor when a write cycle follows a read
cycle.


4.9.2 SYSTEM BUS SIGNALS 


In the System Bus, the Primary 82395DX drives all
the System Bus signals except SADS#. So, the
jeopardy of contention exists on the SADS# signal
only. SADS# is normally not driven (floated), and
must be sustained by an external pull-up. Every
82395DX, Primary or Secondary, after driving
SADS# active in ST1 or ST2P, will drive it inactive
for one clock phase in ST2 or ST1P, and float it
afterwards.


In Line Fills, SADS# is driven by the selected
82395DX which detected the miss. In all other
cycles, e.g. write cycles, SADS# is driven by the
Primary 82395DX.


Table 4.2 - Address Mapping for 1-4 82395DX Systems

Total Devices  Device  Primary/   Address     Cache Data   Cache Directory
in System              Secondary  Decoding    Mapping      SETs
1              #1      Primary    -           0KB-16KB     0-255
2              #1      Primary    A12#        0KB-16KB     0-255
               #2      Secondary  A12         16KB-32KB    256-511
4              #1      Primary    A13#*A12#   0KB-16KB     0-255
               #2      Secondary  A13#*A12    16KB-32KB    256-511
               #3      Secondary  A13*A12#    32KB-48KB    512-767
               #4      Secondary  A13*A12     48KB-64KB    768-1023


4.10 SHOLD/SHLDA/SBREQ 
Arbitration Mechanism 


The Primary 82395DX is responsible for handling the
SHOLD/SHLDA/SBREQ mechanism. Assuming
that SHOLD is acknowledged, the Primary
82395DX floats all its outputs immediately after
completing the System Bus cycle in which SHOLD
was activated and it drives SHLDA active. This
enables the bus master to get control of the bus.
When the bus master completes its cycles, it drives
the SHOLD signal inactive. Then the Primary
82395DX gets the bus back by driving SHLDA
inactive.


The Secondary 82395DXs get the SHOLD input in
order to monitor the bus activity but they do not
drive SHLDA. Secondary 82395DXs do not drive
SADS# in Hold states. The Primary 82395DX drives
the SBREQ signal in all System Bus cycles. In Line
Fill cycles, the SBREQ signal is driven active one
clock later than in other cycles. This applies only
when the System Bus is not available. If the System
Bus is available, SBREQ will not be driven in Line
Fill cycles. For more details about system
arbitration, refer to Chapter 6.


4.11 System Description 


A 386 DX Microprocessor/82395DX-based system
includes the processor, optional Local Bus devices
(e.g. 387 DX Math Coprocessor), cache system (one
82395DX or more) and System Bus devices (memory,
I/O devices and other non-cacheable devices).
The 82395DX is the interface between the Local
Bus and the System Bus.


A Local Bus address decoder must be used to
generate the LBA# and NPI# signals, and a System
Bus address decoder must be used to generate the
SKEN# and SWP# signals.


The 82395DX READYO# may be logically ORed
with the READYO#s of other Local Bus devices.
However, this is not required unless a Local Bus
device other than the 387 DX Math Coprocessor
exists on the Local Bus (the 82395DX generates a
READY signal for the 387 DX Math Coprocessor).
The 386 DX Microprocessor READY# input signal
must also be driven to the 82395DX READYI# pin,
so that the 82395DX will be able to track the Local
Bus cycles correctly.


Table 4.3 - Local Bus Signal Connections
in Multi-82395DX Systems

Primary           Each 82395DX
82395DX Only      in the System
-                 CLK2
                  D0-D31
                  A2-A31
                  RESET
                  BE0-3#
                  W/R#
                  D/C#
                  M/IO#
                  LOCK#
                  ADS#
                  READY#
                  LBA#
                  NPI#
                  FLUSH#
                  A20M#
                  CONF#
                  READYO#


To allow for expanding the cache system beyond
16KB, up to four 82395DX devices may be connected
in parallel. Two 82395DX outputs are wire-ORed
between the parallel 82395DXs: READYO# and
SADS#. Each 82395DX’s CONF# input must
be tied to a different signal, to program each one of
them to a distinct address decoding.


Figure 4.6 describes a maximum 386 DX Microproc- 
essor/82395DX system, with 387 DX Math Coproc- 
essor, four 82395DX devices, READY # generation 
logic and Local Bus/System Bus address decoders. 


Note that optional elements in Figure 4.6 are drawn
with dotted lines. The Local Bus includes CLK2,
RESET, BE3#-BE0#, A2-A31, D0-D31, W/R#,
D/C#, M/IO#, LOCK# and ADS#. The System
Bus can be broken into two groups: those pins
connected only to the Primary 82395DX (82395DX
#1) and those connected to each 82395DX in the
system (82395DX #1-#4). See Table 4.4.


Table 4.4 - System Bus Signal Connections
in Multi-82395DX Systems

Primary                Each 82395DX
82395DX Only           in the System
SA2-3       O          SD0-31
SW/R#       O          SA4-31*
SD/C#       O          SADS#
SM/IO#      O          SRDY#
SLOCK#      O          SBRDY#
SBREQ       O          SNA#
SHLDA       O          SHOLD
SBLAST#     O          SAHOLD
SBE0-3#     O          SEADS#
                       SFHOLD#
                       SKEN#
                       SWP#

*SA4-31 are connected to each 82395DX in the system
for snooping purposes but are driven only by the Primary
82395DX.


Figure 4.6 - System Description


5.0 PROCESSOR INTERFACE


The 82395DX runs synchronously with the 386 DX 
Microprocessor. It is a slave on the Local Bus, and it 
buffers between the Local Bus and the System Bus. 
Most of the 82395DX cycles are serviced from the 
internal cache, and some (82395DX cache misses, 
non cacheable accesses, etc.) require an access to 
the System Bus to complete the transaction. 


To achieve maximum performance, the 82395DX
serves cache hits and buffered write cycles in zero
wait state, non-pipelined cycles. The 82395DX
requires that the CPU is never driven to pipelined
cycles, i.e. the 386 DX Microprocessor NA# input
must be strapped to the inactive (high) state.


The 82395DX is directly connected to all Local Bus
address and data lines, byte enable lines, and bus
cycle definition signals. The 82395DX returns
READYO# to the 386 DX Microprocessor, and
keeps track of the 386 DX Microprocessor cycle
status by receiving READYI# (which is the 386 DX
Microprocessor READY#).


A multi 82395DX system description was presented 
in chapter 4. 


5.1 Hardware Interface 


The 82395DX requires minimal hardware on the
Local Bus: only the 386 DX Microprocessor, other
Local Bus resources (e.g. 387 DX Math
Coprocessor) and the 82395DX(s) (one to four,
depending on the system). Ready logic and a Local
Bus decoder are optional, since the user can
wire-OR the READYO#s and tie LBA# and NPI#
high if no addresses are to be local or non-buffered.
The SRAM and buffers have been integrated on
chip to simplify the design. Refer to Figure 4.6.


5.2 Nonpipelined Local Bus 


The 82395DX does not pipeline the Local Bus.
READYO# is returned to the 386 DX Microprocessor
one cycle after SRDY# or SBRDY# is driven
into the 82395DX for the first DW of a Line Fill.
This allows the Local Bus to be free to execute 386
DX Microprocessor cycles while the System Bus fills
the cache line (see chapter 6). This takes away the
advantage gained by pipelining the Local Bus.


5.3 Local Bus Response to Hit Cycles


The 82395DX’s Local Bus responses to hit cycles are
described here:


1) Cache Read Hit (CRDH) Cycle - READYO#
is returned in T2. The data is valid to the
386 DX Microprocessor on the rising edge of
CLK2.


2) Cache Write Hit (CWTH), Buffered - As in
CRDH cycles, the 82395DX returns READYO# in
T2 so that the cycle runs with zero wait states on
the Local Bus. The write cycle is placed in the
write buffer and will be performed when the
System Bus is available. If the System Bus is on
HOLD, up to four write cycles can be buffered
before introducing any wait states on the Local
Bus.


3) CWTH, Non-Buffered - In the case of a
non-buffered write hit cycle the write buffers cannot
be used, so the 386 DX Microprocessor must wait
until the System Bus is free to do the write.
READYO# is returned the cycle after SRDY#
is driven to the 82395DX.


5.4 Local Bus Response to Miss 
Cycles 


In a Cache Read Miss (CRDM) cycle a Line Fill is
performed on the System Bus. READYO# is
returned to the 386 DX Microprocessor one cycle after
SRDY# or SBRDY# for the first DW of the Line Fill
is driven into the 82395DX.


5.5 Local Bus Control Signals -
ADS#, READYI#


ADS# and READYI# are the two bus control inputs
used by the 82395DX to determine the status of the
Local Bus cycle. ADS# denotes the beginning of a
386 DX Microprocessor cycle and READYI# is the
386 DX Microprocessor cycle terminator.


ADS# active and M/IO# = 1 invokes a look-up
request to the 82395DX’s cache directory; the
look-up is performed in the T1 state. The Cache
Directory access is simultaneous with all other cycle
qualification activities, so the hit/miss decision
becomes the last step in the cycle qualification
process. This parallelism enhances performance, and
enables the 82395DX to respond to ADS# within one
clock period. If the cycle is to a Local Bus device
(LBA# asserted) or is non-cacheable, the hit/miss
decision is ignored.


5.6 82395DX’s Response to the 386 DX
Microprocessor Cycles


Tables 5.2 - 5.4 show the 82395DX’s response to
the various 386 DX Microprocessor cycles. They
depict the activity in the internal cache, cache
directory, the System Bus and write buffers in
response to various cycle definition signals. Special
cycles such as LOCK, HALT/SHUTDOWN, WP, LBA
and NPI are discussed separately below.


5.6.1 LOCKED CYCLES 


The 386 DX Microprocessor LOCK#ed cycles are
all those cycles in which LOCK# is active. The
82395DX forces all LOCK#ed cycles to run on the
System Bus. The 82395DX starts the LOCK#ed
cycle after it has emptied its write buffers.


If the LOCK#ed cycle is cacheable the 82395DX
will respond as follows (see table 5.2):


Cache Read Miss (CRDM) - handled like a
non-cacheable cycle.


Cache Read Hit (CRDH) - handled like a
non-cacheable cycle (LRU bits are not updated).


Cache Write Miss (CWTM) - the cache is not
updated, the write is not buffered.


Cache Write Hit (CWTH) - the cache is updated if
the line is not write protected. Note that this write
is not buffered even though it is cacheable. The
LRU mechanism is updated.


If the LOCK#ed cycle is non-cacheable (e.g. I/O
cycle, INTA cycle) then it will be performed as a
common non-cacheable cycle with the addition of
asserting SLOCK# on the System Bus.


Conceptually, a LOCK# cycle on the Local Bus is
reflected into an SLOCK# cycle on the System Bus.
Detailed timing considerations were presented in
chapter 3. SLOCK# becomes inactive only after
LOCK# has become inactive. If there are idle clocks
in between the LOCK#ed cycles but LOCK# is still
active, SLOCK# will remain active as well. A
consequence of this is that SLOCK# is negated one
clock after LOCK# is negated.


During LOCK#ed cycles on the System Bus (i.e. when
the SLOCK# signal is active), the 82395DX does not
acknowledge hold requests, so the whole sequence
of LOCK#ed cycles will run without interruption by
another master.


Note that when a LOCK#ed LBA# cycle runs on
the Local Bus, and the System Bus is idle and not in
the HLDA state, SLOCK# will be asserted even though
the LBA# cycle will not be transferred to the System
Bus.


5.6.2 I/O, HALT/SHUTDOWN


I/O and HALT/SHUTDOWN cycles are handled as
non-cacheable cycles. They are neither cached nor
kept in the write buffer. The 386 DX Microprocessor
HALT/SHUTDOWN cycles are memory write cycles
to code area (i.e. M/IO#=1, D/C#=0). The
82395DX completes I/O and HALT/SHUTDOWN
cycles by returning READYO# after receiving
SRDY#.


5.6.3 LBA# CYCLES 


LBA# cycles are all the 386 DX Microprocessor
cycles in which LBA# is active, or all cycles in which
the 387 DX Math Coprocessor or Weitek 3167
Floating-Point Coprocessor is addressed. A CPU
access to I/O space with A31=1 is decoded as a
387 DX Math Coprocessor access. A CPU access to
memory space C0000000H through C1FFFFFFH is
decoded as a Weitek 3167 Floating-Point
Coprocessor access, provided that the Weitek
decoding is enabled.
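The two internal coprocessor decodes can be sketched as follows. The
function name and the weitek_enabled flag are hypothetical; how the
Weitek decoding is enabled is not covered here, and the external LBA#
input is not modeled.

```python
def coprocessor_access(m_io, addr, weitek_enabled=False):
    """Classify a CPU access using the coprocessor decodes.

    m_io : level of M/IO# (0 = I/O space, 1 = memory space)
    addr : 32 bit address
    Returns "387", "weitek" or None.
    """
    if m_io == 0 and (addr >> 31) & 1:
        return "387"      # I/O space with A31 = 1
    if m_io == 1 and weitek_enabled and 0xC0000000 <= addr <= 0xC1FFFFFF:
        return "weitek"   # memory space C0000000H-C1FFFFFFH
    return None
```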


When an LBA# cycle is detected all other attributes
are ignored. If a 387 DX Math Coprocessor access is
decoded, READYO# is activated as described in
section 5.7. No other activity takes place.


5.6.4 NPI# CYCLES 


NPI# cycles are all the 386 DX Microprocessor
memory write cycles in which NPI# is active. In
response to a cycle with NPI# active, the 82395DX
first executes all pending write cycles in the write
buffer (if any), and then executes the current write
cycle on the System Bus. READYO# is returned to
the CPU only after SRDY# for the current write
cycle is returned to the 82395DX.


All NPI# cycles must have at least one wait state on
the System Bus or be done to non-cacheable
memory.


NPI# is ignored for read cycles, as well as for all
write cycles that cannot be buffered.


5.6.5 LBA#/NPI# TIMING


These inputs must be valid throughout the 386 DX
Microprocessor bus cycle, namely in T1 and all T2
states (see Figure 5.1).


Figure 5.1 - Valid Time of LBA# and NPI#


5.7 82395DX READYO# Generation 


The 82395DX READYO# generation rules are listed 
below: 


CRDH cycles (non-LOCK#ed) - READYO# is
activated during the first T2 state, so the cycle runs
with zero wait states.


CRDM cycles - READYO# is returned one clock
after the first SRDY# or SBRDY#.


Non-cacheable reads - READYO# is returned one
clock after SRDY# or SBRDY#.


All cacheable writes (with the exception of
LOCK#ed writes) are buffered. These cycles may
be divided into two categories:


(a) The first four write cycles, while the write buffer
is not full - READYO# is returned in zero wait
states. The address and the data are registered in
the write buffer.


(b) When the write buffer is full - READYO# is
delayed until one clock after the SRDY# or
SBRDY# of the first write cycle in the buffer. In
other words, the fifth write waits until there is one
vacant entry in the write buffer.


Non-cacheable writes (plus LOCK#ed writes) -
these writes are not buffered. READYO# is returned
one clock after SRDY# or SBRDY# of the same
cycle.


READYO# activation during SRAM mode is
described in Chapter 7. READYO# activation during
self configuration is listed in Chapter 4.


In all 387 DX Math Coprocessor accesses, the
82395DX monitors READYI#. If it was not activated
immediately after ADS#, READYO# will be
activated in the next clock, i.e. a one wait state cycle.
So, the 82395DX READYO# can be used to terminate
any 387 DX Math Coprocessor access.


Note that the timing of the 82395DX’s READYO#
generation for 387 DX Math Coprocessor cycles is
incompatible with 80287 timing. When activated,
READYO# remains active until READYI# is
sampled active. This procedure enables adding control
logic to control the 386 DX Microprocessor
READYI# generation (see Figure 5.2).


Figure 5.2 - Externally Delayed READY


In a multi-82395DX system, each device on the
Local Bus must be able to return READYO#.
Therefore, READYO# is wire-ORed on the Local Bus.
READYO# is normally floated, and it is connected
to the positive power supply by a pull-up resistor. An
external OR gate ORs the 82395DXs’ READYO#s
with the READYO# of all other Local Bus devices.


5.8 A20 Mask Signal 


The A20M# signal is provided to allow for emulation
of the address wraparound at 1 MByte which occurs
on the 8086. The A20M# pin is synchronized
internally by the 82395DX, then ANDed with the
A20 input pin. The product of synchronized A20M#
and A20 is presented to the rest of the 82395DX
logic, as shown in Figure 5.3.


A20M# must be valid two clock cycles before
ADS# is sampled active by the 82395DX, and must
remain valid until after READYI# is sampled active
(see Figure 5.4).
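The masking amounts to a single AND of A20 with the (active low)
A20M# level. The sketch below applies it to a full address value; the
function name is hypothetical and the internal synchronization of
A20M# is not modeled.

```python
def mask_a20(addr, a20m_n):
    """Return the address seen by the rest of the 82395DX logic.

    a20m_n is the level of A20M#: 0 (asserted) forces A20 to 0,
    1 (negated) passes A20 through unchanged.
    """
    if a20m_n == 0:
        return addr & ~(1 << 20)
    return addr
```

With A20M# asserted, an access just above 1 MByte wraps to the bottom
of memory, as it would on the 8086.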


Figure 5.3 - A20 Mask Logic


Figure 5.4 - Valid Time of A20M#


5.9 82395DX Cycle Overview 


Table 5.1 - 386 DX Microprocessor Bus Cycle Definition

M/IO#  D/C#  W/R#  386 DX Microprocessor Cycle Definition
  0     0     0    Interrupt Acknowledge
  0     0     1    Undefined
  0     1     0    I/O Read
  0     1     1    I/O Write
  1     0     0    Memory Code Read
  1     0     1    Halt/Shutdown
  1     1     0    Memory Data Read
  1     1     1    Memory Data Write


Table 5.2 describes the activity in the cache, in the Tagram, on the System Bus and in the write buffers. The
cycles are defined in table 5.1. Table 5.2 is sorted in descending order: the more dominant the attribute, the
higher it is located. For example, if the cycle is both LBA# and I/O it is considered an LBA# cycle. Table 5.2 is
for non test modes.



Cache 


Non Cacheable Cycle 


Memory Write 


Memory Write 




Update | Line Fill 


Noncacheable Read 
No Line Fill 

Memory Write 
Memory Write 
Memory Write 
Memory Write 


Table 5.2 - Activity by Functional Groupings

Cycle Type | Cache | TAGRAM (TAG)

1. LBA & 387/Weitek Cycles
2. I/O Write, I/O Read, Halt/Shutdown, INTA, LOCK#ed Read
3. LOCK#ed Write Hit | | Update
4. LOCK#ed Write Hit | Cache Write | Update
5. LOCK#ed Write Miss | N/A
6. Other Read Hit | N/A | Cache Read | Update
7. Other Read Miss, SKEN# Active | N/A | Cache Write | Update
8. Other Read Miss, SKEN# Inactive | N/A
9. Other Write Hit, NPI# Inactive | | Update
10. Other Write Hit, NPI# Active | | Update
11. Other Write Hit, NPI# Inactive | Cache Write | Update
12. Other Write Hit, NPI# Active | Cache Write | Update
13. Other Write Miss, NPI# Inactive | N/A
14. Other Write Miss, NPI# Active | N/A
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- Table 5.3 describes line buffer hit cycles. Hit/miss here means to the specific DW in the line buffer. 


Table 5.3. Activity in Line Buffer Hit Cycles

TAGRAM (LRU | TAG) | System Bus | Posted Write

Memory Write

Memory Write

Memory Write
Table 5.4 describes the line buffer hit cycles when the Line Fill is interrupted (by FLUSH#, a snoop hit to the
line buffer, or an interrupted burst, even if the Line Fill continues on the System Bus in the first two cases). The
table includes only the cycles which wait to the end of the Line Fill or to the CPU cache update. Hit/Miss here
means to the right DW in the line buffer.


Table 5.4. Activity in the Line Buffer During ALF Cycles

Cycle Type | TAGRAM | System Bus | Posted Write

24. Read Miss (Restart) | N/A | Cache Write | Update | Replace | Line Fill | N/A
25. Other Write, NPI# Inactive | N/A | Memory Write | Yes
26. Other Write, NPI# Active | N/A | Memory Write

Table 5.5 depicts the 82395DX Test Cycles.

Table 5.5. Activity in Test Cycles

27. High Impedance | N/A
28. SRAM Mode Read, Add 256K-512K | N/A
32. SRAM Mode Read, Add <> 256K-512K | N/A | N/A
33. SRAM Mode Write, Add <> 256K-512K | N/A | N/A
Remarks for Tables 5.2 through 5.5:

1. READYO# is active in the first T2 (in read cycles;
   in write cycles it depends on whether the write
   buffer is full).

2. READYO# is active one clock cycle after
   SRDY#/SBRDY# of this cycle is asserted. In
   case of a Line Fill, READYO# is active one clock
   cycle after the first SRDY#/SBRDY# of this cycle
   is asserted.

3. READYO# is active immediately after the current
   Line Fill is finished.

4. READYO# is active after the previous Line Fill and
   the write cycle are terminated by SRDY# or
   SBRDY#, and the cache is updated.

5. READYO# is active after the cache is updated for
   the previous Line Fill, or after the Line Fill is
   aborted.

6. READYO# is active on the third T2 (2 wait states)
   if the write buffer is not full.

7. "OTHER" means the cycle does not fall within
   the first five categories.


6.0 SYSTEM BUS INTERFACE 


The System Bus (SB) interface is similar to the 386 
DX Microprocessor interface. It runs synchronously 
to the 386 DX Microprocessor clock. In general, the 
interface is similar to the 82385 in terms of: System 
Bus pipelining, snooping support and write cycle 
buffering. In addition, the following enhancements 
are provided: 


1) Line Fill buffer.

2) Optional burst Line Fill.

3) System cacheability attribute, SKEN#.

4) System Write Protection attribute, SWP#.

5) The SBREQ/SHOLD/SHLDA arbitration mechanism
   to support multi-master systems.

6) The SEADS# snooping mechanism to support
   concurrency on the System Bus and on the general
   purpose bus.

7) SFHOLD# mechanism to resolve deadlocks in
   multiprocessing systems.

8) Four Double-Word write buffer (16 bytes).

9) SNENE# (System NExt NEar) function to simplify
   the design of page mode DRAM systems and
   save wait states.

Table 5.5 (continued)

Cycle Type | TAGRAM | System Bus | Posted Write

29. SRAM Mode Read, Add 256K-512K | N/A | 1. | Cache
30. SRAM Mode Write, Add 256K-512K | N/A | LRU WR | TAG WR
Noncacheable Cycle | No | 2
Noncacheable Cycle | N/A


The 82395DX System Bus interface has identical
bus signals to the 386 DX Microprocessor bus. It has
the bus control signals (SADS#, SRDY# and
SNA#), the cycle definition signals (SLOCK#,
SW/R#, SD/C# and SM/IO#), the address and
byte enable signals (SA2-SA31 and SBE0#-SBE3#)
and the data signals (SD0-SD31). In addition,
the 82395DX has the SBRDY# signal for burst
support, the SKEN# signal for the system
cacheability attribute, the SWP# signal for the
system Write Protection attribute, the SAHOLD and
SEADS# signals for snooping support, the SBREQ,
SHOLD and SHLDA signals for system arbitration,
and SNENE# for DRAM hook-up. The 82395DX also
provides a signal, SBLAST#, which, when
asserted, indicates that the current cycle is the last
cycle in a burst transfer.
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The 82395DX System Bus interface can support any 
device (non-cacheable, I/O or cacheable memory)
with any number of wait states. The 82395DX is able
to support one clock burst cycles. The 82395DX's
System Bus state machine is similar to the 386 DX
Microprocessor bus state machine (refer to the
"386 DX Microprocessor data sheet"). Note that
during a burst Line Fill, the 82395DX remains in the
ST2 state until SRDY# or SBRDY# is asserted for the
fourth cycle of the burst transfer. Figure 6.1
describes the 82395DX's System Bus state machine.


Figure 6.1 - SB State Machine (RDY = SRDY# + SBRDY#)

6.1 System Bus Cycle Types

The following five types of SB cycles are supported:

1) Buffered write cycle
2) Non-buffered write cycle
3) Buffered/non-buffered write protected cycles
4) Non-cacheable read cycle
5) Cacheable read cycle

6.1.1 BUFFERED WRITE CYCLE

All the cacheable write cycles, except LOCK#ed
write cycles or non-buffered write cycles (as
indicated by the NPI# pin sampled active), are
buffered. These cycles are terminated on the Local
Bus before they are terminated on the System Bus.

The following Figures (6.2 - 6.3) include waveforms
of several cases of buffered write cycles.

The 82395DX has a four DW deep write buffer, but
five write cycles can be buffered if one of the
buffered writes is being executed.


Figure 6.2 - Single Buffered Write Cycle

NOTE:
READYO# #6 waits until SRDY# #1 is sampled.

Figure 6.3 - Multiple Buffered Write Cycles During System Bus HOLD

6.1.2 NON-BUFFERED WRITE CYCLE

These cycles are terminated on the System Bus one
clock before they are terminated on the Local Bus.

The following Figures (6.4 - 6.5) include waveforms
of several cases of non-buffered write cycles.

Figure 6.4 - I/O Write Cycle

NOTE:
While SLOCK# is active, the SHOLD input is ignored.

Figure 6.5 - LOCK#ed "Read Modify Write" Cycle

6.1.3 WRITE PROTECTED CYCLES

The Write Protection attribute is provided by the
System Bus SWP# input. SWP# is sampled with
the first SRDY# or SBRDY# in every Line Fill cycle.
The write protection indicator is registered in the
Cache Directory together with the TAG address and
TAG Valid bit of every line. In every cacheable write
cycle, the write protection indicator is read
simultaneously with the Hit/Miss decision. If the
write cycle is a hit and the write protection indicator
is set, the cache will not be updated. In all other
cases, the write protection indicator is ignored.

6.1.4 NON-CACHEABLE READ CYCLE

Non-cacheable read cycles are terminated on the
System Bus one clock before they are terminated on
the Local Bus.

The following Figures (6.6 - 6.7) include waveforms
of several cases of non-cacheable read cycles.

Figure 6.6 - I/O Read Cycle

NOTE:
While SLOCK# is active, the SHOLD input is ignored.
Even if the System Bus is in its idle state, SLOCK# is active because LOCK# is active.

Figure 6.7 - INTA LOCK#ed Cycle
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6.1.5 CACHEABLE READ MISS CYCLES 


The 82395DX attempts to start a Line Fill for
non-LOCK#ed CRDM cycles. However, a Line Fill will
be converted into a single read cycle if the access is
indicated as non-cacheable by the SKEN# signal.

CRDM cycles start as a System Bus read cycle.
READYO# is returned to the 386 DX Microprocessor
one clock cycle after the System Bus read cycle
is terminated.

One CLK cycle before the first SNA#, SRDY# or
SBRDY# of the system read cycle, the SKEN# input
is sampled. If active, the read miss cycle continues
as a Line Fill cycle, and three additional DWs
are read from the memory into the 82395DX. Also,
the SWP# input will be sampled with the first
SNA#, SRDY# or SBRDY# so the WP flag of the
line will be updated in the Cache Directory.


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6.1.5.1 Aborted Line Fill (ALF) Cycles. 


The System Bus can respond that the area of memory
included in a particular request is non-cacheable,
by returning SKEN# inactive. As soon as the
82395DX samples SKEN# inactive, it converts the
cycle from a cache Line Fill, which requires
additional read cycles to be completed, to a single
cycle.

In this case SBLAST# will stay active. Also, the
82395DX will not generate another system cycle for
the same Line Fill, because the cycle has already
been finished by the first SBRDY# or SRDY# after
SKEN# was sampled inactive.

Figure 6.8 includes waveforms of an ALF cycle.

Figure 6.8 - Aborted Line Fill Cycle

Figure 6.9 - Line Fill Without Burst or Pipeline

Burst Mode Line Fill followed by a Line Buffer Hit Cycle

6.1.5.2 Line Fill Cycles

A Line Fill transfer consists of four back-to-back read
cycles. Three types of Line Fill cycles are supported:

1. Non-pipelined, non-burst, SNA# inactive.

2. Pipelined, non-burst, SNA# active.

3. Burst, non-pipelined, SNA# inactive, SRDY#
   inactive, SBRDY# active.

Note that a pipelined burst cycle is not supported.
When SNA# is sampled active, SBRDY# is treated
as SRDY#.


The 82395DX supports burst cycles in system Line
Fills only. Burst cycles are designed to allow fast Line
Fills by allowing consecutive read cycles to be
executed at a rate of one DW per clock cycle. In burst
cycles SADS# is pulsed for one clock cycle while
the address and control lines are valid until the
transfer is completed. SA2-3 are updated every bus
cycle during the burst transfer.


The 82395DX starts the Line Fill as a normal read
cycle, and waits for SBRDY# or SRDY# to be
returned active. If SNA# is sampled active at least
one clock cycle before either SBRDY# or SRDY#,
the Line Fill will be non-burst pipelined (see Figure
6.10). If SNA# is sampled active at the same clock
cycle as SBRDY# or SRDY#, the Line Fill will be
non-burst, non-pipelined.


If SKEN# is sampled inactive one clock before
either SNA#, SBRDY# or SRDY#, then the access
is considered non-cacheable and the Line Fill will not
be executed (see Figure 6.8). Otherwise, if SRDY# is
sampled active, the Line Fill cycle resumes as a
non-burst sequence of three more cycles (see Figure
6.9). Finally, if SBRDY# and SKEN# are sampled
active (and SNA# and SRDY# are sampled inactive),
then the Line Fill cycle will be a burst cycle
(see Figures 6.11 - 6.12).
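
The sampling rules above can be summarized as a small decision function. This is a simplified sketch that collapses the clock-by-clock timing into boolean flags; the enum, the function name and the parameter names are illustrative assumptions, not part of the data sheet.

```python
from enum import Enum


class ReadMissOutcome(Enum):
    WAIT = "no terminating signal sampled yet"
    SINGLE_READ = "non-cacheable single read (no Line Fill)"
    PIPELINED_LINE_FILL = "non-burst, pipelined Line Fill"
    NON_BURST_LINE_FILL = "non-burst, non-pipelined Line Fill"
    BURST_LINE_FILL = "burst Line Fill"


def classify_read_miss(*, sken: bool, sna_before_ready: bool,
                       sna_with_ready: bool, srdy: bool,
                       sbrdy: bool) -> ReadMissOutcome:
    """Classify the first cycle of a cacheable read miss.

    Each flag is True when the corresponding (active-low) signal was
    sampled active. sken refers to the sample taken one clock before
    the first SNA#, SRDY# or SBRDY#.
    """
    if not (sna_before_ready or srdy or sbrdy):
        return ReadMissOutcome.WAIT
    if not sken:
        return ReadMissOutcome.SINGLE_READ
    if sna_before_ready:
        return ReadMissOutcome.PIPELINED_LINE_FILL
    # With SNA# active, SBRDY# is treated as SRDY#.
    if srdy or sna_with_ready:
        return ReadMissOutcome.NON_BURST_LINE_FILL
    return ReadMissOutcome.BURST_LINE_FILL
```

The order of the checks mirrors the text: cacheability is decided first, then pipelining, and a burst is selected only when SBRDY# alone terminates the first read.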


If a system cannot support burst cycles, a non-burst
Line Fill must be requested by returning
SRDY# instead of SBRDY# in the first read cycle
(see Figure 6.9). Once a burst cycle has started, it
will not be aborted until it is completed, regardless
of whether SKEN# is sampled inactive or SHOLD is
sampled active, i.e. all four DWs will be read from
memory.

However, the system may abort a burst Line Fill
transfer before it is completed, by returning SRDY#
active (instead of SBRDY#) for the second or third
DW in a Line Fill transaction (see Figure 6.13). In this
case the cache will not be updated. The first DW will
already have been transferred to the CPU.


Note that in the last (fourth) bus cycle in a Line Fill
transfer, SBRDY# and SRDY# have the same effect
on the 82395DX: either one indicates the end of the
Line Fill. For all cycles that cannot run in burst mode
(non-Line Fill cycles or pipelined Line Fill cycles)
SBRDY# has the same effect on the 82395DX as
the normal SRDY# pin. SRDY# and SBRDY# are
the same apart from their function during burst
cycles.


The fastest burst cycle possible requires two clocks 
for the first data item to be returned to the 82395DX 
with subsequent data items returned every clock. 
Such a bus cycle is shown in Figure 6.11. An exam- 
ple of a burst cycle where two clocks are required 
for every burst item is shown in Figure 6.12. When 
initiating any read, the 82395DX presents the ad- 
dress for the data item requested. When the 
82395DX converts this cycle into a cache Line Fill, 
the first data item returned must correspond to the 
address sent out by the 82395DX. This address is 
the original address that is requested by the 386 DX 
Microprocessor. The 82395DX updates this address 
after each SBRDY # according to table 6.1 (SA2 and 
SA3 are updated). This is also true for non-burst 
Line Fill cycles. The 82395DX presents each re- 
quest for data in an order determined by the first 
address in the transfer. For example, if the first ad- 
dress was 104, the next three addresses in the burst 
will be 100, 10C, and 108. The burst order used by 
the 82395DX is shown in Table 6.1. This remains 
true whether the external system responds with a 
sequence of normal bus cycles or with a burst cycle. 
An example of the sequencing of burst addresses is 
shown in Figure 6.12. 


This order was designed to optimize the performance
of 64-bit memory systems. The second cycle
of a burst reads the DW that forms the other half of
an aligned 64-bit block, no matter whether that DW
is at a higher or lower address. The third and fourth
cycles then read the two DWs which form the other
half of an aligned 128-bit block. The order in which
the third and fourth DWs are accessed corresponds
to the order used for the first and second DWs.
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Table 6.1 - Line Fill Address Order 


First Address   Second Address   Third Address   Fourth Address
      0               4                8                C
      4               0                C                8
      8               C                0                4
      C               8                4                0
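
The address order described in the text (for example 104, 100, 10C, 108) follows from XORing the first DW offset with 0, 4, 8 and C in turn. The sketch below illustrates that observation; the function name `line_fill_order` is a hypothetical helper, not something defined by the data sheet.

```python
def line_fill_order(first_address: int) -> list[int]:
    """Return the four DW addresses of a Line Fill in the order they are read.

    first_address is the DW address originally requested by the CPU;
    its two low-order bits are ignored.
    """
    line_base = first_address & ~0xF   # start of the 16-byte line
    first_offset = first_address & 0xC  # SA3..SA2 of the first DW
    # Second DW: other half of the aligned 64-bit block (offset XOR 4).
    # Third and fourth DWs: other 64-bit half of the line, same order.
    return [line_base | (first_offset ^ step) for step in (0x0, 0x4, 0x8, 0xC)]
```

For instance, `[hex(a) for a in line_fill_order(0x104)]` yields `['0x104', '0x100', '0x10c', '0x108']`, matching the example given in the text.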


In the following cases, a Line Fill cycle will not
update the cache:

1. Aborted burst: a burst cycle will be aborted if
   SRDY# is returned active in the second or third
   bus cycle. The Line Fill will not resume, and the
   cache will not be updated.

2. Snoop hit to line buffer: if, during a Line Fill
   transfer, a snoop cycle is initiated after the first
   SRDY# or SBRDY#, and the address matches
   the address of the line being retrieved, the Line
   Fill cycle will continue as usual but the cache will
   not be updated.

3. FLUSH during Line Fill cycle: the Line Fill cycle
   will continue as usual, but the cache will not be
   updated.

Figures (6.9 - 6.13) include waveforms of several
cases of Line Fill cycles.

Figure 6.10 - Pipelined Line Fill

Figure 6.11 - Fastest Burst Cycle (one clock burst)

Figure 6.12 - Burst Read (2 clock burst)

Figure 6.13 - Interrupted Burst Read (2 clock burst)

6.2 82395DX Latency in System Bus Accesses
The 82395DX acts as a buffer between the 386 DX
Microprocessor and the main memory, causing some
latency in initiating the System Bus cycle (SADS#
delay from ADS#) and in completing the cycle (386
READYO# delay from SRDY# or SBRDY#). The
82395DX drives SADS# one clock after
ADS#. In cacheable cycles, the 82395DX starts
driving SADS# before it decides whether the
cycle is a cache hit or miss, since the hit/miss
decision is valid in the second clock (the first T2
cycle). If the cycle is a hit, the 82395DX deactivates
SADS#. This causes an undesirable glitch on the
SADS# signal, and it also causes an SADS# timing
incompatibility with the 386 DX Microprocessor, i.e.
the SADS# delay is slightly longer than the ADS#
delay. For proper system functionality, SADS# must
be sampled by the next clock edge.

At the end of a System Bus non-cacheable read
cycle or non-buffered write cycle, the 82395DX drives
READYO# active one clock after SRDY#. In a Line
Fill cycle, READYO# is activated one clock after the
first SBRDY# or SRDY# is sampled active. The
setup timing requirements of SRDY# and system data
force one wait state at the end of the cycle.


6.3 SHLDA Latency 


For non-LOCK#ed cycles the worst case delay
between SHOLD and SHLDA would be when SHOLD
is activated during the ST2P state, followed by a Line
Fill. In this case, the HOLD request will be
acknowledged only after the Line Fill is completed. In
LOCK#ed cycles SHLDA will not be asserted until
after SLOCK# is negated. The latency would be:

Latency = (Number of ST2P cycles) + (Number of
Line Fill cycles) OR (Number of LOCK#ed cycles)


6.4 Cache Consistency Support 


The 82395DX supports snooping using the
SEADS# mechanism. Besides ensuring consistency,
this mechanism provides multiprocessing support
by allowing the 82395DX System Bus and the
Local Bus to run concurrently.


The 82395DX will always float its address bus in the
clock immediately following the one in which
SAHOLD is received. Thus, no address hold
acknowledge is required. When the address bus is
floated, the rest of the 82395DX's System Bus will
remain active, so that data can be received from a bus
cycle that was already underway. Another bus cycle
will not begin, and the SADS# signal will not be
generated. However, multiple data transfers for burst
cycles can occur during address holds.


A companion input to SAHOLD, SEADS# indicates
that an external address is actually valid on the
address inputs of the 82395DX. When this signal is
activated, the 82395DX will read the external
address and perform an internal cache invalidation
cycle to the address indicated. The internal
invalidation cycle occurs one clock after SEADS# is
sampled active. In case of contention with a 386 DX
Microprocessor look up, the invalidation is serviced
two clocks after SEADS# was activated. The maximum
rate of invalidation cycles is one every other clock.
Multiple cache invalidations can occur in a single
address hold transfer. SEADS# is not masked by
SAHOLD inactive, so cache invalidations can occur
during a normal bus cycle. This also means that if
SEADS# is driven active when the 82395DX is driving
the address bus, the values that are being driven by
the 82395DX will be used for a cache invalidation
cycle.


If the 82395DX is running a Line Fill cycle and an
invalidation is driven into the 82395DX in the same
clock the first data is returned, or in any subsequent
clock, the 82395DX will invalidate that line even if it
is the same cache line that the 82395DX is currently
filling.


SAHOLD in pipelined cycles: The activation of
SAHOLD only causes the system address to be floated
in the next clock without changing the behavior of
pipelined cycles. If SAHOLD is activated before
entering the ST2P state, the 82395DX will move into
non-pipelined operation and drive SADS# only after
the deactivation of SAHOLD. However, if SAHOLD is
asserted in the ST2P state and the Nth cycle has
already started, the system address is floated but
SADS# is kept active until SRDY# (for the (N-1)th
cycle) is returned. It is the system designer's
responsibility to latch the address bus. Note that the
address driven on the System Bus after SAHOLD is
deasserted (in pipelined cycles) depends on whether
SNA# has been sampled active during the
SAHOLD state and another cycle is pending. As
seen from Figure 6.14, the (N+1)th address will be
driven by the 82395DX once SAHOLD has been
deactivated and SNA# has been sampled active,
provided there is a cycle pending in the 82395DX. The
following figures describe the 82395DX behavior in
two cases: first, when SNA# is sampled active, and
second, when SNA# is sampled inactive.


Figure 6.14 - SAHOLD Behavior in Pipelined Cycles: (A) SNA# sampled active; (B) SNA# sampled inactive (signals shown: CLK, SAHOLD, SADS#, SA(2-31), SRDY#, SNA#)

Note that the maximum rate of snooping cycles is
every other clock. The first clock edge in which
SEADS# is sampled active causes the 82395DX to
latch the system address bus and initiate a cache
invalidation cycle. If SEADS# is driven active for
more than one clock, only one snooping cycle will be
initiated, on the first clock edge at which SEADS# is
sampled active. The SA2-31 setup and hold timings
are specified to the same clock edge in which
SEADS# is sampled active.


6.5 Bus Deadlock Resolution Support 


In a multi-master system another bus master may
require the use of the bus to enable the 82395DX to
complete its current bus request. In this situation,
the 82395DX will float its entire System Bus until the
other bus master has completed its bus transaction.

The 82395DX will float its System Bus immediately
in response to the external system asserting the
Fast HOLD (SFHOLD#) signal. The only effect of
this signal being sampled active is forcing the
82395DX System Bus pins to float. It is the system
designer's responsibility to ensure that no 82395DX
cycle is prematurely terminated, and that no new
82395DX cycle is generated during Fast HOLD.
When SFHOLD# is deasserted, the System Bus
address, cycle definition and data are redriven by the
82395DX and the cycle is not restarted. SRDY# and
SBRDY# are not recognized during SFHOLD#
states: SFHOLD# asserted internally disables
SRDY# and SBRDY#.
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6.6 Arbitration Mechanism. 


As more than one device may be connected to the
shared System Bus, there is a need for arbitration
between the devices that wish to utilize the shared
resource. The 82395DX supplies the interface
signals to an external arbiter (either centralized or
distributed) to enable it to perform the task.

The 82395DX provides a normal bus SHOLD/
SHLDA handshake protocol, exactly as the 386 DX
Microprocessor does on the Local Bus. SHOLD is
used to indicate to the 82395DX that another bus
master desires control of the 82395DX System Bus.
Whenever the 82395DX completes its current bus
cycle (a full line transfer if the cycle is a Line Fill), or
sequence of LOCK#ed bus cycles, it will grant its
external bus to the requesting device by floating it
and by driving SHLDA active. The 82395DX will
relinquish its System Bus at the end of a bus cycle,
even if it has other cycles internally pending. As
soon as the 82395DX responds with SHLDA, it
tri-states all bus control and address outputs. Now, if
the System Bus is required by the 82395DX (on
behalf of a 386 DX Microprocessor request on the
Local Bus) but is not available, processing will cease.
Then the 82395DX will have to re-arbitrate on the
System Bus by driving SBREQ active.


Figure 6.15 - Multiple 82395DX Bus Arbitration Scheme
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The SBREQ output is activated whenever the 
82395DX has internally generated a bus cycle
request. It is deactivated immediately after the
82395DX asserts SADS# of the cycle. By examining
this signal, external logic can determine when the
82395DX requires the use of the System Bus and
intelligently arbitrate the System Bus among multiple
processors. This pin is always driven, regardless of
the state of bus hold (see Figure 6.16).

The SHOLD input has higher priority than the
pending request. In the case of LOCK#ed System Bus
cycles, SHOLD requests will not be acknowledged
until the LOCK#ed sequence ends. Another case is a
non-burst Line Fill, where SHOLD is acknowledged
only after the fourth DW has been read, even
though SHOLD was activated earlier.


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6.7 Next Near Cycles 


For all System Bus cycles, the 82395DX generates a
signal, SNENE#, to indicate that the current cycle is
in the same 2048 Byte area as the previous memory
cycle. Namely, it indicates that address lines
A11-A31 of the current System Bus memory cycle are
identical to address lines A11-A31 of the previous
memory cycle. This signal can be used by an
external DRAM system to run CAS# only cycles,
therefore increasing the throughput of the memory
system. SNENE# timing is identical to system address
timing, namely it is valid from SADS# active until
SRDY# or SBRDY# is sampled active (non-pipelined
cycles) or until SNA# is sampled active (pipelined
cycles). SNENE# is valid for all memory cycles,
and must be ignored in I/O and idle cycles.
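
The "same 2048 Byte area" test amounts to comparing address bits A11-A31 of consecutive memory cycles. The following is a minimal behavioral sketch of that comparison; the class name `NextNearTracker` and its methods are assumptions made for illustration and do not model signal timing.

```python
class NextNearTracker:
    """Tracks whether a memory cycle falls in the same 2 KByte area
    as the previous memory cycle (hypothetical model of SNENE#)."""

    def __init__(self) -> None:
        self._previous_area = None  # A31..A11 of the last memory cycle

    def memory_cycle(self, address: int) -> bool:
        """Return True if SNENE# would be active for this memory cycle."""
        area = address >> 11  # keep A11 and above, drop A0-A10
        is_near = area == self._previous_area
        self._previous_area = area
        return is_near

    def invalidate(self) -> None:
        """Forget the previous cycle so the next one reports inactive."""
        self._previous_area = None
```

A DRAM controller model could use the returned flag to decide between a CAS#-only access and a full RAS#/CAS# access. The `invalidate` method is one way to represent events after which the first memory cycle has SNENE# inactive.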


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Figure 6.16 - SHOLD/SHLDA/SBREQ Mechanism 
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After the 82395DX exits the SHOLD state, SNENE # 
is always inactive. SNENE# is always inactive in the
first memory cycle after a Halt/Shutdown cycle.

If SAHOLD is sampled active while the System Bus
is idle, the next 82395DX cycle will have SNENE#
inactive. If SAHOLD is sampled active while the
82395DX is running a System Bus cycle, SNENE#
will not change until the next SADS# is issued.
During SHLDA, SNENE# is floated and the first cycle
after SHLDA is deactivated will have SNENE#
inactive. SNENE# can run in the pipeline, the same as
the system address.


6.8 Write Buffer


The 82395DX is able to internally store up to four
write cycles (address, data and status information).
All those write cycles will run without wait states on
the Local Bus. They will run on the System Bus as
soon as the bus is available. In case of a write cycle
which cannot be stored since the buffer is full, the
386 DX Microprocessor will be forced to wait until
one of the buffered write cycles is completed.
READYO# is returned two CLKs after SRDY# or
SBRDY# is asserted if the write buffer is full. If the
write buffer is not full, READYO# is returned one
clock after SRDY# or SBRDY# is asserted.

All non-cacheable write cycles and LOCK#ed writes
are not buffered. In this case, the 82395DX will
activate READYO# after getting the SRDY# for the
non-buffered cycle.

The write buffer maintains the exact original order of
appearance of the Local Bus requests. It allows no
reordering and no bypassing of any sort.
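
The ordering and depth rules above behave like a plain four-entry FIFO. The sketch below is a simplified software model under that assumption; `WriteBufferModel` and its method names are hypothetical, and the model ignores READYO# timing and the case of a write that is already executing on the System Bus.

```python
from collections import deque


class WriteBufferModel:
    """Four-entry, strictly in-order posted write buffer (illustrative)."""

    DEPTH = 4  # four DW entries, as stated in the text

    def __init__(self) -> None:
        self._fifo = deque()

    def post(self, address: int, data: int) -> bool:
        """Try to buffer a write. Returns False when the buffer is full,
        which corresponds to the CPU being held in wait states."""
        if len(self._fifo) >= self.DEPTH:
            return False
        self._fifo.append((address, data))
        return True

    def retire(self):
        """Complete the oldest buffered write on the System Bus.
        Returns the (address, data) pair, or None if the buffer is empty."""
        return self._fifo.popleft() if self._fifo else None
```

Because entries leave only from the front of the queue, the model preserves the original order of the Local Bus requests and permits no bypassing.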


7.0 TESTABILITY FEATURES


This chapter discusses the requirements for properly
testing an 82395DX based system after power up
and during normal system operation.


7.1 SRAM Test Mode 


This mode is invoked by driving the FLUSH# pin
active for less than four clocks during normal
operation. SRAM test mode may only be invoked when
the 82395DX is in the idle state, namely there is no
cycle in progress and no cycle is pending in the
82395DX. The 82395DX exits this mode with a
subsequent activation of the FLUSH# pin for a minimum
of 1 clock cycle. If FLUSH# is activated for at least
eight clock cycles during SRAM test mode, the 82395DX
will flush its cache directory in addition to
terminating the SRAM test mode.


SRAM test mode is provided for system diagnostics
purposes. In this mode, the 82395DX cache and
cache directory are treated as a standard SRAM.
The 82395DXs in the system are mapped into
address space 256K-512K of the 386 DX Microprocessor
memory space, and allow the CPU non-cacheable,
non-buffered access to the rest of the memory
and address space. Each 82395DX occupies 32KB
of address space: 16KB for the cache and 16KB
(not fully utilized) for the TAGRAM. The 82395DX, in
SRAM mode, will recognize 387 DX Math
Coprocessor/Weitek 3167 Floating-Point Coprocessor
cycles and Local Bus cycles and handle them the same
as it does in its normal mode. This way, the CPU may
execute code that tests the 82395DX as a regular
memory component, with the only limitation that no
code or data may reside in the memory space
256K-512K during this mode. During SRAM test mode,
all accesses to memory space other than 256K-512K
are handled exactly as in normal mode with the
following exceptions:


1. All read cycles are non-cacheable: read hits are not serviced
from the cache and read misses don't cause Line Fills.

2. All write cycles are not buffered.

3. All write cycles do not update the cache.

4. Snooping is disabled.


The local address pins indicate the 82395DX internal addresses. The
partitioning is as follows:


• A16=0 selects the cache directory. A16=1 selects the cache.

• A15-14 select the "way".

• A12 and A13 select one 82395DX in a multi-82395DX system.

• A11-A4 are the set address.

• A3-2 select a DW in the line. Applicable in cache accesses
  (A16=1).
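The address partitioning above can be sketched as a small decoder. This is a minimal illustration under stated assumptions: the type and function names are hypothetical, the window limits are taken from the 256K-512K range quoted earlier, and the numeric value returned for the way and the 82395DX select is simply the raw value of the address bits, not a device or way label from the datasheet.

```python
from typing import NamedTuple

SRAM_WINDOW_START = 0x40000  # 256K
SRAM_WINDOW_END = 0x80000    # 512K (exclusive)


class SramTestAddress(NamedTuple):
    is_cache: bool   # A16: 1 = cache, 0 = cache directory (Tagram)
    way: int         # raw value of A15-14
    device: int      # raw value of A13-12 (which 82395DX)
    set_index: int   # A11-A4
    dword: int       # A3-2, meaningful for cache accesses only


def decode_sram_test_address(addr: int) -> SramTestAddress:
    """Split a SRAM test mode address into its bit fields (hypothetical helper)."""
    if not SRAM_WINDOW_START <= addr < SRAM_WINDOW_END:
        raise ValueError("address is outside the 256K-512K SRAM test window")
    return SramTestAddress(
        is_cache=bool((addr >> 16) & 0x1),
        way=(addr >> 14) & 0x3,
        device=(addr >> 12) & 0x3,
        set_index=(addr >> 4) & 0xFF,
        dword=(addr >> 2) & 0x3,
    )
```

For example, `decode_sram_test_address(0x5F000)` reports a cache access with both the way bits and the device bits equal to 3, which corresponds to the first start address listed in Table 7.1.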
The user can write to any byte in any line in a cache write cycle,
and can write to all the Tagram fields (25 bits) of one way in one
Tagram write cycle. The memory mapping of the SRAM test mode is
described in Table 7.1.


As can be seen from Table 7.1, the address space allocated for either
Tagram or Cache is 4096 (4K) addresses per way, per 82395DX. The
address allocation within each 4K segment is shown in Tables 7.2 and
7.3.


The data presented on the 82395DX local data pins is the SRAM data
input. The SRAM data output is also driven on the local data pins.
The BE(0-3)# pins indicate the bytes which must be written. During
SRAM test mode, all the AC specifications are met. Figures 7.1 and
7.2 depict the SRAM mode read and write cycles respectively. Note
that two wait states are inserted during SRAM test mode read cycles
and one wait state is inserted in write cycles. The system may extend
the number of wait states by gating READYO# for any number of clock
cycles (1 clock cycle in Figure 7.1, 0 clock cycles in Figure 7.2).




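Table 7.1 - SRAM Memory Map

Cache/Tagram | Way (A15-14) | 82395DX (A13-12) | Start Address
Cache  | 11 | 11 | 0005F000 h
Cache  | 11 | 10 | 0005E000 h
Cache  | 11 | 01 | 0005D000 h
Cache  | 11 | 00 | 0005C000 h
Cache  | 10 | 11 | 0005B000 h
Cache  | 10 | 10 | 0005A000 h
Cache  | 10 | 01 | 00059000 h
Cache  | 10 | 00 | 00058000 h
Cache  | 01 | 11 | 00057000 h
Cache  | 01 | 10 | 00056000 h
Cache  | 01 | 01 | 00055000 h
Cache  | 01 | 00 | 00054000 h
Cache  | 00 | 11 | 00053000 h
Cache  | 00 | 10 | 00052000 h
Cache  | 00 | 01 | 00051000 h
Cache  | 00 | 00 | 00050000 h
Tagram | 11 | 11 | 0004F000 h
Tagram | 11 | 10 | 0004E000 h
Tagram | 11 | 01 | 0004D000 h
Tagram | 11 | 00 | 0004C000 h
Tagram | 10 | 11 | 0004B000 h
Tagram | 10 | 10 | 0004A000 h
Tagram | 10 | 01 | 00049000 h
Tagram | 10 | 00 | 00048000 h
Tagram | 01 | 11 | 00047000 h
Tagram | 01 | 10 | 00046000 h
Tagram | 01 | 01 | 00045000 h
Tagram | 01 | 00 | 00044000 h
Tagram | 00 | 11 | 00043000 h
Tagram | 00 | 10 | 00042000 h
Tagram | 00 | 01 | 00041000 h
Tagram | 00 | 00 | 00040000 h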
As can be seen from Tables 7.2 and 7.3, the address space allocated
for either Tagram or Cache is 4096 (4K) addresses per way, per
82395DX. The address allocation within each 4K segment is shown in
Table 7.2 for the Cache and Table 7.3 for the Tagram.


Table 7.2 - Cache Address Allocation 


Start Address 


Double Word format in Tagram read/write:

Bits 31-25 | Bits 24-22 | Bit 21 | Bits 20-1 | Bit 0
0000000    | LRU        | V      | TAG       | WP

V = TAG Valid bit

WP = Write Protect bit

"0" = Indicates don't care bits. Writing to these bits will have no
effect. When reading the Tagram these bits will have a value of 0.
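A short sketch of packing and unpacking a Tagram double word may help when writing SRAM test code. It assumes the field positions read from the bit ruler above (LRU in bits 24-22, V in bit 21, TAG in bits 20-1, WP in bit 0, 25 bits in total); the function and key names are hypothetical and are not taken from the datasheet.

```python
LRU_SHIFT, LRU_MASK = 22, 0x7       # bits 24-22 (assumed position)
V_SHIFT = 21                        # bit 21
TAG_SHIFT, TAG_MASK = 1, 0xFFFFF    # bits 20-1
WP_SHIFT = 0                        # bit 0
FIELD_MASK = 0x01FFFFFF             # bits 31-25 are don't care and read as 0


def pack_tagram(lru: int, valid: bool, tag: int, write_protect: bool) -> int:
    """Build the 32-bit value for one Tagram write cycle."""
    if not 0 <= lru <= LRU_MASK:
        raise ValueError("LRU field is 3 bits wide")
    if not 0 <= tag <= TAG_MASK:
        raise ValueError("TAG field is 20 bits wide")
    return ((lru << LRU_SHIFT) | (int(valid) << V_SHIFT)
            | (tag << TAG_SHIFT) | (int(write_protect) << WP_SHIFT))


def unpack_tagram(dword: int) -> dict:
    """Split a value read from the Tagram into its fields."""
    dword &= FIELD_MASK  # ignore the don't care bits
    return {
        "lru": (dword >> LRU_SHIFT) & LRU_MASK,
        "valid": bool((dword >> V_SHIFT) & 1),
        "tag": (dword >> TAG_SHIFT) & TAG_MASK,
        "write_protect": bool((dword >> WP_SHIFT) & 1),
    }
```

A test routine would typically write a packed pattern to a Tagram address, read it back, and compare the unpacked fields, since the upper seven bits always read as 0.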
NOTE:
In Tagram accesses, BE0#-BE3# are ignored in both read and write
cycles.


The data presented on the 82395DX D0-D31 pins is the SRAM data input
for write cycles and is also the SRAM data output for read cycles
during the SRAM test mode. The BE3#-BE0# pins indicate the bytes
which will be written to. During SRAM test mode all the AC
specifications are met. Figures 7.1 and 7.2 depict the SRAM test mode
read and write cycles respectively. The system may extend the number
of wait states by gating READYO# for any number of clock cycles (one
clock cycle in Figure 7.1, zero in Figure 7.2).


[Figure: SRAM test mode timing diagram. Signals shown: CLK, ADS#,
W/R#, A(2-31), D(0-31), READYO#, READYI#.]
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Figure 7.2 - SRAM Mode Write Cycle 
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7.2 Tristate Output Test Mode 


The 82395DX provides the option of isolating itself from other
devices on the board for system debugging, by floating all its
outputs. Output tristate mode is invoked by driving the SAHOLD and
FLUSH# pins active during RESET. The 82395DX will remain in this mode
after RESET is deactivated, if the SAHOLD and FLUSH# pins are sampled
active in the CLK2 prior to RESET going low (see Figure 7.3). The
82395DX exits this mode with the next activation of RESET with SAHOLD
or FLUSH# driven inactive.

[Figure 7.3 signals: CLK2, RESET, SAHOLD, FLUSH#, OUTPUTS.]
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Figure 7.3 - Entering the Tristate Test Mode 
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8.0 MECHANICAL DATA 


8.1 Introduction 


This chapter discusses the physical package and its connections.


8.2 Pin Assignment 


The 82395DX pinout as viewed from the top side of the component is
shown in Figure 0.1. Vcc and Vss connections must be made to multiple
Vcc and Vss (GND) planes. Each Vcc and Vss pin must be connected to
the appropriate voltage level. The circuit board must contain Vcc and
Vss (GND) planes for power distribution and all Vcc and Vss pins must
be connected to the appropriate planes.


8.3 Package Dimensions and 
Mounting 


The 82395DX package is a 196 lead plastic quad flat pack (PQFP). For
information on dimensions refer to Table 8.1 and Figures 8.1-8.3.
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8.4 Package Thermal Specification 


The 82395DX is specified for operation when the case temperature is
within the range of 0-85 °C. The case temperature may be measured in
any environment, to determine whether the 82395DX is within the
specified operating range. The case temperature must be measured at
the center of the top surface opposite the pins.


196 Pin PQFP Package Key Attributes:

Electrical:
  L  6-20 nH (lead)
  L  3-6 nH (Vcc/Vss)
  C  <2.3 pF (Loading)
  C  <1.6 pF (ld/ld)
  C  130-200 nH (Vcc/Vss)

Thermal:
  θja  24 °C/W @2W
  θjc  5 °C/W @2W

Lead Stiffness:
  In-Plane    17 gm/mil
  Transverse  18 gm/mil


Thermal characterization of the 196 lead PQFP 
package yielded the information contained in Fig- 
ures 8.4-8.6. 
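The thermal resistances listed under Key Attributes can be related to temperatures with the usual steady-state relations Tj = Tc + P × θjc and Tc = Ta + P × (θja − θjc). The sketch below is a worked illustration only: the function names are hypothetical, the default resistances are the 2W characterization values quoted above, and the power figure passed in is an assumption of the caller, since actual dissipation depends on Icc and operating frequency and θja varies with air flow (Figure 8.6).

```python
def junction_temperature(t_case_c: float, power_w: float,
                         theta_jc: float = 5.0) -> float:
    """Junction temperature (°C) from case temperature: Tj = Tc + P * θjc."""
    return t_case_c + power_w * theta_jc


def max_ambient_temperature(t_case_max_c: float, power_w: float,
                            theta_ja: float = 24.0,
                            theta_jc: float = 5.0) -> float:
    """Highest ambient (°C) that keeps the case at or below t_case_max_c.

    Uses Tc = Ta + P * (θja - θjc), i.e. the case-to-ambient resistance.
    """
    return t_case_max_c - power_w * (theta_ja - theta_jc)
```

With the case at its 85 °C limit and an assumed 2W dissipation, the first function gives a 95 °C junction, and the second gives a 47 °C maximum ambient for the quoted θja.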
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NOTES: 
1. Interpret dimensions and tolerances in accordance with ANSI Y14.5M-1982. 
2. Data enclosed in parentheses is for reference only. 


Figure 8.1 - Principal Dimensions and Data 
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Figure 8.2 - Typical Lead 


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Figure 8.3 - Detail C 


Table 8.1 - Symbol List and Dimensions for
196 Lead Plastic Quad Flat Pack Package

Symbol | Description | Dimensions
       | The distance from the seating plane to the highest point of body. |
       | Standoff: The distance from the seating plane to the base plane. |
       | Overall Package Dimension: Lead tip to lead tip. |
D1, E1 | Plastic Body Dimension | 1.347, 1.35
       | Bumper Distance (Without FLASH / With FLASH) |

NOTES:
1. All dimensions and tolerances conform to ANSI Y14.5M-1982.
2. Dimensions are in inches.
3. Data enclosed in parentheses is for reference only.
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Figure 8.5 - Junction to Case Thermal Resistance vs Power 
Figure 8.6 - Junction to Ambient Thermal Resistance vs Air Flow Rate


9.0 ELECTRICAL DATA


This chapter presents the A.C. and D.C. specifications for the
82395DX.


9.1 Power and Grounding 


The 82395DX has a high clock frequency and 108 output buffers which
can cause power surges as multiple output buffers drive new signal
levels simultaneously. For clean on-chip power distribution at high
frequency, 15 Vcc and 17 Vss pins separately feed power to the
functional units of the 82395DX.


Power and ground connections must be made to all external Vcc and
Vss pins of the 82395DX. On the circuit board, all Vcc pins must be
connected on a Vcc plane and all Vss pins must be connected on a GND
plane.


9.1.1 POWER DECOUPLING 
RECOMMENDATIONS 


Liberal decoupling capacitors must be placed near the 82395DX. The
82395DX driving its 32 bit data buses and 30 bit system address bus
at high frequency can cause transient power surges, particularly
when driving large capacitive loads. Low inductance capacitors and
interconnects are recommended for the best high frequency electrical
performance. Inductance can be reduced by shortening circuit board
traces between the 82395DX and the decoupling capacitors as much as
possible.


9.1.2 RESISTOR RECOMMENDATIONS


The 82395DX does not have any internal pullup re- 
sistors. All unused inputs must be tied externally to a 
solid logic level. The outputs that require external 
pullup resistors are listed in table 9.1. A particular 
designer may have reason to adjust the resistor val- 
ues recommended here, or alter the use of pull-up 
resistors in other ways. 
9.2 Absolute Maximum Ratings

Storage Temperature ................ -65°C to 150°C
Case Temperature Under Bias ........ -65°C to 110°C
Supply Voltage with Respect to Vss . -0.5V to 6.5V
Voltage on Other Pins .............. -0.5V to Vcc + 0.5V

NOTICE: This data sheet contains information on products in the
sampling and initial production phases of development. The
specifications are subject to change.

*WARNING: Stressing the device beyond the "Absolute Maximum Ratings"
may cause permanent damage. These are stress ratings only. Operation
beyond the "Operating Conditions" is not recommended and extended
exposure beyond the "Operating Conditions" may affect device
reliability.


Table 9.1 - Pullup Resistor Recommendations

Signal | Pullup Value | Purpose
READYO# | 20K Ohms ±10% | Lightly pull READYO# inactive in multi-82395DX systems. Allows the selected 82395DX to drive READYO# while it is inactive for the others.
SADS# | 20K Ohms ±10% | Lightly pull SADS# inactive in multi-82395DX systems. Allows the selected 82395DX to drive SADS# while it is inactive for the others.
SLOCK# | 20K Ohms ±10% | Lightly pull SLOCK# inactive for 82395DX SHOLD states.


9.3 DC SPECIFICATIONS Tcase = 0°C to +85°C, Vcc = 5V ±5%

Table 9.2 - DC Specifications

Symbol | Parameter | Min | Max | Unit | Test Conditions
VIL  | Input Low Voltage | | | |
VIH  | Input High Voltage | | | |
VCIL | CMOS Input Low | | | | See Note 6
VCIH | CMOS Input High | | | | See Note 6
VOL  | Output Low Voltage | | | |
VOH  | Output High Voltage | | | | See Note 2
VCOL | CMOS Output Low Voltage | | | |
VCOH | CMOS Output High Voltage | Vcc-0.45 | | |
ILI  | Input Leakage | | | |
ILO  | Output Leakage | | | |
CIN  | Cap. Input | | | pF |
ICC  | Power Supply Current (33MHz, 25MHz, 20MHz) | | | | See Note 3

NOTES:
1. This parameter is measured at IOL = 4mA for all the outputs.
2. This parameter is measured at IOH = 1mA for all the outputs.
3. Measured with inputs driven to CMOS levels, Vcc = 5.25V, TA = 0°C, using a typical pattern consisting of 33% read,
write and idle cycles.
4. CLK2 input capacitance is 20pF.
5. No activity on the Local/System Bus.
6. Applies to CLK2, READYO# inputs.
7. Applies to READYO# output.
9.4 AC Characteristics


Some of the 82395DX AC parameters are clock-frequency dependent.
Thus, while the part functions properly over the entire frequency
range specified by the t1 spec, the AC parameters are guaranteed at
three distinct frequencies only: 20MHz, 25MHz and 33MHz. Note that,
for example, when a 33MHz part operates at 25MHz CLK frequency, the
AC parameters under the "25MHz" column must be used.


• Functional operating range: Vcc = 5V ±5%, Tcase = 0°C to +85°C.

• All AC parameters are measured relative to 1.5V for falling and
  rising edges; CLK2 is measured at 2V.

• All outputs are tested at a 50pF load. In case of overloaded
  signals, the derating factor is 1ns for every extra 25pF load.

• All parameters are referred to PHI1 unless otherwise noted.

• The reference figure for CLK2 parameters and AC measurement levels
  is Figure 9.1; for RESET and internal phase it is Figure 3.2.
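The load derating rule in the list above (1ns for every extra 25pF beyond the 50pF test load) can be applied as in the following sketch. The function name is hypothetical, and treating the rule as linear between 25pF steps is an assumption made here for illustration; for heavily loaded designs the capacitive derating curve in Figure 9.6 is the reference.

```python
def derated_output_delay_ns(spec_delay_ns: float, load_pf: float,
                            test_load_pf: float = 50.0,
                            ns_per_25pf: float = 1.0) -> float:
    """Estimate an output delay at a load heavier than the 50pF test load.

    Loads at or below the test load are given the specified delay unchanged.
    """
    extra_pf = max(0.0, load_pf - test_load_pf)
    return spec_delay_ns + ns_per_25pf * extra_pf / 25.0
```

For instance, a hypothetical 20ns valid delay specified at 50pF would be estimated at 22ns when the signal drives 100pF.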


9.4.1 TIMING CONSIDERATIONS FOR CACHE
EXTENSIONS


The values listed in Tables 9.3 and 9.4 for the AC parameters are
valid for a design using one 82395DX with its 16KB cache or two
82395DXs to extend the cache size to 32KB. For a design using four
82395DXs to extend the cache size to 64KB, some timing adjustments
must be made due to the increased capacitive load on the signal
traces. The capacitive derating curve (see Figure 9.6) must be used
to accurately determine the impact on AC timings.


[Figure 9.1 waveform: CLK2 drive levels (Vcc-0.8V, 0.8V) with
intervals t2, t3a, t4a and t5; Valid Output n and Valid Output n+1.]

Legend:

A - Maximum Output Delay
B - Minimum Output Delay
C - Minimum Input Setup Time
D - Minimum Input Hold Time


Figure 9.1 - Drive Levels and Measurement
Points for AC Specifications
9.4.2 AC CHARACTERISTICS TABLES Tcase = 0°C to 85°C, Vcc = 5V ±5%

Table 9.3 - Local Bus Signal AC Parameters

Symbol | Parameter | Notes
t1   | Operating Frequency | 15.4-25 MHz, 15.4-33 MHz; Internal CLK
t2   | CLK2 Period |
t3a  | CLK2 High Time | Measured at 2V
t3b  | CLK2 High Time |
t4a  | CLK2 Low Time | Measured at 2V
t4b  | CLK2 Low Time |
t5   | CLK2 Fall Time |
t6   | CLK2 Rise Time |
t7a  | A2-A31 Setup Time |
t7b  | LOCK# Setup Time |
t7c  | BE0-3# Setup Time |
t8   | A2-A31, BE0-3#, LOCK# Hold Time |
t9a  | M/IO#, D/C#, W/R# Setup Time |
t9b  | ADS# Setup Time |
t10  | M/IO#, D/C#, W/R#, ADS# Hold Time |
t11  | READYI# Setup Time |
t12  | READYI# Hold Time |
t13  | LBA#, NPI# Setup Time |
t14  | RESET Setup Time |
t15a | LBA#, NPI# Hold Time |
t15b | RESET Hold Time |
t16  | D0-31 Setup Time |
t17  | D0-31 Hold Time |
t18  | D0-31 Valid Delay |
t19  | D0-31 Float Delay |
t20  | READYO# Valid Delay |
t21  | READYO# Float Delay |
t22  | READYO# Setup Time |
t23  | READYO# Hold Time |
t24a | CONF# Setup Time |
t24b | CONF# Setup Time |
t25a | CONF# Hold Time |
t25b | CONF# Hold Time |
Table 9.4 - System Bus Signal AC Parameters

Symbol | Parameter | Notes
t31  | SA2-31, SBE0-3#, SLOCK#, SD/C#, SW/R#, SM/IO# Valid Delay |
t32  | SA2-31, SBE0-3#, SLOCK#, SD/C#, SW/R#, SM/IO# Float Delay | Note 5
     | SBLAST#, SHLDA, SBREQ, SNENE# Valid Delay |
     | SD0-31 Write Data Valid Delay | Note 4
     | SD0-31 Float Delay | Notes 4, 5
     | SA4-31 Setup Time |
     | SD0-31 Read Setup Time |
     | SD0-31 Read Hold Time |
     | SNA# Setup Time | Note 3
     | SBLAST#, SNENE# Float Delay |
t44  | SHOLD, SKEN#, SWP#, SFHOLD#, SAHOLD Hold Time |
t45a | SEADS# Setup Time |
t45b | SRDY#, SBRDY# Setup Time |
     | SEADS#, SRDY#, SBRDY# Hold Time |
     | SA4-31 Hold Time |

NOTES:
1. Tf is measured at 3.7V to 0.8V. Tf is not 100% tested.
2. Tr is measured at 0.8V to 3.7V. Tr is not 100% tested.
3. The specification is relative to PHI2, i.e. signal sampled by PHI2.
4. The specification is relative to PHI2, i.e. signal driven by PHI2.
5. Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not 100%
tested.
6. The signal is allowed to be asynchronous to CLK2. The setup and hold specifications are given for testing purposes, to
assure recognition within a specific CLK2 period.
7. The signal is not sampled. It must be valid through the entire cycle (as the Address lines).
8. When tested as the second 82395DX.
9. When tested as the third 82395DX.
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Figure 9.2 - AC Timing Waveforms - Local Bus 
input Setup and Hold Timing 
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Figure 9.3 - AC Timing Waveforms - System Bus 
Figure 9.4 - AC Timing Waveforms - Output Valid Delay
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Figure 9.5 - AC Timing Waveforms - Output Float Delays 
Figure 9.6 - Typical Output Valid Delay vs Load Capacitance
at Maximum Operating Temperature (CL = 50pF)
82395DX
APPENDIX A

Term | Definition
AC    | Alternating Current
      | Aborted Line Fill
      | Cache Directory FLUSH
      | Cache Directory Lookup
      | Cache Directory SNOOP
      | Testability Access
      | Cache Directory Update
      | Cache Read
      | Cache Write
      | Cache Update
      | Testability Access
CPU   | Central Processing Unit
CHMOS | Complimentary High Performance Metal Oxide Semiconductor
      | Cache Read Hit
      | Cache Read Miss
      | Cache Write Hit
DC    | Direct Current
DRAM  | Dynamic Random Access Memory
DMA   | Direct Memory Access
DW    | Double Word
GND   | Ground
I/O   | Input/Output
LB    | Local Bus
LBA   | Local Bus Access
LRU   | Least Recently Used
PQFP  | Plastic Quad Flat Pack
RAM   | Random Access Memory
SB    | System Bus
TV    | Tag Valid
WP    | Write Protect
xxK   | xx thousand
xxKB  | xx K Bytes
xxGB  | xx Giga Bytes
xWS   | xx Wait States
T1    | Local Bus State
T2    | Local Bus State
TI    | Local Bus State
TH    | Local Bus State
ST1   | System Bus State
ST1P  | System Bus State
ST2   | System Bus State
ST2P  | System Bus State
STI   | System Bus State
STH   | System Bus State
PHI1  | 1st CLK2 cycle in a 2 CLK2 CLK cycle
PHI2  | 2nd CLK2 cycle in a 2 CLK2 CLK cycle
°C    | Celsius
V     | Volts
µA    | 10^-6 Amps
mA    | 10^-3 Amps
pF    | 10^-12 Farads
MHz   | 10^6 Hertz
ns    | 10^-9 seconds
HIGH PERFORMANCE
32-BIT CACHE CONTROLLER

■ Improves 386™ DX System Performance
  - Reduces Average CPU Wait States to Nearly Zero
  - Zero Wait State Read Hit
  - Zero Wait State Posted Memory Writes
  - Allows Other Masters to Access the System Bus More Readily
■ Hit Rates up to 99%
■ Optimized as 386 DX Companion
  - Simple 386 DX Interface
  - Part of 386 DX-Based Compute Engine Including 387™ DX Math
    Coprocessor and 82380 Integrated System Peripheral
■ Software Transparent
■ Synchronous Dual Bus Architecture
  - Bus Watching Maintains Cache Coherency
■ Maps Full 386 DX Address Space (4 Gigabytes)
■ Flexible Cache Mapping Policies
  - Direct Mapped or 2-Way Set Associative Cache Organization
  - Supports Non-Cacheable Memory Space
  - Unified Cache for Code and Data
■ Integrates Cache Directory and Cache Management Logic
■ High Speed CHMOS* IV Technology
■ 132-Pin PGA Package
■ 132-Lead Plastic Quad Flat Pack (PQFP)

The 82385 Cache Controller is a high performance 32-bit peripheral for the Intel386 Microprocessor. It stores
a copy of frequently accessed code and data from main memory in a zero wait state local cache memory. The 
82385 enables the 386 DX to run at its full potential by reducing the average number of CPU wait states to 
nearly zero. The dual bus architecture of the 82385 allows other masters to access system resources while the 
386 DX operates locally out of its cache. In this situation, the 82385’s ‘‘bus watching” mechanism preserves 
cache coherency by monitoring the system bus address lines at no cost to system or local throughput. 


The 82385 is completely software transparent, protecting the integrity of system software. High performance 
and board savings are achieved because the 82385 integrates a cache directory and all cache management 
logic on one chip. 


[Figure: 82385 Internal Block Diagram. Labeled blocks: 386 DX local
bus interface (address bus, local bus control, local bus decodes),
processor interface, cache directory, cache control (cache control
bus), snoop bus, 82385 local bus interface (local bus control, bus
arbitration), internal control bus, 82385 configuration.]
*CHMOS is a patented process of Intel Corporation. 
Intel386™, 386™ DX, 387™ DX are trademarks of Intel Corporation. 
October 1990 
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1.0 82385 FUNCTIONAL OVERVIEW 


The 82385 Cache Controller is a high performance 32-bit peripheral
for the Intel386 microprocessor. This chapter provides an overview
of the 82385, and of the basic architecture and operation of a
386 DX CPU/82385 system.


1.1 82385 OVERVIEW 


The main function of a cache memory system is to provide fast local
storage for frequently accessed code and data. The cache system
intercepts 386 DX memory references to see if the required data
resides in the cache. If the data resides in the cache (a hit), it is
returned to the 386 DX without incurring wait states. If the data is
not cached (a miss), the reference is forwarded to the system and the
data retrieved from main memory. An efficient cache will yield a high
"hit rate" (the ratio of cache hits to total 386 DX accesses), such
that the majority of accesses are serviced with zero wait states. The
net effect is that the wait states incurred in a relatively
infrequent miss are averaged over a large number of accesses,
resulting in an average of nearly zero wait states per access. Since
cache hits are serviced locally, a processor operating out of its
local cache has a much lower "bus utilization" which reduces system
bus bandwidth requirements, making more bandwidth available to other
bus masters.
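The averaging argument above can be made concrete with a small calculation. The sketch below is illustrative only: the function name is hypothetical, it assumes every hit costs zero wait states and every miss costs the same fixed number of wait states, and the example figures are not taken from this data sheet.

```python
def average_wait_states(hits: int, total_accesses: int,
                        miss_wait_states: int) -> float:
    """Average wait states per access when hits cost zero wait states."""
    if total_accesses <= 0 or not 0 <= hits <= total_accesses:
        raise ValueError("need 0 <= hits <= total_accesses and total > 0")
    misses = total_accesses - hits
    return misses * miss_wait_states / total_accesses
```

With an assumed 95 hits per 100 accesses and 4 wait states per miss, the average is 0.2 wait states per access, which is the sense in which a high hit rate brings the average close to zero.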


The 82385 Cache Controller integrates a cache directory and all
cache management logic required to support an external 32 Kbyte
cache. The cache directory structure is such that the entire physical
address range of the 386 DX (4 Gigabytes) is mapped into the cache.
Provision is made to allow areas of memory to be set aside as
non-cacheable. The user has two cache organization options: direct
mapped and 2-way set associative. Both provide the high hit rates
necessary to make a large, relatively slow main memory array look
like a fast, zero wait state memory to the 386 DX.


1.2 SYSTEM OVERVIEW I:
BUS STRUCTURE


A good grasp of the bus structure of a 386 DX CPU/82385 system is
essential in understanding both the 82385 and its role in a 386 DX
system. The following is a description of this structure.
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Figure 1-1. 386 DX System Bus Structure 
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1.2.1 386 DX Local Bus/82385 Local 
Bus/System Bus 


Figure 1-1 depicts the bus structure of a typical 386 
DX system. The “386 DX Local Bus” consists of the 
physical 386 DX address, data, and control busses. 
The local address and data busses are buffered 
and/or latched to become the “system” address 
and data busses. The local control bus is decoded 
by bus control logic to generate the various system 
bus read and write commands. 


The addition of an 82385 Cache Controller causes a separation of the
386 DX bus into two distinct busses: the actual 386 DX local bus and
the "82385 Local Bus" (Figure 1-2). The 82385 local bus is designed
to look like the front end of a 386 DX by providing 82385 local bus
equivalents to all appropriate 386 DX signals. The system ties to
this "386 DX-like" front end just as it would to an actual 386 DX.
The 386 DX simply sees a fast system bus, and the system sees a
386 DX front end with low bus bandwidth requirements. The cache
subsystem is transparent to both. Note that the 82385 local bus is
not simply a buffered version of the 386 DX bus, but rather is
distinct from, and able to operate in parallel with, the 386 DX bus.
Other masters residing on either the 82385 local bus or system bus
are free to manage system resources while the 386 DX operates out of
its cache.
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1.2.2 Bus Arbitration 


The 82385 presents the “386 DX-like” interface
which is called the 82385 local bus. Whereas the 
386 DX provides a Hold Request/Hold Acknowl- 
edge bus arbitration mechanism via its HOLD and 
HLDA pins, the 82385 provides an equivalent mech- 
anism via its BHOLD and BHLDA pins. (These sig- 
nals are described in Section 3.7.) When another 
master requests the 82385 local bus, it issues the 
request to the 82385 via BHOLD. Typically, at the 
end of the current 82385 local bus cycle, the 82385 
will release the 82385 local bus and acknowledge 
the request via BHLDA. The 386 DX is of course free 
to continue operating on the 386 DX local bus while 
another master owns the 82385 local bus. 


1.2.3 Master/Slave Operation 


The above 82385 local bus arbitration discussion is true when the
82385 is programmed for "Master" mode operation. The user can,
however, configure the 82385 for "Slave" mode operation.
(Programming is done via a hardware strap option.) The roles of
BHOLD and BHLDA are reversed for an 82385 in slave mode; BHOLD is now
an output indicating a request to control the bus, and BHLDA is an
input indicating that a request has been granted. An 82385 programmed
in slave mode drives the 82385 local bus only when it has requested
and subsequently been granted bus control. This allows multiple
386 DX CPU/82385 subsystems to reside on the same 82385 local bus
(Figure 1-3).
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Figure 1-2. 386™ DX CPU/82385 System Bus Structure 
Figure 1-3. Multi-Master/Multi-Cache Environment


1.2.4 Cache Coherency 


Ideally, a cache contains a copy of the most heavily used portions of
main memory. To maintain cache "coherency" is to make sure that this
local copy is identical to main memory. In a system where multiple
masters can access the same memory, there is always a risk that one
master will alter the contents of a memory location that is
duplicated in the local cache of another master. (The cache is said
to contain "stale" data.) One rather restrictive solution is to not
allow cache subsystems to cache shared memory. Another simple
solution is to flush the cache anytime another master writes to
system memory. However, this can seriously degrade system
performance as excessive cache flushing will reduce the hit rate of
what may otherwise be a highly efficient cache.

[Figure 1-4 labels: CACHE, 82385, DATA BUFFER, SNOOP BUS (system
address bus, write cycle indicator).]


The 82385 preserves cache coherency via "bus watching" (also called
snooping), a technique that neither impacts performance nor restricts
memory mapping. An 82385 that is not currently bus master monitors
system bus cycles, and when a write cycle by another master is
detected (a snoop), the system address is sampled and used to see if
the referenced location is duplicated in the cache. If so (a snoop
hit), the corresponding cache entry is invalidated, which will force
the 386 DX to fetch the up-to-date data from main memory the next
time it accesses this modified location. Figure 1-4 depicts the
general form of bus watching.
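The snoop-and-invalidate behavior can be illustrated with a toy software model. This is a deliberately simplified sketch, not the 82385 directory: the class and method names are hypothetical, it models a direct mapped cache with one tag per 4-byte line and a 32 Kbyte data array, and the real directory organization is the one described in Chapter 2.

```python
class ToySnoopingCache:
    """Toy direct mapped cache directory with snoop invalidation."""

    LINE_BYTES = 4                       # assumed: one doubleword per line
    NUM_LINES = 32 * 1024 // LINE_BYTES  # assumed: 32 Kbyte data array

    def __init__(self) -> None:
        self.tags = {}  # line index -> tag, present only for valid lines

    def _split(self, addr: int):
        line = addr // self.LINE_BYTES
        return line % self.NUM_LINES, line // self.NUM_LINES

    def read(self, addr: int) -> bool:
        """CPU read: return True on a hit; a miss fills and validates the line."""
        index, tag = self._split(addr)
        if self.tags.get(index) == tag:
            return True
        self.tags[index] = tag
        return False

    def snoop_write(self, addr: int) -> bool:
        """Write by another master: invalidate the line on a snoop hit."""
        index, tag = self._split(addr)
        if self.tags.get(index) == tag:
            del self.tags[index]
            return True
        return False
```

In this model a read that follows a snoop hit on the same line misses again, so the CPU refetches the updated data from main memory, while snoops to other addresses leave the cached line untouched.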
1.3 SYSTEM OVERVIEW II:
BASIC OPERATION


This discussion is an overview of the basic operation of a 386 DX
CPU/82385 system. Items discussed include the 82385's response to
all 386 DX cycles, including interrupt acknowledges, halts, and
shutdowns. Also discussed are non-cacheable and local accesses.


1.3.1 386 DX Memory Code and Data 
Read Cycles 


1.3.1.1 READ HITS 


When the 386 DX initiates a memory code or data read cycle, the 82385
compares the high order bits of the 386 DX address bus with the
appropriate addresses (tags) stored in its on-chip directory. (The
directory structure is described in Chapter 2.) If the 82385
determines that the requested data is in the cache, it issues the
appropriate control signals that direct the cache to drive the
requested data onto the 386 DX data bus, where it is read by the
386 DX. The 82385 terminates the 386 DX cycle without inserting any
wait states.


1.3.1.2 READ MISSES 


If the 82385 determines that the requested data is 
not in the cache, the request is forwarded to the 
82385 local bus and the data retrieved from main 
memory. As the data returns from main memory, it is 
directed to the 386 DX and also written into the 
cache. Concurrently, the 82385 updates the cache 
directory such that the next time this particular piece 
of information is requested by the 386 DX, the 
82385 will find it in the cache and return it with zero 
wait states. 


The basic unit of transfer between main memory and 
cache memory in a cache subsystem is called the 
line size. In an 82385 system, the line size is one 32- 
bit aligned doubleword. During a read miss, all four 
82385 local bus byte enables are active. This en- 
sures that a full 32-bit entry is written into the cache. 
(The 386 DX simply ignores what it did not request.) 
In any other type of 386 DX cycle that is forwarded 
to the 82385 local bus, the logic levels of the 386 DX 
byte enables are duplicated on the 82385 local bus. 


The 82385 does not actively fetch main memory 
data independently of the 386 DX. The 82385 is es- 
sentially a passive device which only monitors the 
address bus and activates control signals. The read 
miss is the only mechanism by which main memory 
data is copied into the cache and validated in the 
cache directory. 


In an isolated read miss, the number of wait states seen by the
386 DX is that required by the system memory to respond with data
plus the cache comparison cycle (hit/miss decision). The cache system
must determine that the cycle is a miss before it can begin the
system memory access. However, since misses most often occur
consecutively, the 82385 will begin 386 DX address pipelined cycles
to effectively "hide" the comparison cycle beyond the first miss
(refer to Section 4.1.3).


The 82385 can execute a main memory access on the 82385 local bus
only if it currently owns the bus. If not, an 82385 in master mode
will run the cycle after the current master releases the bus. An
82385 in slave mode will issue a hold request, and will run the cycle
as soon as the request is acknowledged. (This is true for any read or
write cycle that needs to run on the 82385 local bus.)


1.3.2 386 DX Memory Write Cycles 


The 82385's "posted write" capability allows the majority of 386 DX
memory write cycles to run with zero wait states. The primary memory
update policy implemented in a posted write is the traditional cache
"write through" technique, which implies that main memory is always
updated in any memory write cycle. If the referenced location also
happens to reside in the cache (a write hit), the cache is updated as
well.


Beyond this, a posted write latches the 386 DX ad- 
dress, data, and cycle definition signals, and the 386 
DX local bus cycle is terminated without any wait 
states, even though the corresponding 82385 local 
bus cycle is not yet completed, or perhaps not even 
started. A posted write is possible because the 
82385’s bus state machine, which is almost identical 
to the 386 DX bus state machine, is able to run 
82385 local bus cycles independently of the 386 DX. 
The only time the 386 DX sees write cycle wait 
states is when a previously latched (posted) write 
has not yet been completed on the 82385 local bus 
or during an I/O write (which is not posted). A 386
DX write can be posted even if the 82385 does not
currently own the 82385 local bus. In this case, an
82385 in master mode will run the cycle as soon as 
the current master releases the bus, and an 82385 
in slave mode will request the bus and run the cycle 
when the request is acknowledged. The 386 DX is 
free to continue operating out of its cache (on the 
386 DX local bus) during this time. 


1.3.3 Non-Cacheable Cycles 


Non-cacheable cycles fall into one of two categories:
cycles decoded as non-cacheable, and cycles
that are by default non-cacheable according to the
82385’s design. All non-cacheable cycles are forwarded
to the 82385 local bus. Non-cacheable cycles
have no effect on the cache or cache directory.


The 82385 allows the system designer to define areas
of main memory as non-cacheable. The 386 DX
address bus is decoded and the decode output is
connected to the 82385’s non-cacheable access
(NCA#) input. This decoding is done in the first 386
DX bus state in which the non-cacheable cycle address
becomes available. Non-cacheable read cycles
resemble cacheable read miss cycles, except
that the cache and cache directory are unaffected.
NCA#-defined non-cacheable writes, like most writes,
are posted.


The 82385 defines certain cycles as non-cacheable
without using its non-cacheable access input. These
include I/O cycles, interrupt acknowledge cycles,
and halt/shutdown cycles. I/O reads and interrupt
acknowledge cycles execute as any other non-cacheable
read. I/O write cycles are not posted: the
386 DX is not allowed to continue until a ready signal
is returned from the system. Halt/shutdown cycles
are posted. During a halt/shutdown condition, the
82385 local bus duplicates the behavior of the 386
DX, including the ability to recognize and respond to
a BHOLD request. (The 82385’s bus watching
mechanism is functional in this condition.)


1.3.3.1 16-BIT MEMORY SPACE 


The 82385 does not cache 16-bit memory space (as
decoded by the 386 DX BS16# input), but does
make provisions to handle 16-bit space as non-cacheable.
(There is no 82385 equivalent to the 386
DX BS16# input.) In a system without an 82385, the
386 DX BS16# input need not be asserted until the
last state of a 16-bit cycle for the 386 DX to recognize
it as such (unless NA# is sampled active earlier
in the cycle). The 82385, however, needs this information
earlier, specifically at the end of the first 386
DX bus state in which the address of the 16-bit cycle
becomes available. The result is that in a system
without an 82385, 16-bit devices can inform the 386
DX that they are 16-bit devices “on the fly,” while in
a system with an 82385, devices decoded as 16-bit
(using the 386 DX BS16#) must be located in address
space set aside for 16-bit devices. If 16-bit
space is decoded according to 82385 guidelines (as
described later in the data sheet), then the 82385
will handle 16-bit cycles just like the 386 DX does,
including effectively locking the two halves of a
non-aligned 16-bit transfer from interruption by another
master.


1.3.4 386 DX Local Bus Cycles


386 DX Local Bus Cycles are accesses to resources
on the 386 DX local bus other than to the 82385
itself. The 82385 simply ignores these accesses:
they are neither forwarded to the system nor do they
affect the cache. The designer sets aside memory
and/or I/O space for local resources by decoding
the 386 DX address bus and feeding the decode to
the 82385’s local bus access (LBA#) input. The designer
can also decode the 386 DX cycle definition
signals to keep specific 386 DX cycles from being
forwarded to the system. For example, a multi-processor
design may wish to capture and remedy a 386
DX shutdown locally without having it detected by
the rest of the system. Note that in such a design,
the local shutdown cycle must be terminated by local
bus control logic. The 387 Math Coprocessor is
considered a 386 DX local bus resource, but it need
not be decoded as such by the user since the 82385
is able to internally recognize 387 accesses via the
M/IO# and A31 pins.


1.3.5 Summary of 82385 Response to 
All 386 DX Cycles 


Table 1-1 summarizes the 82385 response to all 386
DX bus cycles, as conditioned by whether or not the
cycle is decoded as local or non-cacheable. The table
describes the impact of each cycle on the cache
and on the cache directory, and whether or not the
cycle is forwarded to the 82385 local bus. Whenever
the 82385 local bus is marked “IDLE”, it implies that
this bus is available to other masters.
Table 1-1. 82385 Response to 386 DX Cycles


Column groups (each giving the effect on the Cache, the
Cache Directory, and the 82385 Local Bus):
  82385 Response when Decoded as Cacheable
  82385 Response when Decoded as Non-Cacheable
  82385 Response when Decoded as a 386 DX Local Bus Access

Rows (386 DX Bus Cycle Definition) with surviving entries:
  UNDEFINED
  MEM CODE READ   (miss: CACHE WRITE, DATA VALIDATION, MEM CODE READ)
  HALT/SHUTDOWN   (HALT/SHUTDOWN)
  MEM DATA READ   (miss: CACHE WRITE, DATA VALIDATION, MEM DATA READ)
  MEM DATA WRITE  (MEM DATA WRITE)

NOTES:


- A dash (—) indicates that the cache and cache directory are unaffected. This table does not reflect how an access affects the LRU bit.
- An “IDLE” 82385 Local Bus implies that this bus is available to other masters.
- The 82385’s response to 80387 accesses is the same as when decoded as a 386 DX Local Bus access.
* The only other operations that affect the cache directory are:
1. RESET or Cache Flush: all tag valid bits cleared.
2. Snoop Hit: corresponding line valid bit cleared.
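

The responses described in Sections 1.3.1 through 1.3.4 can be
sketched as a small lookup function. This is an illustration built
only from the prose above, not a transcription of Table 1-1; the
function name, the cycle and decode strings, and the "-" placeholder
for "unaffected" are all assumptions made for the example.

```python
def respond(cycle, decode="cacheable", hit=False):
    """Return (cache, directory, local_bus) for one 386 DX cycle.

    cycle  : "mem_read", "mem_write", "io_read", "io_write",
             "int_ack" or "halt_shutdown"   (hypothetical labels)
    decode : "cacheable", "non_cacheable" (NCA# or X16# asserted)
             or "local" (LBA# asserted, or a 387 access)
    hit    : result of the cache comparison; ignored unless the
             cycle is a cacheable memory cycle
    """
    if decode == "local":
        # Ignored by the 82385: nothing changes and the 82385
        # local bus stays available to other masters.
        return ("-", "-", "IDLE")
    is_memory = cycle in ("mem_read", "mem_write")
    if not is_memory or decode == "non_cacheable":
        # Forwarded to the 82385 local bus; cache and directory
        # are unaffected.
        return ("-", "-", cycle)
    if cycle == "mem_read":
        if hit:
            # Serviced from the cache; assumed to leave the bus idle.
            return ("read", "-", "IDLE")
        # Read miss: data is fetched, written to the cache, validated.
        return ("write", "validate", "mem_read")
    # Write-through: main memory is always written; the cache is
    # written only on a write hit.
    return ("write" if hit else "-", "-", "mem_write")


print(respond("mem_read", hit=False))
print(respond("mem_write", hit=True))
print(respond("io_write"))
```

The sketch does not model timing, so the difference between posted
and non-posted writes is not visible in its output.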
1.3.6 Bus Watching


As previously discussed, the 82385 “qualifies” a
386 DX bus cycle in the first bus state in which the
address and cycle definition signals of the cycle become
available. The cycle is qualified as read or
write, cacheable or non-cacheable, etc. Cacheable
cycles are further classified as hit or miss according
to the results of the cache comparison, which accesses
the 82385 directory and compares the appropriate
directory location (tag) to the current 386
DX address. If the cycle turns out to be non-cacheable
or a 386 DX local bus access, the hit/miss decision
is ignored. The cycle qualification requires one
386 DX state. Since the fastest 386 DX access is
two states, the second state can be used for bus
watching.


When the 82385 does not own the system bus, it 
monitors system bus cycles. If another master writes 
into main memory, the 82385 latches the system ad- 
dress and executes a cache look-up to see if the 
altered main memory location resides in the cache. 
If so (a snoop hit), the cache entry is marked invalid 
in the cache directory. Since the directory is at most 
only being used every other state to qualify 386 DX 
accesses, snoop look-ups are interleaved between 
386 DX local bus look-ups. The cache directory is 
time multiplexed between the 386 DX address and 
the latched system address. The result is that all 
snoops are caught and serviced without slowing 
down the 386 DX, even when running zero wait state 
hits on the 386 DX local bus. 


1.3.7 Cache Flush 


The 82385 offers a cache flush input. When activated,
this signal causes the 82385 to invalidate all
data which had previously been cached. Specifically,
all tag valid bits are cleared. (Refer to the 82385
directory structure in Chapter 2.) Therefore, the
cache is empty and subsequent cycles are misses
until the 386 DX begins repeating the new accesses
(hits). The primary use of the FLUSH input is for
diagnostics and multi-processor support.


NOTE:
The use of this pin as a coherency mechanism may 
impact software transparency. 


2.0 82385 CACHE ORGANIZATION 


The 82385 supports two cache organizations: a simple
direct mapped organization and a slightly more
complex, higher performance two way set associative
organization. The choice is made by strapping
an 82385 input (2W/D#) either high or low. This
chapter describes the structure and operation of
both organizations.


2.1 DIRECT MAPPED CACHE


2.1.1 Direct Mapped Cache Structure 
and Terminology 


Figure 2-1 depicts the relationship between the
82385’s internal cache directory, the external cache
memory, and the 386 DX’s 4 Gigabyte physical address
space. The 4 Gigabytes can conceptually be
thought of as cache “pages,” each being 8K doublewords
(32 Kbytes) deep. The page size matches the
cache size. The cache can be further divided into
1024 (0 thru 1023) sets of eight doublewords (8 x 32
bits). Each 32-bit doubleword is called a “line.” The
unit of transfer between the main memory and
cache is one line.
Figure 2-1. Direct Mapped Cache Organization


Each block in the external cache has an associated
26-bit entry in the 82385’s internal cache directory.
This entry has three components: a 17-bit “tag,” a
“tag valid” bit, and eight “line valid” bits. The tag
acts as a main memory page number (17 tag bits
support 2^17 pages). For example, if line 9 of page 2
currently resides in the cache, then a binary 2 is
stored in the Set 1 tag field. (For any 82385 direct
mapped cache page in main memory, Set 0 consists
of lines 0-7, Set 1 consists of lines 8-15, etc. Line 9
is shaded in Figure 2-1.) An important characteristic
of a direct mapped cache is that line 9 of any page
can only reside in line 9 of the cache. All identical
page offsets map to a single cache location.


The data in a cache set is considered valid or invalid 
depending on the status of its tag valid bit. If clear, 
the entire set is considered invalid. If true, an individ- 
ual line within the set is considered valid or invalid 
depending on the status of its line valid bit. 


The 82385 sees the 386 DX address bus (A2-A31) 
as partitioned into three fields: a 17-bit ‘“‘tag’’ field 
(A15-A31), a 10-bit ‘“‘set-address” field (A5—-A14), 
and a 3-bit “‘line select” field (A2—A4). (See Figure 
2-2.) The lower 13 address bits (A2—A14) also serve 
as the “cache address” which directly selects one 
of 8K doublewords in the external cache. 
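

The field boundaries above translate directly into shifts and masks.
The following sketch (the function name and the byte-address argument
are assumptions for illustration) splits a 32-bit physical address
into the tag, set address, line select, and cache address, and checks
the “line 9 of page 2” example.

```python
def split_direct_mapped(addr):
    """Split a 32-bit byte address into 82385 direct mapped fields."""
    line = (addr >> 2) & 0x7             # A2-A4: one of 8 lines in a set
    set_index = (addr >> 5) & 0x3FF      # A5-A14: one of 1024 sets
    tag = (addr >> 15) & 0x1FFFF         # A15-A31: one of 2**17 pages
    cache_addr = (addr >> 2) & 0x1FFF    # A2-A14: one of 8K doublewords
    return tag, set_index, line, cache_addr


# Line 9 of page 2: pages are 32 Kbytes, lines are 4 bytes.
addr = 2 * 0x8000 + 9 * 4
print(split_direct_mapped(addr))   # (2, 1, 1, 9)
```

The result reads as tag 2 (page 2), set 1, second line within that
set, and cache address 9, which agrees with the statement that Set 1
holds lines 8-15.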


2.1.2 Direct Mapped Cache Operation


The following is a description of the interaction between
the 386 DX, cache, and cache directory.


2.1.2.1 READ HITS


When the 386 DX initiates a memory read cycle, the
82385 uses the 10-bit set address to select one of
1024 directory entries, and the 3-bit line select field
to select one of eight line valid bits within the entry.
The 13-bit cache address selects the corresponding
doubleword in the cache. The 82385 compares the
17-bit tag field (A15-A31 of the 386 DX access) with
the tag stored in the selected directory entry. If the
tag and upper address bits match, and if both the
tag valid and appropriate line valid bits are set, the result
is a hit, and the 82385 directs the cache to drive the
selected doubleword onto the 386 DX data bus. A
read hit does not alter the contents of the cache or
directory.


2.1.2.2 READ MISSES 


A read miss can occur in two ways. The first is 
known as a “line” miss, and occurs when the tag 
and upper address bits match and the tag valid bit is 
set, but the line valid bit is clear. The second is 
called a “tag” miss, and occurs when either the tag 
and upper address bits do not match, or the tag valid 
bit is clear. (The line valid bit is a ‘don’t care” in a 
tag miss.) In both cases, the 82385 forwards the 386 
DX reference to the system, and as the returning 
data is fed to the 386 DX, it is written into the cache
and validated in the cache directory. 


In a line miss, the incoming data is validated simply 
by setting the previously clear line valid bit. In a tag 
miss, the upper address bits overwrite the previously 
stored tag, the tag valid bit is set, the appropriate 
line valid bit is set, and the other seven line valid bits 
are cleared. Subsequent tag hits with line misses will 
only set the appropriate line valid bit. (Any data as- 
sociated with the previous tag is no longer consid- 
ered resident in the cache.) 
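

The hit, line miss, and tag miss rules can be modeled with a small
directory class. This is a behavioral sketch only: it tracks the
directory bits and ignores the cached data, timing, and bus activity.
The class and method names are hypothetical.

```python
class DirectMappedDirectory:
    """Directory bits of a 1024-set direct mapped cache (sketch)."""

    def __init__(self):
        self.tag = [0] * 1024
        self.tag_valid = [False] * 1024          # all clear, as after reset
        self.line_valid = [[False] * 8 for _ in range(1024)]

    def read(self, addr):
        """Classify a memory read and update the directory."""
        line = (addr >> 2) & 0x7
        s = (addr >> 5) & 0x3FF
        tag = (addr >> 15) & 0x1FFFF
        tag_match = self.tag_valid[s] and self.tag[s] == tag
        if tag_match and self.line_valid[s][line]:
            return "hit"                         # directory unchanged
        if tag_match:
            self.line_valid[s][line] = True      # validate just this line
            return "line miss"
        # Tag miss: new tag, tag valid set, only the fetched line valid.
        self.tag[s] = tag
        self.tag_valid[s] = True
        self.line_valid[s] = [False] * 8
        self.line_valid[s][line] = True
        return "tag miss"


d = DirectMappedDirectory()
page2 = 2 * 0x8000
print(d.read(page2 + 9 * 4))    # tag miss
print(d.read(page2 + 10 * 4))   # line miss
print(d.read(page2 + 9 * 4))    # hit
```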
Figure 2-2. 386 DX Address Bus Bit Fields (Direct Mapped Organization)


2.1.2.3 OTHER OPERATIONS THAT AFFECT
THE CACHE AND CACHE DIRECTORY


The other operations that affect the cache and/or
directory are write hits, snoop hits, cache flushes,
and 82385 resets. In a write hit, the cache is updated
along with main memory, but the directory is
unaffected. In a snoop hit, the cache is unaffected, but
the affected line is invalidated by clearing its line
valid bit in the directory. Both an 82385 reset and
cache flush clear all tag valid bits.


When a 386 DX CPU/82385 system “wakes up”
upon reset, all tag valid bits are clear. At this point, a
read miss is the only mechanism by which main
memory data is copied into the cache and validated
in the cache directory. Assume an early 386 DX
code access seeks (for the first time) line 9 of page
2. Since the tag valid bit is clear, the access is a tag
miss, and the data is fetched from main memory.
Upon return, the data is fed to the 386 DX and simultaneously
written into line 9 of the cache. The set
directory entry is updated to show this line as valid.
Specifically, the tag valid and appropriate line valid bits
are set, the remaining seven line valid bits cleared,
and a binary 2 written into the tag. Since code is
sequential in nature, the 386 DX will likely next want
line 10 of page 2, then line 11, and so on. If the 386
DX sequentially fetches the next six lines, these
fetches will be line misses, and as each is fetched
from main memory and written into the cache, its
corresponding line valid bit is set. This is the basic
flow of events that fills the cache with valid data.
Only after a piece of data has been copied into the
cache and validated can it be accessed in a zero
wait state read hit. Also, a cache entry must have
been validated before it can be subsequently altered
by a write hit, or invalidated by a snoop hit.


An extreme example of “thrashing” occurs if line 9 of
page two is an instruction to jump to line 9 of page
one, which is an instruction to jump back to line 9 of
page two. Thrashing results from the direct mapped
cache characteristic that all identical page offsets
map to a single cache location. In this example, the
page one access overwrites the cached page two
data, and the page two access overwrites the cached
page one data. As long as the code jumps back
and forth the hit rate is zero. This is of course an
extreme case. The effect of thrashing is that a direct
mapped cache exhibits a slightly reduced overall hit
rate as compared to a set associative cache of the
same size.
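

The zero hit rate in the jump example can be reproduced with a few
lines of simulation. The sketch below is an assumption-laden model:
it applies only the direct mapped tag and line valid rules of
Section 2.1.2 to a list of read addresses, and the function name is
hypothetical.

```python
def direct_mapped_hit_rate(addresses):
    """Fraction of reads that hit in a 32 Kbyte direct mapped cache."""
    directory = {}   # set index -> (tag, set of valid line numbers)
    hits = 0
    for addr in addresses:
        line = (addr >> 2) & 0x7
        s = (addr >> 5) & 0x3FF
        tag = (addr >> 15) & 0x1FFFF
        stored_tag, valid = directory.get(s, (None, set()))
        if stored_tag == tag and line in valid:
            hits += 1                       # read hit
        elif stored_tag == tag:
            valid.add(line)                 # line miss: validate one line
        else:
            directory[s] = (tag, {line})    # tag miss: replace the set
    return hits / len(addresses)


page1_line9 = 1 * 0x8000 + 9 * 4
page2_line9 = 2 * 0x8000 + 9 * 4

# Jumping back and forth between the same offset in two pages.
print(direct_mapped_hit_rate([page2_line9, page1_line9] * 50))   # 0.0
# Re-reading a single location misses once, then always hits.
print(direct_mapped_hit_rate([page2_line9] * 10))                # 0.9
```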


2.2 TWO WAY SET ASSOCIATIVE 
CACHE 


2.2.1 Two Way Set Associative Cache 
Structure and Terminology 


Figure 2-3 illustrates the relationship between the 
directory, cache, and 4 Gigabyte address space. 
Figure 2-3. Two-Way Set Associative Cache Organization


Whereas the direct mapped cache is organized as
one bank of 8K doublewords, the two way set associative
cache is organized as two banks (A and B) of
4K doublewords each. The page size is halved, and
the number of pages doubled. (Note the extra tag
bit.) The cache now has 512 sets in each bank. (Two
banks times 512 sets gives a total of 1024. The
structure can be thought of as two half-sized direct
mapped caches in parallel.) The performance advantage
over a direct mapped cache is that all identical
page offsets map to two cache locations instead
of one, reducing the potential for thrashing.
The 82385’s partitioning of the 386 DX address bus
is depicted in Figure 2-4.


2.2.2 LRU Replacement Algorithm 


The two way set associative directory has an addi- 
tional feature: the “least recently used” or LRU bit. 
In the event of a read miss, either bank A or bank B 
will be updated with new data. The LRU bit flags the 
candidate for replacement. Statistically, of two 
blocks of data, the block most recently used is the 
block most likely to be needed again in the near 
future. By flagging the least recently used block, the 
82385 ensures that the cache block replaced is the 
least likely to have data needed by the CPU. 


2.2.3 Two Way Set Associative 
Cache Operation 


2.2.3.1 READ HITS 


When the 386 DX initiates a memory read cycle, the
82385 uses the 9-bit set address to select one of
512 sets. The two tags of this set are simultaneously
compared with A14-A31, both tag valid bits
checked, and both appropriate line valid bits
checked. If either comparison produces a hit, the
corresponding cache bank is directed to drive the
selected doubleword onto the 386 DX data bus.
(Note that both banks will never concurrently cache
the same main memory location.) If the requested
data resides in bank A, the LRU bit is pointed toward
B. If B produces the hit, the LRU bit is pointed
toward A.


2.2.3.2 READ MISSES 


As in direct mapped operation, a read miss can be 
either a line or tag miss. Let’s start with a tag miss 
example. Assume the 386 DX seeks line 9 of page 2, 
and that neither the A nor the B directory produces a tag
match. Assume also, as indicated in Figure 2-3, that 
the LRU bit points to A. As the data returns from 
main memory, it is loaded into offset 9 of bank A. 
Concurrently, this data is validated by updating the 
set 1 directory entry for bank A. Specifically, the up- 
per address bits overwrite the previous tag, the tag 
valid bit is set, the appropriate line valid bit is set, 
and the other seven line valid bits cleared. Since this 
data is the most recently used, the LRU bit is turned 
toward B. No change to bank B occurs. 


If the next 386 DX request is line 10 of page two, the 
result will be a line miss. As the data returns from 
main memory, it will be written into offset 10 of bank 
A (tag hit/line miss in bank A), and the appropriate 
line valid bit will be set. A line miss in one bank will 
cause the LRU bit to point to the other bank. In this 
example, however, the LRU bit has already been 
turned toward B. 


2.2.3.3 OTHER OPERATIONS THAT AFFECT
THE CACHE AND CACHE DIRECTORY


Other operations that affect the cache and cache 
directory are write hits, snoop hits, cache flushes, 
and 82385 resets. A write hit updates the cache 
along with main memory. If directory A detects the 
hit, bank A is updated. If directory B detects the hit, 
bank B is updated. If one bank is updated, the LRU 
bit is pointed toward the other. 


If a snoop hit invalidates an entry, for example, in 
cache bank A, the corresponding LRU bit is pointed 
toward A. This ensures that invalid data is the prime 
candidate for replacement in a read miss. Finally, 
resets and flushes behave just as they do in a direct 
mapped cache, clearing all tag valid bits. 
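

The two way rules, including the LRU bit, can be sketched the same
way as the direct mapped case. This model is illustrative only: it
keeps directory state for reads and snoops, omits write hits and
cached data, and assumes the LRU bit initially points to bank A (the
text does not state the value after reset). All names are
hypothetical.

```python
class TwoWayDirectory:
    """Directory bits of a two way set associative cache (sketch)."""

    def __init__(self):
        # Per bank, per set: [tag, tag_valid, eight line valid bits].
        self.banks = {b: [[0, False, [False] * 8] for _ in range(512)]
                      for b in "AB"}
        self.lru = ["A"] * 512      # bank to replace next (assumed start)

    @staticmethod
    def _fields(addr):
        # 18-bit tag (A14-A31), 9-bit set (A5-A13), 3-bit line (A2-A4).
        return (addr >> 14) & 0x3FFFF, (addr >> 5) & 0x1FF, (addr >> 2) & 0x7

    def _tag_match(self, bank, s, tag):
        entry = self.banks[bank][s]
        return entry[1] and entry[0] == tag

    def read(self, addr):
        tag, s, line = self._fields(addr)
        for bank, other in (("A", "B"), ("B", "A")):
            if self._tag_match(bank, s, tag):
                entry = self.banks[bank][s]
                result = "hit" if entry[2][line] else "line miss"
                entry[2][line] = True
                self.lru[s] = other          # this bank was just used
                return result, bank
        # Tag miss in both banks: replace the bank the LRU bit points to.
        bank = self.lru[s]
        valid = [False] * 8
        valid[line] = True
        self.banks[bank][s] = [tag, True, valid]
        self.lru[s] = "B" if bank == "A" else "A"
        return "tag miss", bank

    def snoop(self, addr):
        """Invalidate a line written by another master, if cached."""
        tag, s, line = self._fields(addr)
        for bank in "AB":
            entry = self.banks[bank][s]
            if self._tag_match(bank, s, tag) and entry[2][line]:
                entry[2][line] = False
                self.lru[s] = bank           # invalid data is replaced first
                return bank
        return None


c = TwoWayDirectory()
p1 = 1 * 0x4000 + 9 * 4     # line 9 of page 1 (16 Kbyte pages)
p2 = 2 * 0x4000 + 9 * 4     # line 9 of page 2
for addr in (p2, p1, p2, p1):
    print(c.read(addr))
```

With both pages resident in different banks, the back-and-forth
pattern that produced a zero hit rate in the direct mapped case now
hits after the first two misses.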
3.0 82385 PIN DESCRIPTION


The 82385 creates the 82385 local bus, which is a
functional 386 DX interface. To facilitate understanding,
82385 local bus signals go by the same
name as their 386 DX equivalents, except that they
are preceded by the letter “B”. The 82385 local bus
equivalent to ADS# is BADS#, the equivalent to
NA# is BNA#, etc. This convention applies to bus
states as well. For example, BT1P is the 82385 local
bus state equivalent to the 386 DX T1P state.


3.1 386 DX CPU/82385 INTERFACE 
SIGNALS 


These signals form the direct interface between the 
386 DX and 82385. 


3.1.1 386 DX CPU/82385 Clock (CLK2) 


CLK2 provides the fundamental timing for a 386 DX
CPU/82385 system, and is driven by the same
source that drives the 386 DX CLK2 input. The
82385, like the 386 DX, divides CLK2 by two to generate
an internal “phase indication” clock. (See Figure
3-1.) The CLK2 period whose rising edge drives
the internal clock low is called PHI1, and the CLK2
period that drives the internal clock high is called
PHI2. A PHI1-PHI2 combination (in that order) is
known as a “T” state, and is the basis for 386 DX
bus cycles.


3.1.2 386 DX CPU/82385 Reset
(RESET)


This input resets the 82385, bringing it to an initial 
known state, and is driven by the same source that 
drives the 386 DX RESET input. A reset effectively 
flushes the cache by clearing all cache directory tag 
valid bits. The falling edge of RESET is synchronized 
to CLK2, and used by the 82385 to properly estab- 
lish the phase of its internal clock. (See Figure 3-2.) 
Specifically, the second internal phase following the
falling edge of RESET is PHI2.


3.1.3 386 DX CPU/82385 Address Bus
(A2-A31), Byte Enables
(BE0#-BE3#), and Cycle
Definition Signals (M/IO#,
D/C#, W/R#, LOCK#)


The 82385 directly connects to these 386 DX outputs.
The 386 DX address bus is used in the cache
directory comparison to see if data referenced by
the 386 DX resides in the cache, and the byte enables
inform the 82385 as to which portions of the data
bus are involved in a 386 DX cycle. The cycle definition
signals are decoded by the 82385 to determine
the type of cycle the 386 DX is executing.
3.1.4 386 DX CPU/82385 Address
Status (ADS#) and Ready Input
(READYI#)


ADS#, a 386 DX output, tells the 82385 that new 
address and cycle definition information is available. 
READYI#, an input to both the 386 DX (via the 386
DX READY# input pin) and 82385, indicates the
completion of a 386 DX bus cycle. ADS# and
READYI# are used to keep track of the 386 DX bus 
state. 


3.1.5 386 DX Next Address Request
(NA#)


This 82385 output controls 386 DX pipelining. It can
be tied directly to the 386 DX NA# input, or it can be
logically “AND”ed with other 386 DX local bus next
address requests.


3.1.6 Ready Output (READYO#) and
Bus Ready Enable (BRDYEN#)


The 82385 directly terminates all but two types of
386 DX bus cycles with its READYO# output. 386
DX local bus cycles must be terminated by the local
device being accessed. This includes devices decoded
using the 82385 LBA# signal and 80387 accesses.
The other cycles not directly terminated by
the 82385 are 82385 local bus reads, specifically
cache read misses and non-cacheable reads. (Recall
that the 82385 forwards and runs such cycles on
the 82385 bus.) In these cycles the signal that terminates
the 82385 local bus access is BREADY#,
which is gated through to the 386 DX local bus such
that the 386 DX and 82385 local bus cycles are
concurrently terminated. BRDYEN# is used to gate the
BREADY# signal to the 386 DX.


3.2 CACHE CONTROL SIGNALS 


These 82385 outputs control the external 32 KB 
cache data memory. 


3.2.1 Cache Address Latch Enable 
(CALEN) 


This signal controls the latch (typically an F or AS 
series 74373) that resides between the low order 
386 DX address bits and the cache SRAM address 
inputs. (The outputs of this latch are the “cache ad- 
dress” described in the previous chapter.) When 
CALEN is high the latch is transparent. The falling 
edge of CALEN latches the current inputs which re- 
main applied to the cache data memory until CALEN 
returns to an active high state. 
3.2.2 Cache Transmit/Receive
(CT/R#)


This signal defines the direction of an optional data
transceiver (typically an F or AS series 74245) between
the cache and 386 DX data bus. When high,
the transceiver is pointed towards the 386 DX local
data bus (the SRAMs are output enabled). When
low, the transceiver points towards the cache data
memory. A transceiver is required if the cache is designed
with SRAMs that lack an output enable control.
A transceiver may also be desirable in a system
that has a heavily loaded 386 DX local data bus.
These devices are not necessary when using
SRAMs which incorporate an output enable.


3.2.3 Cache Chip Selects
(CS0#-CS3#)


These active low signals tie to the cache SRAM chip
selects, and individually enable the four bytes of the
32-bit wide cache. CS0# enables D0-D7, CS1#
enables D8-D15, CS2# enables D16-D23, and
CS3# enables D24-D31. During read hits, all four
bytes are enabled regardless of whether or not all
four 386 DX byte enables are active. (The 386 DX
ignores what it did not request.) Also, all four cache
bytes are enabled in a read miss so as to update the
cache with a complete line (doubleword). In a write
hit, only those cache bytes that correspond to active
byte enables are selected. This prevents cache data
from being corrupted in a partial doubleword write.
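

The chip select rule can be written as a small function of the cycle
type and the 386 DX byte enables. This is a sketch of the behavior
described above; the function name and cycle labels are hypothetical,
and treating every other cycle as “all chip selects inactive” is an
assumption the text does not state.

```python
def cache_chip_selects(cycle, byte_enables):
    """Return the levels of CS0#-CS3# (0 = asserted, 1 = negated).

    cycle        : "read_hit", "read_miss", "write_hit" or anything else
    byte_enables : levels of BE0#-BE3# as a 4-tuple (0 = active)
    """
    if cycle in ("read_hit", "read_miss"):
        # Whole doubleword: the 386 DX ignores bytes it did not request,
        # and a miss must fill the cache with a complete line.
        return (0, 0, 0, 0)
    if cycle == "write_hit":
        # Only the bytes actually written may change in the cache.
        return tuple(byte_enables)
    return (1, 1, 1, 1)     # assumed: cache not selected otherwise


# A 16-bit write hit to the low half of a doubleword.
print(cache_chip_selects("write_hit", (0, 0, 1, 1)))   # (0, 0, 1, 1)
# A byte read hit still enables all four cache bytes.
print(cache_chip_selects("read_hit", (1, 1, 1, 0)))    # (0, 0, 0, 0)
```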


3.2.4 Cache Output Enables
(COEA#, COEB#) and Write
Enables (CWEA#, CWEB#)


COEA# and COEB# are active low signals which 
tie to the cache SRAM or Transceiver output en- 
ables and respectively enable cache bank A or B. 
The state of DEFOE# (define cache output enable),
an 82385 configuration input, determines the func- 
tional definition of COEA# and COEB#. 


If DEFOE# = VIL, in a two-way set associative
cache, either COEA# or COEB# is active during
read hit cycles only, depending on which bank is
selected. In a direct mapped cache, both are activated
during read hits, so the designer is free to use
either one. This COEx# definition best suits cache
SRAMs with output enables.


If DEFOE# = VIH, COEx# is active during read hit,
read miss (cache update) and write hit cycles only.
This COEx# definition suits cache SRAMs without
output enables. In such systems, transceivers are
needed and their output enables must be active for
writing, as well as reading, the cache SRAMs.


CWEA# and CWEB# are active low signals which
tie to the cache SRAM write enables, and respectively
enable cache bank A or B to receive data from
the 386 DX data bus (386 DX write hit or read miss
update). In a two-way set associative cache, one or
the other is enabled in a read miss or write hit. In a
direct mapped cache, both are activated, so the designer
is free to use either one.


The various cache configurations supported by the 
82385 are described in Chapter 4. 


3.3 386 DX LOCAL BUS DECODE
INPUTS


These 82385 inputs are generated by decoding the
386 DX address and cycle definition lines. These active
low inputs are sampled at the end of the first
state in which the address of a new 386 DX cycle
becomes available (T1 or first T2P).


3.3.1 386 DX Local Bus Access
(LBA#)


This input identifies a 386 DX access as directed to
a resource (other than the cache) on the 386 DX
local bus. (The 387 Numerics Coprocessor is considered
a 386 DX local bus resource, but LBA#
need not be generated as the 82385 internally decodes
387 accesses.) The 82385 simply ignores
these cycles. They are neither forwarded to the system
nor do they affect the cache or cache directory.
Note that LBA# has priority over all other types of
cycles. If LBA# is asserted, the cycle is interpreted
as a 386 DX local bus access, regardless of the
cycle type or status of NCA# or X16#. This allows
any 386 DX cycle (memory, I/O, interrupt acknowledge,
etc.) to be kept on the 386 DX local bus if desired.


3.3.2 Non-Cacheable Access (NCA#)


This active low input identifies a 386 DX cycle as 
non-cacheable. The 82385 forwards non-cacheable 
cycles to the 82385 local bus and runs them. The 
cache and cache directory are unaffected. 


NCA# allows a designer to set aside a portion of
main memory as non-cacheable. Potential applications
include memory-mapped I/O and systems
where multiple masters access dual ported memory
via different busses. Another possibility makes use
of the 386 DX D/C# output. The 82385 by default
implements a unified code and data cache, but driving
NCA# directly by D/C# creates a data only
cache. If D/C# is inverted first, the result is a code
only cache.
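

The two D/C# wirings can be checked with a few lines of logic. In
the sketch below, signal levels are 0 or 1, D/C# is high for data
cycles and low for code (control) cycles, and NCA# is active low.
The function names are hypothetical and only memory cycles are
considered.

```python
def nca_for_data_only_cache(dc):
    """NCA# tied directly to D/C#: asserted (0) during code fetches."""
    return dc


def nca_for_code_only_cache(dc):
    """NCA# driven by inverted D/C#: asserted (0) during data cycles."""
    return 1 - dc


def is_cacheable(nca):
    """A memory cycle is cacheable only while NCA# is negated (high)."""
    return nca == 1


CODE, DATA = 0, 1   # D/C# levels for code and data cycles
print(is_cacheable(nca_for_data_only_cache(DATA)))   # True
print(is_cacheable(nca_for_data_only_cache(CODE)))   # False
print(is_cacheable(nca_for_code_only_cache(CODE)))   # True
print(is_cacheable(nca_for_code_only_cache(DATA)))   # False
```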


3.3.3 16-Bit Access (X16#)


X16# is an active low input which identifies 16-bit
memory and/or I/O space, and the decoded signal
that drives X16# should also drive the 386 DX
BS16# input. 16-bit accesses are treated like
non-cacheable accesses: they are forwarded to and executed
on the 82385 local bus with no impact on the
cache or cache directory. In addition, the 82385
locks the two halves of a non-aligned 16-bit transfer
from interruption by another master, as does the 386
DX.


3.4 82385 LOCAL BUS INTERFACE
SIGNALS


The 82385 presents a “386 DX-like” front end to the
system, and the signals discussed in this section are
82385 local bus equivalents to actual 386 DX signals.
These signals are named with respect to their
386 DX counterparts, but with the letter “B” prefixed
to the name.


Note that the 82385 itself does not have equivalent
output signals to the 386 DX data bus (D0-D31),
address bus (A2-A31), and cycle definition signals
(M/IO#, D/C#, W/R#). The 82385 data bus (BD0-
BD31) is actually the system side of a latching transceiver,
and the 82385 address bus and cycle definition
signals (BA2-BA31, BM/IO#, BD/C#,
BW/R#) are the outputs of an edge-triggered latch.
The signals that control this data transceiver and address
latch are discussed in Section 3.5.


3.4.1 82385 Bus Byte Enables
(BBE0#-BBE3#)


BBE0#-BBE3# are the 82385 local bus equivalents
to the 386 DX byte enables. In a cache read
miss, the 82385 drives all four signals low, regard- 
less of whether or not all four 386 DX byte enables 
are active. This ensures that a complete line (dou- 
bleword) is fetched from main memory for the cache 
update. In all other 82385 local bus cycles, the 
82385 duplicates the logic levels of the 386 DX byte 
enables. The 82385 tri-states these outputs when it 
is not the current bus master. 


3.4.2 82385 Bus Lock (BLOCK#)


BLOCK# is the 82385 local bus equivalent to the
386 DX LOCK# output, and distinguishes between
locked and unlocked cycles. When the 386 DX runs
a locked sequence of cycles (and LBA# is negated),
the 82385 forwards and runs the sequence on the
82385 local bus, regardless of whether any locations
referenced in the sequence reside in the cache. A
read hit will be run as if it is a read miss, but a write
hit will update the cache as well as being completed
to system memory. In keeping with 386 DX behavior,
the 82385 does not allow another master to interrupt
the sequence. BLOCK# is tri-stated when the
82385 is not the current bus master.


3.4.3 82385 Bus Address Status
(BADS#)


BADS# is the 82385 local bus equivalent of ADS#,
and indicates that a valid address (BA2-BA31,
BBE0#-BBE3#) and cycle definition (BM/IO#,
BW/R#, BD/C#) is available. It is asserted in BT1
and BT2P states, and is tri-stated when the 82385
does not own the bus.


3.4.4 82385 Bus Ready Input
(BREADY#)


82385 local bus cycles are terminated by
BREADY#, just as 386 DX cycles are terminated by
the 386 DX READY# input. In 82385 local bus read
cycles, BREADY# is gated by BRDYEN# onto the
386 DX local bus, such that it terminates both the
386 DX and 82385 local bus cycles.


3.4.5 82385 Bus Next Address
Request (BNA#)


BNA# is the 82385 local bus equivalent to the 386 
DX NA# input, and indicates that the system is pre- 
pared to accept a pipelined address and cycle defi- 
nition. If BNA# is asserted and the new cycle infor- 
mation is available, the 82385 begins a pipelined cy- 
cle on the 82385 local bus. 


3.5 82385 BUS DATA TRANSCEIVER 
AND ADDRESS LATCH CONTROL 
SIGNALS 


The 82385 data bus is the system side of a latching
transceiver (typically an F or AS series 74646), and
the 82385 address bus and cycle definition signals
are the outputs of an edge-triggered latch (F or AS
series 74374). The following is a discussion of the
82385 outputs that control these devices. An impor-
tant characteristic of these signals and the devices
they control is that they ensure that BD0-BD31,
BA2-BA31, BM/IO#, BD/C#, and BW/R# repro-
duce the functionality and timing behavior of their
386 DX equivalents.


3.5.1 Local Data Strobe (LDSTB), Data
Output Enable (DOE#), and Bus
Transmit/Receive (BT/R#)


These signals control the latching data transceiver.
BT/R# defines the transceiver direction. When
high, the transceiver drives the 82385 data bus in
write cycles. When low, the transceiver drives the
386 DX data bus in 82385 local bus read cycles.
DOE# enables the transceiver outputs.


The rising edge of LDSTB latches the 386 DX data
bus in all write cycles. The interaction of this signal
and the latching transceiver is used to perform the
82385’s posted write capability.


3.5.2 Bus Address Clock Pulse 
(BACP) and Bus Address 
Output Enable (BAOE #) 


These signals control the latch that drives BA2-
BA31, BM/IO#, BW/R#, and BD/C#. In any 386
DX cycle that is forwarded to the 82385 local bus, 
the rising edge of BACP latches the 386 DX address 
and cycle definition signals. BAOE# enables the 
latch outputs when the 82385 is the current bus 
master and disables them otherwise. 


3.6 STATUS AND CONTROL 
SIGNALS 


3.6.1 Cache Miss Indication (MISS#)


This output accompanies cacheable read and write
miss cycles. This signal transitions to its active low
state when the 82385 determines that a cacheable
386 DX access is a miss. Its timing behavior follows
that of the 82385 local bus cycle definition signals
(BM/IO#, BD/C#, BW/R#) so that it becomes
available with BADS# in BT1 or the first BT2P.
MISS# is floated when the 82385 does not own the
bus, such that multiple 82385s can share the same
node in multi-cache systems. (As discussed in Chap-
ter 7, this signal also serves a reserved function in
testing the 82385.)


3.6.2 Write Buffer Status (WBS) 


The latching data transceiver is also known as the
“posted write buffer.” WBS indicates that this buffer
contains data that has not yet been written to the
system even though the 386 DX may have begun its
next cycle. It is activated when 386 DX data is
latched, and deactivated when the corresponding
82385 local bus write cycle is completed
(BREADY#). (As discussed in Chapter 7, this signal
also serves a reserved function in testing the
82385.)


WBS can serve several functions. In multi-processor 
applications, it can act as a coherency mechanism 
by informing a bus arbiter that it should let a write 
cycle run on the system bus so that main memory 
has the latest data. If any other 82385 cache sub-
systems are on the bus, they will monitor the cycle
via their bus watching mechanisms. Any 82385 that 
detects a snoop hit will invalidate the corresponding 
entry in its local cache. 


3.6.3 Cache Flush (FLUSH) 


When activated, this signal causes the 82385 to 
clear all of its directory tag valid bits, effectively 
flushing the cache. (As discussed in Chapter 7, this 
signal also serves a reserved function in testing the 
82385.) The primary use of the FLUSH input is for 
diagnostics and multi-processor support. The use of 
this pin as a coherency mechanism may impact soft- 
ware transparency. 


The FLUSH input must be held active for at least 4 
CLK (8 CLK2) cycles to complete the flush se- 
quence. If FLUSH is still active after 4 CLK cycles, 
any accesses to the cache will be misses and the 
cache will not be updated (since FLUSH is active). 


3.7 BUS ARBITRATION SIGNALS 
(BHOLD AND BHLDA) 


In master mode, BHOLD is an input that indicates a 
request by a slave device for bus ownership. The 
82385 acknowledges this request via its BHLDA out- 
put. (These signals function identically to the 386 DX 
HOLD and HLDA signals.) 


The roles of BHOLD and BHLDA are reversed for an 
82385 in slave mode. BHOLD is now an output indi-
cating a request for bus ownership, and BHLDA an
input indicating that the request has been granted. 


3.8 COHERENCY (BUS WATCHING) 
SUPPORT SIGNALS (SA2-SA31, 
SSTB#, SEN)


These signals form the 82385’s bus watching inter- 
face. The Snoop Address Bus (SA2-SA31) con- 
nects to the system address lines if masters reside 
at both the system and 82385 local bus levels, or 
the 82385 local bus address lines if masters reside 
only at the 82385 local bus level. Snoop Strobe 
(SSTB#) indicates that a valid address is on the
snoop address inputs. Snoop Enable (SEN) indi-
cates that the cycle is a write. In a system with mas-
ters only at the 82385 local bus level, SA2-SA31,
SSTB#, and SEN can be driven respectively by
BA2-BA31, BADS#, and BW/R# without any sup-
port circuitry.


3.9 CONFIGURATION INPUTS 
(2W/D#, M/S#, DEFOE#)


These signals select the configurations supported 
by the 82385. They are hardware strap options and 
must not be changed dynamically. 2W/D# (2-Way/ 
Direct Mapped Select) selects a two-way set asso- 
ciative cache when tied high, or a direct mapped 
cache when tied low. M/S# (Master/Slave Select) 
chooses between master mode (M/S# high) and 
slave mode (M/S# low). DEFOE# defines the func-
tionality of the 82385 cache output enables
(COEA# and COEB#). DEFOE# allows the 82385
to interface to SRAMs with output enables
(DEFOE# low) or to SRAMs requiring transceivers
(DEFOE# high).
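

The three strap options can be summarized in a short lookup. The sketch below is illustrative only; the function name describe_straps and the dictionary keys are hypothetical, and each argument is the level the pin is tied to (1 = high, 0 = low).

```python
def describe_straps(two_w_d, m_s, defoe):
    """Map the 2W/D#, M/S#, and DEFOE# strap levels to the selected options."""
    return {
        # 2W/D#: high selects two-way set associative, low selects direct mapped.
        "organization": "two-way set associative" if two_w_d else "direct mapped",
        # M/S#: high selects master mode, low selects slave mode.
        "mode": "master" if m_s else "slave",
        # DEFOE#: low for SRAMs with output enables, high for SRAMs behind transceivers.
        "cache_interface": ("SRAMs requiring transceivers" if defoe
                            else "SRAMs with output enables"),
    }
```

Because these are hardware straps, such a function would be evaluated once for a given board design rather than at run time.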


4.0 386 DX LOCAL BUS INTERFACE 


The following is a detailed description of how the 
82385 interfaces to the 386 DX and to 386 DX local 
bus resources. Items specifically addressed are the
interfaces to the 386 DX, the cache SRAMs, and the
387 Numerics Coprocessor.


The many timing diagrams in this and the next chap-
ter provide insight into the dual pipelined bus struc-
ture of a 386 DX CPU/82385 system. It’s important
to realize, however, that one need not know every
possible cycle combination to use the 82385. The
interface is simple, and the dual bus operation invisi-
ble to the 386 DX and system. To facilitate discus-
sion of the timing diagrams, several conventions
have been adopted. Refer to Figure 4-2A, and note
that 386 DX bus cycles, 386 DX bus states, and
82385 bus states are identified along the top. All
states can be identified by the “frame numbers”
along the bottom. The cycles in Figure 4-2A include
a cache read hit (CRDH), a cache read miss
(CRDM), and a write (WT). WT represents any write,
cacheable or not. When necessary to distinguish
cacheable writes, a write hit goes by CWTH and a
write miss by CWTM. Non-cacheable system reads
go by SBRD. Also, it is assumed that system bus
pipelining occurs even though the BNA# signal is
not shown. When the system pipeline begins is a
function of the system bus controller.


386 DX bus cycles can be tracked by ADS# and
READYI#, and 82385 cycles by BADS# and
BREADY#. These four signals are thus a natural
choice to help track parallel bus activity. Note in the
timing diagrams that 386 DX cycles are numbered
using ADS# and READYI#, and 82385 cycles using
BADS# and BREADY#. For example, when the ad-
dress of the first 386 DX cycle becomes available,
the corresponding assertion of ADS# is marked
“1”, and the READYI# pulse that terminates the cy-
cle is marked “1” as well. Whenever a 386 DX cycle
is forwarded to the system, its number is forwarded
as well so that the corresponding 82385 bus cycle
can be tracked by BADS# and BREADY#.


The “N” value in the timing diagrams is the assumed
number of main memory wait states inserted in a
non-pipelined 82385 bus cycle. For example, a non-
pipelined access to N = 2 memory requires a total of
four bus states, while a pipelined access requires
three. (The pipeline advantage effectively hides one
main memory wait state.)
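

The N = 2 example can be generalized into a one-line rule. The text states only the N = 2 case, so the general form below (two base states plus N wait states, with pipelining hiding one state) is an assumption extrapolated from that example; the function name bus_states is hypothetical.

```python
def bus_states(n_wait_states, pipelined=False):
    """Total 82385 bus states for one access to memory with N wait states."""
    # Non-pipelined: two base states plus N wait states.
    total = 2 + n_wait_states
    # Pipelining effectively hides one main memory wait state.
    return total - 1 if pipelined else total
```

With N = 2 this reproduces the figures quoted above: four states non-pipelined and three pipelined.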


4.1 PROCESSOR INTERFACE 


This section presents the 386 DX CPU/82385 hard-
ware interface and discusses the interaction and
timing of this interface. Also addressed is how to
decode the 386 DX address bus to generate the
82385 inputs LBA#, NCA#, and X16#. (Recall that
LBA# allows memory and/or I/O space to be set
aside for 386 DX local bus resources; NCA# allows
system memory to be set aside as non-cacheable;
and X16# allows system memory and/or I/O space
to be reserved for 16-bit resources.) Finally, the
82385’s handling of 16-bit space is discussed.


4.1.1 Hardware Interface 


Figure 4-1 is a diagram of a 386 DX CPU/82385
system, which can be thought of as three distinct 
interfaces. The first is the 386 DX CPU/82385 inter- 
face (including the Ready Logic). The second is the 
cache interface, as depicted by the cache control 
bus in the upper left corner of Figure 4-1. The third is 
the 82385 bus interface, which includes both direct 
connects and signals that control the 74374 ad- 
dress/cycle definition latch and 74646 latching data 
transceiver. (The 82385 bus interface is the subject 
of the next chapter.) 


As seen in Figure 4-1, the 386 DX CPU/82385 inter- 
face is a straightforward connection. The only nec- 
essary support logic is that required to sum all ready 
sources.


4.1.2 Ready Generation


Note in Figure 4-1 that the ready logic consists of
two gates. The upper three-input AND gate (shown
as a negative logic OR) sums all 386 DX local bus
ready sources. One such source is the 82385
READYO# output, which terminates read hits and
posted writes. The output of this gate drives the 386
DX READY# input and is monitored by the 82385
(via READYI#) to track the 386 DX bus state.


When the 82385 forwards a 386 DX read cycle to
the 82385 bus (cache read miss or non-cacheable
read), it does not directly terminate the cycle via
READYO#. Instead, the 386 DX and 82385 bus cy-
cles are concurrently terminated by a system ready
source. This is the purpose of the additional two-in-
put OR gate (negative logic AND) in Figure 4-1.
When the 82385 forwards a read to the 82385 bus, it
asserts BRDYEN# which enables the system ready
signal (BREADY#) to directly terminate the 386 DX
bus cycle. 
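

The two-gate ready logic can be modeled with ordinary Boolean operators on active-low levels (0 = asserted). This is a sketch under assumptions: the function and argument names are hypothetical, and the third AND input is taken to be one other local ready source (for example a coprocessor), which Figure 4-1 may assign differently.

```python
def ready_n(readyo_n, other_local_ready_n, bready_n, brdyen_n):
    """Level driven onto the 386 DX READY# input (0 = cycle terminated)."""
    # Two-input OR gate (negative logic AND): low only when BREADY# is
    # asserted while BRDYEN# enables it onto the 386 DX local bus.
    gated_system_ready_n = bready_n | brdyen_n
    # Three-input AND gate (negative logic OR): low when any ready source is low.
    return readyo_n & other_local_ready_n & gated_system_ready_n
```

A read hit terminates through READYO# alone, a forwarded read terminates only when BREADY# and BRDYEN# are both low, and a posted write completing on the 82385 bus (BREADY# low, BRDYEN# high) leaves READY# negated.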


Figures 4-2A and 4-2B illustrate the behavior of the
signals involved in ready generation. Note in cycle 1
of Figure 4-2A that the 82385 READYO# directly
terminates the hit cycle. In cycle 2, READYO# is not
activated. Instead the 82385 BRDYEN# is activated
in BT2, BT2P, or BT2I states such that BREADY#
can concurrently terminate the 386 DX and 82385
bus cycles (frame 6). Cycle 3 is a posted write. The
write data becomes available in T1P (frame 7), and
the address, data, and cycle definition of the write
are latched in T2 (frame 8). The 386 DX cycle is
terminated by READYO# in frame 8 with no wait
states. The 82385, however, sees the write cycle
through to completion on the 82385 bus where it is
terminated in frame 10 by BREADY#. In this case,
the BREADY# signal is not gated through to the
386 DX. Refer to Figures 4-2A and 4-2B for clarifi-
cation.


4.1.3 NA# and 386 DX Local Bus
Pipelining


Cycle 1 of Figure 4-2A is a typical cache read hit. 
The 386 DX address becomes available in T1, and 
the 82385 uses this address to determine if the ref- 
erenced data resides in the cache. The cache look- 
up is completed and the cycle qualified as a hit or 
miss in T1. If the data resides in the cache, the 
cache is directed to drive the 386 DX data bus, and 
the 82385 drives its READYO# output so the cycle 
can be terminated at the end of the first T2 with no 
wait states. 


Although cycle 2 starts out like cycle 1, at the end of
T1 (frame 3), it is qualified as a miss and forwarded
to the 82385 bus. The 82385 bus cycle begins one
state after the 386 DX bus cycle, implying a one wait
state overhead associated with cycle 2 due to the
look-up. When the 82385 encounters the miss, it im-
mediately asserts NA#, which puts the 386 DX into
pipelined mode. Once in pipelined mode, the 82385
is able to qualify a 386 DX cycle using the 386 DX
pipelined address and control signals. The result is
that the cache look-up state is hidden in all but the
first of a contiguous sequence of read misses. This
is shown in the first two cycles, both read misses, of
Figure 4-2B. The CPU sees the look-up state in the
first cycle, but not in the second. In fact, the second
miss requires a total of only two states, as not only
does 386 DX pipelining hide the look-up state, but
system pipelining hides one of the main memory
wait states. (System level pipelining via BNA# is dis-
cussed in the next chapter.) Several characteristics
of the 82385’s pipelining of the 386 DX are as fol-
lows:


— The above discussion applies to all system 
reads, not just cache read misses. 


— The 82385 provides the fastest possible switch
to pipelining, T1-T2-T2P. The exception to this is
when a system read follows a posted write. In
this case, the sequence is T1-T2-T2-T2P. (Refer
to cycle 4 of Figure 4-2A.) The number of T2
states is dependent on the number of main
memory wait states.

— Refer to the read hit in Figure 4-2A (cycle 1), and
note that NA# is actually asserted before the
end of T1, before the hit/miss decision is made.
This is of no consequence since even though
NA# is sampled active in T2, the activation of
READYO# in the same T2 renders NA# a
“don’t care”. NA# is asserted in this manner to
meet 386 DX timing requirements and to ensure
the fastest possible switch to pipelined mode.

— All read hits and the majority of writes can be
serviced by the 82385 with zero wait states in
non-pipelined mode, and the 82385 accordingly
attempts to run all such cycles in non-pipelined
mode. An exception is seen in the hit cycles (cy-
cles 3 and 4) of Figure 4-2B. The 82385 does not
know soon enough that cycle 3 is a hit, and thus
sustains the pipeline. The result is that three se-
quential hits are required before the 386 DX is
totally out of pipelined mode. (The three hits look
like T1P-T2P, T1P-T2, T1-T2.) Note that this
does not occur if the number of main memory
wait states is equal to or greater than two.


Figure 4-2B. READYO#, BRDYEN#, and NA# (N = 1)


As far as the design is concerned, NA# is generally
tied directly to the 386 DX NA# input. However, oth-
er local NA# sources may be logically “AND”ed
with the 82385 NA# output if desired. It is essential,
however, that no device other than the 82385 drive
the 386 DX NA# input unless that device resides on
the 386 DX local bus in space decoded via LBA#. If
desired, the 82385 NA# output can be ignored and
the 386 DX NA# input tied high. The 386 DX NA#
input should never be tied low, which would always
keep it active.


4.1.4 LBA#, NCA#, and X16#
Generation


The 82385 input signals LBA#, NCA# and X16#
are generated by decoding the 386 DX address
(A2-A31) and cycle definition (W/R#, D/C#,
M/IO#) lines. The 82385 samples them at the end
of the first state in which they become available,
which is either T1 or the first T2P cycle. The decode
configuration and timings are illustrated respectively
in Figures 4-3A and 4-3B.


B. Decode Timing


Figure 4-3. NCA#, LBA#, X16# Generation 
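

A behavioral sketch of such a decoder is shown below. Everything specific in it is hypothetical: the three address ranges are invented purely for illustration and do not come from this data sheet, the names decode and in_ranges are assumptions, and only memory addresses are considered (a real decoder would also examine M/IO#, D/C#, and W/R#). Outputs are active low (0 = asserted).

```python
# Hypothetical address map, chosen only to illustrate the decode.
LOCAL_BUS_RANGES = [(0x000A0000, 0x000BFFFF)]      # 386 DX local bus resource
NON_CACHEABLE_RANGES = [(0x000C0000, 0x000FFFFF)]  # memory set aside as non-cacheable
SIXTEEN_BIT_RANGES = [(0x000D0000, 0x000DFFFF)]    # 16-bit resources


def in_ranges(addr, ranges):
    """True when addr falls inside any (low, high) inclusive range."""
    return any(low <= addr <= high for low, high in ranges)


def decode(addr):
    """Return active-low levels for LBA#, NCA#, and X16# for a memory address."""
    return {
        "LBA#": 0 if in_ranges(addr, LOCAL_BUS_RANGES) else 1,
        "NCA#": 0 if in_ranges(addr, NON_CACHEABLE_RANGES) else 1,
        "X16#": 0 if in_ranges(addr, SIXTEEN_BIT_RANGES) else 1,
    }
```

In hardware the same function is purely combinational, since the 82385 samples its outputs at the end of T1 or the first T2P.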


4.1.5 82385 Handling of 16-Bit Space


As discussed previously, the 82385 does not cache
devices decoded as 16-bit. Instead it makes provi-
sion to accommodate 16-bit space as non-cache-
able via the X16# input. X16# is generated when
the user decodes the 386 DX address and cycle def-
inition lines for the BS16# input of the 386 DX (Fig-
ure 4-3). The decode output now drives both the 386
DX BS16# input and the 82385 X16# input. Cycles
decoded this way are treated as non-cacheable.
They are forwarded to and executed on the 82385
bus, but have no impact on the cache or cache di-
rectory. The 82385 also monitors the 386 DX byte
enables in a 16-bit cycle to see if an additional cycle
is required to complete the transfer. Specifically, a
second cycle is required if (BE0# OR BE1#) AND
(BE2# OR BE3#) is asserted in the current cycle.
The 82385, like the 386 DX, will not allow the two
halves of a 16-bit transfer to be interrupted by anoth-
er master. There is an important distinction between
the handling of 16-bit space in a 386 DX system with
an 82385 as compared to a system without an
82385. The 386 DX BS16# input need not be as-
serted until the last state of a 16-bit cycle for the 386
DX to recognize it as such. The 82385, however,
needs the information earlier, specifically at the end
of the first 386 DX bus state (T1 or first T2P) in
which the address of the 16-bit cycle becomes avail-
able. The result is that in a system without an 82385,
16-bit devices can define themselves as 16-bit de-
vices “on the fly”, while in a system with an 82385,
16-bit devices should be located in space set aside
for 16-bit devices via the X16# decode.
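

The second-cycle condition can be written directly as a Boolean expression. In this sketch the function name needs_second_cycle is hypothetical, and each argument element is True when the corresponding byte enable is asserted (driven low on the pin).

```python
def needs_second_cycle(be_asserted):
    """True when a 16-bit cycle needs a second cycle to complete the transfer.

    be_asserted: four booleans for BE0#..BE3#, True meaning asserted.
    """
    be0, be1, be2, be3 = be_asserted
    # A byte is requested in the lower word AND in the upper word.
    return bool((be0 or be1) and (be2 or be3))
```

A full doubleword access, or a word access that straddles bytes 1 and 2, requires two cycles; an access confined to either word does not.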


4.2 CACHE INTERFACE 


The following is a description of the external data 
cache and 82385 cache interface. 


4.2.1 Cache Configurations 


The 82385 controls the cache memory via the con- 
trol signals shown in Figure 4-1. These signals drive 
one of four possible cache configurations, as depict- 
ed in Figures 4-4A through 4-4D. Figure 4-4A shows 
a direct mapped cache organized as 8K double-
words. The likely design choice is four 8K x 8
SRAMs. Figure 4-4B depicts the same cache memo-
ry but with a data transceiver between the cache
and 386 DX data bus. In this configuration, CT/R#
controls the transceiver direction, and COEA# drives
the transceiver output enable. (COEB# could also be
used, and DEFOE# is strapped high.) A data buffer
is required if the chosen SRAM does not have a 
separate output enable. Additionally, buffers may be 
used to ease SRAM timing requirements or in a sys- 
tem with a heavily loaded data bus. (Guidelines for 
SRAM selection are included in Chapter 6.) 


Figure 4-4C depicts a two-way set associative cache 
organized as two banks (A and B) of 4K double-
words each. The likely design choice is sixteen
4K x 4 SRAMs. Finally, Figure 4-4D depicts the two-
way organization with data buffers between the
cache memory and data bus.


Figure 4-4B. Direct Mapped Cache with Data Buffers


4.2.2 Cache Control—Direct Mapped


Figure 4-5A illustrates the timing of cache read and
write hits, while Figure 4-5B illustrates cache up-
dates. In a read hit, the cache output enables are
driven from the beginning of T2 (cycle 1 of Figure 
4-5A). If at the end of T1 the cycle is qualified as a 
cacheable read, the output enables are asserted on 
the assumption that the cycle will be a hit. (Driving 
the output enables before the actual hit/miss deci- 
sion is made eases SRAM timing requirements.) 


Cycle 1 of Figure 4-5B illustrates what happens
when the assumption of a hit turns out to be wrong.
Note that the output enables are asserted at the be-
ginning of T2, but then disabled at the end of T2.
Once the output enables are inactive, the 82385
turns the transceiver around (via CT/R#) and drives
the write enables to begin the cache update cycle.
Note in Figure 4-5B that once the 386 DX is in pipe-
lined mode, the output enables need not be driven
prior to a hit/miss decision, since the decision is
made earlier via the pipelined address information.


One consequence of driving the output enables low
in a miss before the hit/miss decision is made is that
since the cache starts driving the 386 DX data bus,
the 82385 cannot enable the 74646 transceiver (Fig-
ure 4-1) until after the cache outputs are disabled.
(The timing of the 74646 control signals is described
in the next chapter.) The result is that the 74646
cannot be enabled soon enough to support N = 0
main memory (“N” was defined in section 4.0 as the
number of non-pipelined main memory wait states).
This means that memory which can run with zero
wait states in a non-pipelined cycle should not be
mapped into cacheable memory. This should not
present a problem, however, as a main memory sys-
tem built with N = 0 memory has no need of a cache.
(The main memory is as fast as the cache.) Zero
wait state memory can be supported if it is decoded
as non-cacheable. The 82385 knows that a cycle is
non-cacheable in time not to drive the cache output
enables, and can thus enable the 74646 sooner.


Figure 4-5A. Cache Read and Write Cycles—Direct Mapped (N = 1)


In a write hit, the 82385 only updates the cache
bytes that are meant to be updated as directed by
the 386 DX byte enables. This prevents corrupting
cache data in partial doubleword writes. Note in Fig-
ure 4-5A that the appropriate bytes are selected via
the cache byte select lines CS0#-CS3#. In a read
hit, all four select lines are driven as the 386 DX will
simply ignore data it does not need. Also, in a cache
update (read miss), all four selects are active in or-
der to update the cache with a complete line (dou-
bleword).


Figure 4-5B. Cache Update Cycles—Direct Mapped (N = 1)


4.2.3 Cache Control—Two-Way Set
Associative


Figures 4-6A and 4-6B illustrate the timing of cache 
read hits, write hits, and updates for a two-way set 
associative cache. (Note that the cycle sequences 
are the same as those in Figures 4-5A and 4-5B.) In 
a cache read hit, only one bank or the other is en-
abled to drive the 386 DX data bus, so unlike the
control of a direct mapped cache, the appropriate
cache output enable cannot be driven until the out-
come of the hit/miss decision is known. (This im-
plies stricter SRAM timing requirements for a two-
way set associative cache.) In write hits and read
misses, only one bank or the other is updated.


4.3 387™ DX INTERFACE


The 387 DX Math Coprocessor interfaces to the 386
DX just as it would in a system without an 82385.
The 387 DX READYO# output is logically “AND”ed
along with all other 386 DX local bus ready sources
(Figure 4-1), and the output is fed to the 387 DX
READY#, 82385 READYI#, and 386 DX READY#
inputs.


The 386 DX uniquely addresses the 387 DX by driv-
ing M/IO# low and A31 high. The 82385 decodes
this internally and treats 387 DX accesses in the
same way it treats 386 DX cycles in which LBA# is
asserted: it ignores them.
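

The coprocessor decode is simple enough to state as code. The sketch below is illustrative; the function names are hypothetical, signal levels are passed as 0 or 1, and LBA# is active low.

```python
def is_387_access(m_io, a31):
    """True when the 386 DX is addressing the 387 DX (M/IO# low, A31 high)."""
    return m_io == 0 and a31 == 1


def ignored_by_82385(m_io, a31, lba_n):
    """True when the 82385 ignores the cycle: a 387 DX access or LBA# asserted."""
    return is_387_access(m_io, a31) or lba_n == 0
```

An ordinary memory cycle with LBA# negated is therefore the only case in this model that the 82385 services or forwards.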


N = Number of non-pipelined main memory wait states. Must be greater than zero.


Figure 4-6A. Cache Read and Write Cycles—Two Way Set Associative (N = 1)


Figure 4-6B. Cache Update Cycles—Two Way Set Associative (N = 1)


5.0 82385 LOCAL BUS AND SYSTEM
INTERFACE


The 82385 system interface is the 82385 Local Bus,
which presents a “386 DX-like” front end to the
system. The system ties to it just as it would to a 386
DX. Although this 386 DX-like front end is function-
ally equivalent to a 386 DX, there are timing differ-
ences which can easily be accounted for in a system
design.


The following is a description of the 82385 system 
interface. After presenting the 82385 bus state ma- 
chine, the 82385 bus signals are described, as are 
techniques for accommodating any differences be- 
tween the 82385 bus and 386 DX bus. Following this 
is a discussion of the 82385’s condition upon reset. 


5.1 THE 82385 BUS STATE MACHINE 


5.1.1 Master Mode 


Figure 5-1A illustrates the 82385 bus state machine
when the 82385 is programmed in master mode.
Note that it is almost identical to the 386 DX bus
state machine, only the bus states are 82385 bus
states (BT1P, BTH, etc.) and the state transitions
are conditioned by 82385 bus inputs (BNA#,
BHOLD, etc.). Whereas a “pending request” to the
386 DX state machine indicates that the 386 DX ex-
ecution or prefetch unit needs bus access, a pend-
ing request to the 82385 state machine indicates
that a 386 DX bus cycle needs to be forwarded to
the system (read miss, non-cacheable read, write,
etc.). The only difference between the state ma-
chines is that the 82385 does not implement a direct
BT1P-BT2P transition. If BNA# is asserted in
BT1P, the resulting state sequence is BT1P-BT2I-
BT2P. The 82385’s ability to sustain a pipeline is not
affected by the lack of this state transition.


Figure 5-1A. 82385 Local Bus State Machine—Master Mode


Figure 5-1B. 82385 Local Bus State Machine—Slave Mode


5.1.2 Slave Mode 


The 82385’s slave mode state machine (Figure
5-1B) is similar to the master mode machine except
that now transitions are conditioned by BHLDA rath-
er than BHOLD. (Recall that in slave mode, the roles
of BHOLD and BHLDA are reversed from their mas-
ter mode roles.) Figure 5-2 clarifies slave mode state
machine operation. Upon reset, a slave mode 82385
enters the BTH state. When the 386 DX of the slave
82385 subsystem has a cycle that needs to be for-
warded to the system, the 82385 moves to BTI and
issues a hold request via BHOLD. It is important to
note that a slave mode 82385 does not drive the bus
in a BTI state. When the master or bus arbiter re-
turns BHLDA, the slave 82385 enters BT1 and runs
the cycle. When the cycle is completed, and if no
additional requests are pending, the 82385 moves
back to BTH and disables BHOLD.


If, while a slave 82385 is running a cycle, the master
or arbiter drops BHLDA (Figure 5-2B), the 82385 will 
complete the current cycle, move to BTH and re- 
move the BHOLD request. If the 82385 still had cy- 
cles to run when it was kicked off the bus, it will 
immediately assert a new BHOLD and move to BTI 
to await bus acknowledgement. Note, however, that 
it will only move to BTI if BHLDA is negated, ensur- 
ing that the handshake sequence is completed. 


There are several cases in which a slave 82385 will 
not immediately release the bus if BHLDA is 
dropped. For example, if BHLDA is dropped during a 
BT2P state, the 82385 has already committed to the 
next system bus pipelined cycle and will execute it 
before releasing the bus. Also, the 82385 will com- 
plete the second half of a two-cycle 16-bit transfer, 
or will complete a sequence of locked cycles before 
releasing the bus. This should not present any prob- 
lems, as a properly designed arbiter will not assume 
that the 82385 has released the bus until it sees 
BHOLD become inactive. 
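

The basic slave mode handshake can be sketched as a coarse transition function. This is a simplified model under stated assumptions, not the state machine of Figure 5-1B: all of the cycle states (BT1, BT2, and so on) are collapsed into a single hypothetical "RUN" state that advances once per completed bus cycle, and the exceptions described above (a committed pipelined cycle, the second half of a 16-bit transfer, locked sequences) are not modeled. The function name next_state is hypothetical.

```python
def next_state(state, pending, bhlda):
    """Return (next_state, bhold) for a slave mode 82385.

    state: "BTH", "BTI", or "RUN" (RUN stands in for the cycle states).
    pending: True when a 386 DX cycle needs to be forwarded to the system.
    bhlda: True when the master or arbiter is asserting BHLDA.
    """
    if state == "BTH":
        # A new request is issued only once BHLDA is negated,
        # so that the previous handshake is known to be complete.
        if pending and not bhlda:
            return "BTI", True
        return "BTH", False
    if state == "BTI":
        # Wait, without driving the bus, until the request is granted.
        return ("RUN", True) if bhlda else ("BTI", True)
    if state == "RUN":
        # The current cycle always completes; continue only if still granted.
        if pending and bhlda:
            return "RUN", True
        return "BTH", False
    raise ValueError(f"unknown state: {state}")
```

Note how a slave that loses BHLDA with work remaining passes through BTH with BHOLD removed before it can request the bus again.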


A. Normal Slave Mode Sequence

B. Sequence of Events if Master or Arbiter Drops BHLDA

Figure 5-2. BHOLD/BHLDA—Slave Mode


5.2 THE 82385 LOCAL BUS


The 82385 bus can be broken up into two groups of
signals: those which have direct 386 DX counter-
parts, and additional status and control signals pro-
vided by the 82385. The operation and interaction of
all 82385 bus signals are depicted in Figures 5-3A
through 5-3L for a wide variety of cycle sequences.
These diagrams serve as a reference for the 82385
bus discussion and provide insight into the dual bus
operation of the 82385.


5.2.1 82385 Bus Counterparts to
386 DX Signals


The following sections discuss the signals presented 
on the 82385 local bus which are functional equiva- 
lents to the signals present at the 386 DX local bus. 


5.2.1.1 ADDRESS BUS (BA2-BA31) AND
CYCLE DEFINITION SIGNALS
(BM/IO#, BD/C#, BW/R#)


These signals are not driven directly by the 82385, 
but rather are the outputs of the 74374 address/cy- 
cle definition latch. (Refer to Figure 4-1 for the hard- 
ware interface.) This latch is controlled by the 82385 
BACP and BAOE# outputs. The behavior and timing
of these outputs and the latch they control (typically
F or AS series TTL) ensure that BA2-BA31,
BM/IO#, BW/R#, and BD/C# are compatible in 
timing and function to their 386 DX counterparts. 


The behavior of BACP can be seen in Figure 5-3B, 
where the rising edge of BACP latches and forwards 
the 386 DX address and cycle definition signals in a 
BT1 or first BT2P state. However, the 82385 need 
not be the current bus master to latch the 386 DX 
address, as evidenced by cycle 4 of Figure 5-3A. In
this case, the address is latched in frame 8, but not
forwarded to the system (via BAOE#) until frame
10. (The latch and output enable functions of the 
74374 are independent and invisible to one 
another.) 


Note that in frames 2 and 6 the BACP pulses are 
marked “False.’’ The reason is that BACP is issued 
and the address latched before the hit/miss deter- 
mination is made. This ensures that should the cycle 
be a miss, the 82385 bus can move directly into BT1 
without delay. In the case of a hit, the latched ad- 
dress is simply never qualified by the assertion of 
BADS#. The 82385 bus stays in BTI if there is no
access pending (new cycle is a hit) and no bus activ- 
ity. It will move to and stay in BT2I if the system has 
requested a pipelined cycle and the 82385 does not 
have a pending bus access (new cycle is a hit). 


5.2.1.2 DATA BUS (BD0-BD31)


The 82385 data bus is the system side of the 74646
latching transceiver. (See Figure 4-1.) This device is
controlled by the 82385 outputs LDSTB, DOE#, and
BT/R#. LDSTB latches data in write cycles, DOE#
enables the transceiver outputs, and BT/R# controls
the transceiver direction. The interaction of
these signals and the transceiver is such that
BD0-BD31 behave just like their 386 DX counterparts.
The transceiver is configured such that data flow in
write cycles (A to B) is latched, and data flow in read
cycles (B to A) is flow-through.


Although BD0-BD31 function just like their 386 DX
counterparts, there is a timing difference that must
be accommodated in a system design. As mentioned
above, the transceiver is transparent during
read cycles, so the transceiver propagation delay
must be added to the 386 DX data setup. In addition,
the cache SRAM setup must be accommodated
in cache read miss cycles.


For non-cacheable reads the data setup is given by:


    Min BD0-BD31 Read Data Setup
        = 386 DX Min Data Setup
        + 74646 B-to-A Max Propagation Delay




The required BD0-BD31 setup in a cache read miss
is given by:


    Min BD0-BD31 Read Data Setup
        = 74646 B-to-A Max Propagation Delay
        + Cache SRAM Min Write Setup
        + One CLK2 Period
        - 82385 CWEA# or CWEB# Min Delay


If a data buffer is located between the 386 DX data
bus and the cache SRAMs, then its maximum
propagation delay must be added to the above formula as
well. A design analysis should be completed for every
new design to determine actual margins.
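

The two setup relations above can be evaluated numerically during a design analysis. The following is a minimal sketch in Python and is not part of the data sheet: the function names, the optional buffer term as a parameter, and every nanosecond value in the usage lines are illustrative assumptions, not specified values. Real figures must come from the A.C. specifications of the parts actually used.

```python
def noncacheable_read_setup_ns(cpu_setup_min, xcvr_b_to_a_max):
    """Min BD0-BD31 read data setup for a non-cacheable read (ns)."""
    return cpu_setup_min + xcvr_b_to_a_max


def cache_read_miss_setup_ns(xcvr_b_to_a_max, sram_write_setup_min,
                             clk2_period, cwe_delay_min,
                             buffer_delay_max=0.0):
    """Min BD0-BD31 read data setup for a cache read miss (ns).

    buffer_delay_max covers an optional data buffer between the
    386 DX data bus and the cache SRAMs (0 if none is fitted).
    """
    return (xcvr_b_to_a_max + sram_write_setup_min + clk2_period
            - cwe_delay_min + buffer_delay_max)


# Placeholder numbers only (assumed, not taken from any data sheet).
print(noncacheable_read_setup_ns(10.0, 9.0))
print(cache_read_miss_setup_ns(9.0, 15.0, 25.0, 4.0))
print(cache_read_miss_setup_ns(9.0, 15.0, 25.0, 4.0, buffer_delay_max=7.0))
```

Comparing the two results shows how much extra setup a cache read miss needs, which is the quantity a designer trades against faster DRAMs or an added wait state.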


A design can accommodate the increased data set- 
up by choosing appropriately fast main memory 
DRAMs and data buffers. Alternatively, a designer 
may deal with the longer setup by inserting an extra 
wait state into cache read miss cycles. If an addition- 
al state is to be inserted, the system bus controller 
should sample the 82385 MISS# output to distin- 
guish read misses from cycles that do not require 
the longer setup. Tips on using the 82385 MISS# 
signal are presented later in this chapter. 


The behavior of LDSTB, DOE#, and BT/R# can be
understood via Figures 5-3A through 5-3L. Note that
in cycle 1 of Figure 5-3A (a non-cacheable system
read), DOE# is activated midway through BT1, but
in cycle 1 of Figure 5-3B (a cache read miss), DOE#
is not activated until midway through BT2. The reason
is that in a cacheable read cycle, the cache
SRAMs are enabled to drive the 386 DX data bus
before the outcome of the hit/miss decision (in
anticipation of a hit). In cycle 1 of Figure 5-3B, the
assertion of DOE# must be delayed until after the
82385 has disabled the cache output buffers. The
result is that N=0 main memory should not be
mapped into the cache.


5.2.1.3 BYTE ENABLES (BBE0#-BBE3#)


These outputs are driven directly by the 82385, and
are completely compatible in timing and function
with their 386 DX counterparts. When a 386 DX cycle
is forwarded to the 82385 bus, the 386 DX byte
enables are duplicated on BBE0#-BBE3#. The
one exception is a cache read miss, during which
BBE0#-BBE3# are all active regardless of the
status of the 386 DX byte enables. This ensures that
the cache is updated with a valid 32-bit entry.
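

The byte enable rule described above can be expressed as a small behavioral sketch. This is an illustrative Python model only, not a description of the 82385's internal logic; the function name and the 4-bit encoding (bit n stands for BBEn#, active low, so 0 means the byte is enabled) are assumptions made for the example.

```python
ALL_BYTES_ACTIVE = 0b0000  # active-low encoding: 0 = byte lane enabled


def bbe_outputs(cpu_be_n, is_cache_read_miss):
    """Return the 4-bit BBE0#-BBE3# value for a forwarded cycle.

    cpu_be_n: 386 DX BE0#-BE3# as a 4-bit active-low value.
    A cache read miss forces all four lanes active so that a full
    32-bit entry is fetched; every other cycle copies the CPU value.
    """
    if is_cache_read_miss:
        return ALL_BYTES_ACTIVE
    return cpu_be_n & 0xF


# A byte read of lane 0 only (BE0# low, BE1#-BE3# high):
print(format(bbe_outputs(0b1110, False), "04b"))  # forwarded unchanged
print(format(bbe_outputs(0b1110, True), "04b"))   # widened on a read miss
```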




5.2.1.4 ADDRESS STATUS (BADS#)


BADS# is identical in function and timing to its 386
DX counterpart. It is asserted in BT1 and BT2P
states, and indicates that valid address and cycle
definition (BA2-BA31, BBE0#-BBE3#, BM/IO#,
BW/R#, BD/C#) information is available on the
82385 bus.


5.2.1.5 READY (BREADY#)


The 82385 BREADY# input terminates 82385 bus
cycles just as the 386 DX READY# input terminates
386 DX bus cycles. The behavior of BREADY# is
the same as that of READY#, but note in the A.C.
timing specifications that a cache read miss requires
a longer BREADY# setup than do other cycles. This
must be accommodated in the ready logic design.


5.2.1.6 NEXT ADDRESS (BNA#) 


BNA# is identical in function and timing to its 386
DX counterpart. Note that in Figures 5-3A through
5-3L, BNA# is assumed asserted in every BT1P or
first BT2 state. Along with the 82385's pipelining of
the 386 DX, this ensures that the timing diagrams
accurately reflect the full pipelined nature of the dual
bus structure.


5.2.1.7 BUS LOCK (BLOCK#)


The 386 DX flags a locked sequence of cycles by
asserting LOCK#. During a locked sequence, the
386 DX does not acknowledge hold requests, so the
sequence executes without interruption by another
master. The 82385 forces all locked 386 DX cycles
to run on the 82385 bus (unless LBA# is active),
regardless of whether or not the referenced location
resides in the cache. In addition, a locked sequence
of 386 DX cycles is run as a locked sequence on the
82385 bus; BLOCK# is asserted and the 82385
does not allow the sequence to be interrupted.
Locked writes (hit or miss) and locked read misses
affect the cache and cache directory just as their
unlocked counterparts do. A locked read hit, however,
is handled differently. The read is necessarily
forced to run on the 82385 local bus, and as the
data returns from main memory, it is "re-copied" into
the cache. (See Figure 5-3L.) The directory is not
changed as it already indicates that this location
exists in the cache. This activity is invisible to the
system and ensures that semaphores are properly
handled.


BLOCK# is asserted during locked 82385 bus cycles
just as LOCK# is asserted during locked 386
DX cycles. The BLOCK# maximum valid delay,
however, differs from that of LOCK#, and this must
be accounted for in any circuitry that makes use of
BLOCK#. The difference is due to the fact that
LOCK#, unlike the other 386 DX cycle definition
signals, is not pipelined. The situation is clarified in
Figure 5-3K. In cycle 2 the state of LOCK# is not
known before the corresponding system read starts
(Frames 4 and 5). In this case, LOCK# is asserted
at the beginning of T1P, and the delay for BLOCK#
to become active is the delay of LOCK# from the
386 DX plus the propagation delay through the
82385. This occurs because T1P and the
corresponding BT1P are concurrent (Frame 5). The result
is that BLOCK# should not be sampled at the end
of BT1P. The first appropriate sampling point is
midway through the next state, as shown in Frame 6. In
Figure 5-3L, the maximum delay for BLOCK# to
become valid in Frame 4 is the same as the maximum
delay for LOCK# to become valid from the 386 DX.
This is true since the pipelining issue discussed
above does not occur.


The 82385 negates BLOCK# after BREADY# of the
last 82385 locked cycle has been asserted and
LOCK# has gone inactive. This means that in a
sequence of cycles that begins with an 82385
locked cycle and continues with any of the possible
locked cycles (other 82385 cycles, idles, and local
cycles), the 82385 keeps BLOCK# active for as long
as LOCK# is continuously active.
Another implication is that when a locked posted write
cycle is followed by a non-locked sequence, BLOCK#
is negated one CLK after BREADY# of the write
cycle. When any other 82385 locked cycle is followed
by a non-locked sequence, BLOCK# is negated one
CLK after LOCK# is negated, which occurs two
CLKs after BREADY# is asserted. In this last case,
the active period of BLOCK# extends one CLK into
the non-locked sequence.


The arbitration rules of Locked Cycles are: 


MASTER MODE: 


The BHOLD input signal is ignored when BLOCK# or
the internal lock (16-bit non-aligned cycle) is active.
The BHLDA output signal remains inactive, and the
BAOE# output signal remains active, during that
time interval.


SLAVE MODE: 


The 82385 does not relinquish the system bus if
BLOCK# or the internal lock is active. The BHOLD
output signal remains active when BLOCK# or the
internal lock is active plus one CLK. The BHLDA input
signal is ignored when BLOCK# or the internal lock
is active plus one CLK. This means the 82385 slave
does not respond to BHLDA inactivation. The
BAOE# output signal remains active during the
same time interval.


5.2.2 Additional 82385 Bus Signals 


The 82385 bus provides two status outputs and one
control input that are unique to cache operation and
thus have no 386 DX counterparts. The outputs are
MISS# and WBS, and the input is FLUSH.


5.2.2.1 CACHE READ/WRITE MISS
INDICATION (MISS#)


MISS# can be thought of as an extra 82385 bus
cycle definition signal similar to BM/IO#, BW/R#,
and BD/C#, that distinguishes cacheable read and
write misses from other cycles. MISS#, like the other
definition signals, becomes valid with BADS#
(BT1 or first BT2P). The behavior of MISS# is
illustrated in Figures 5-3B, 5-3C, and 5-3J. The 82385
floats MISS# when another master owns the bus,
allowing multiple 82385s to share the same node in
multi-cache systems. MISS# should thus be lightly
pulled up (~20 KΩ) to keep it negated during hold
(BTH) states.


MISS# can serve several purposes. As discussed
previously, the BD0-BD31 and BREADY# setup
times in a cache read miss are longer than in other
cycles. A bus controller can distinguish these cycles
by gating MISS# with BW/R#. MISS# may also
prove useful in gathering 82385 system performance
data.
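

The gating of MISS# with BW/R# can be sketched as a simple decode. The Python below is an illustrative model only: the function names and the one-state penalty are assumptions, signal levels are passed as 0 or 1 as seen on the pins (MISS# is active low, and BW/R# is high for writes and low for reads), and a real bus controller would implement this in PAL or gate logic sampled while BADS# is valid.

```python
def is_cache_read_miss(miss_n, bw_r):
    """True when MISS# is asserted (low) during a read (BW/R# low)."""
    return miss_n == 0 and bw_r == 0


def extra_wait_states(miss_n, bw_r, penalty=1):
    """Wait states to add for the longer read-miss setup (assumed policy)."""
    return penalty if is_cache_read_miss(miss_n, bw_r) else 0


# All four combinations of the two pins:
for miss_n in (0, 1):
    for bw_r in (0, 1):
        print(miss_n, bw_r, extra_wait_states(miss_n, bw_r))
```

Only the combination MISS# low, BW/R# low selects the longer setup; write misses and non-miss cycles run without the added state.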


5.2.2.2 WRITE BUFFER STATUS (WBS)


WBS is activated when 386 DX write cycle data is
latched into the 74646 latching transceiver (via
LDSTB). It is deactivated upon completion of the
write cycle on the 82385 bus when the 82385 sees
the BREADY# signal. WBS behavior is illustrated in
Figures 5-3F through 5-3J, and potential applications
are discussed in Chapter 3.
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Figure 5-3B. Consecutive CRDM Cycles—(N = 1) 
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- Figure 5-3C. SBRD, CRDM, SBRD—(N = 2) 
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Figure 5-3D. SBRD Cycles Interleaved with BTH States—(N = 1) 




[Timing diagram 290143-33. 386 DX cycle sequence: CRDH, SBRD, CRDH, SBRD. Rows: 386 DX bus state, 82385 bus state, CLK2, CLK, ADS#, BREADY#, BACP, DOE#, frame number.]


[Timing diagram 290143-34. 386 DX cycle sequence: SBRD, WT, SBRD, CRDH. Rows include 386 DX bus state, 82385 bus state, WBS, frame number.]


Figure 5-3F. SBRD, WT, SBRD, CRDH—(N = 1) 
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. Figure 5-3G. Interleaved WT/CRDH Cycles—(N = 1) 
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Figure 5-3J. Consecutive Write Cycles—(N = 1) 
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Figure 5-3K. LOCK #/BLOCK # in Non-Cacheable or Miss Cycles—(N = 1) 
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Figure 5-3L. LOCK #/BLOCK# in Cache Read Hit Cycle—(N = 1) 
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5.2.2.3 CACHE FLUSH (FLUSH) 
FLUSH is an 82385 input which is used to reset all
tag valid bits within the cache directory. The FLUSH
input must be kept active for at least 4 CLK (8 CLK2)
periods to complete the directory flush. FLUSH is
generally used in diagnostics but can also be used in
applications where snooping cannot guarantee
coherency.


5.3 BUS WATCHING (SNOOP) 
INTERFACE 


The 82385's bus watching interface consists of the
snoop address (SA2-SA31), snoop strobe
(SSTB#), and snoop enable (SEN) inputs. If masters
reside at the system bus level, then the SA2-SA31
inputs are connected to the system address
lines and SEN to the system bus memory write
command. SSTB# indicates that a valid address is
present on the system bus. Note that the snoop bus
inputs are synchronous, so care must be taken to
ensure that they are stable during their sample
windows. If no master resides beyond the 82385 bus
level, then the 82385 inputs SA2-SA31, SEN, and
SSTB# can respectively tie directly to BA2-BA31,
BW/R#, and BADS# of the other system bus master
(see Figure 5.5). However, it is recommended
that SEN be driven by the logical "AND" of BW/R#
and BM/IO# so as to prevent I/O writes from
unnecessarily invalidating cache data.
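

The recommended SEN gating reduces to a two-input AND. The short Python sketch below is illustrative only (the function name is an assumption); it tabulates SEN for the four combinations of BW/R# (high for writes) and BM/IO# (high for memory cycles), showing that only memory writes enable a snoop.

```python
def sen(bw_r, bm_io):
    """SEN as the logical AND of BW/R# and BM/IO# (levels 0 or 1)."""
    return bw_r & bm_io


labels = {(0, 0): "I/O read", (0, 1): "memory read",
          (1, 0): "I/O write", (1, 1): "memory write"}

for (bw_r, bm_io), name in labels.items():
    print(f"{name:12s} BW/R#={bw_r} BM/IO#={bm_io} SEN={sen(bw_r, bm_io)}")
```

Tying SEN to BW/R# alone would also assert it for I/O writes, which is the unnecessary invalidation the recommendation avoids.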
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When the 82385 detects a system write by another 
master and the conditions in Figure 5.4 are met: 
CLK2 PHI1 rising (CLK falling), BHLDA asserted, 
SEN asserted, SSTB# asserted, it internally latches 
SA2-SA31 and runs a cache look-up to see if the 
altered main memory location is duplicated in the 
cache. If yes (a snoop hit), the line valid bit associat- 
ed with that cache entry is cleared. An important 
feature of the 82385 is that even if the 386 DX is 
running zero wait state hits out of the cache, all 
snoops are serviced. This is accomplished by time 
multiplexing the cache directory between the 386 
DX address and the latched system address. If the 
SSTB# signal occurs during an 82385 comparison 
cycle (for the 386 DX), the 386 DX cycle has the 
highest priority in accessing the cache directory. 
This takes the first of the two 386 DX states. The 
other state is then used for the snoop comparison. 
This worst case example, depicted in Figure 5-4, 
shows the 386 DX running zero wait state hits on the 
386 DX local bus, and another master running zero 
wait state writes on the 82385 bus. No snoops are 
missed, and no performance penalty incurred. 
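

The snoop sequence described above (qualify the strobe, look up the latched address, clear the line valid bit on a hit) can be modelled in a few lines. The Python below is a behavioral sketch under stated assumptions, not the 82385's implementation: it assumes the direct mapped organization with A2-A4 selecting one of 8 lines, A5-A14 selecting one of 1024 sets, and A15-A31 forming the tag; the class and method names are hypothetical; and clock-phase sampling is not modelled.

```python
def split_address(addr):
    """Split a 32-bit address into (tag, set_index, line). Assumed layout."""
    line = (addr >> 2) & 0x7        # A2-A4: one of 8 lines in the set
    set_index = (addr >> 5) & 0x3FF  # A5-A14: one of 1024 sets
    tag = addr >> 15                 # A15-A31: 17-bit tag
    return tag, set_index, line


class Directory:
    def __init__(self):
        self.entries = {}  # set_index -> [tag, line_valid_bits]

    def fill(self, addr):
        """Record a read-miss update for the line containing addr."""
        tag, set_index, line = split_address(addr)
        entry = self.entries.get(set_index)
        if entry is None or entry[0] != tag:
            entry = [tag, 0]          # new tag: all line valid bits cleared
            self.entries[set_index] = entry
        entry[1] |= 1 << line

    def is_hit(self, addr):
        tag, set_index, line = split_address(addr)
        entry = self.entries.get(set_index)
        return (entry is not None and entry[0] == tag
                and bool((entry[1] >> line) & 1))

    def snoop(self, addr, bhlda, sen, sstb_n):
        """Clear the line valid bit on a qualified snoop hit.

        Returns True only if an entry was invalidated.
        """
        if not (bhlda and sen and sstb_n == 0):
            return False              # strobe not qualified: no look-up
        if not self.is_hit(addr):
            return False              # snoop miss: directory unchanged
        _, set_index, line = split_address(addr)
        self.entries[set_index][1] &= ~(1 << line)
        return True


d = Directory()
d.fill(0x12345678)
print(d.snoop(0x12345678, bhlda=1, sen=0, sstb_n=0))  # SEN low: ignored
print(d.snoop(0x12345678, bhlda=1, sen=1, sstb_n=0))  # qualified snoop hit
print(d.is_hit(0x12345678))                            # line now invalid
```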


5.4 RESET DEFINITION 


Table 5-1 summarizes the states of all 82385 outputs
during reset and initialization. A slave mode
82385 tri-states its "386 DX-like" front end. A master
mode 82385 emits a pulse stream on its BACP
output. As the 386 DX address and cycle definition
lines reach their reset values, this stream will latch
the reset values through to the 82385 bus.
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NOTES: 


*1. These states are induced by another System Bus master.
*2. SSTB# on the 82385 is tied directly to BADS# of the System Bus master.
*3. SEN on the 82385 is tied directly to BW/R# of the System Bus master.


Figure 5.4. Interleaved Snoop and 386 DX Accesses to the Cache Directory 
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Figure 5.5. Snooping Connections in a Multi Master Environment 
Table 5-1. Pin State During RESET and Initialization


[Table body illegible in the source. The surviving entries name the outputs NA#, CALEN, CWEA#-CWEB#, BT/R#, and LDSTB, with signal levels of High and Low.]


NOTE:
1. In Master Mode, BAOE# is low and BACP emits a pulse stream during reset. As the 386 DX address and cycle definition
signals reach their reset values, the pulse stream on BACP will latch these values through to the 82385 local bus.


6.0 82385 SYSTEM DESIGN 
CONSIDERATIONS 


6.1 INTRODUCTION 


This chapter discusses techniques which should be 
implemented in an 82385 system. Because of the 
high frequencies and high performance nature of the 
386 DX CPU/82385 system, good design and layout 
techniques are necessary. It is always recommend- 
ed to perform a complete design analysis on new 
system designs. 


6.2 POWER AND GROUNDING 


6.2.1 Power Connections 


The PGA 82385 utilizes 8 power (Vcc) and 10
ground (Vss) pins. The PQFP 82385 has 9 power
and 9 ground pins. All Vcc and Vss pins must be
connected to their appropriate plane. On a printed
circuit board, all Vcc pins must be connected to the
power plane and all Vss pins must be connected to
the ground plane.


6.2.2 Power Decoupling 


Although the 82385 itself is generally a “passive” 
device in that it has few output signals, the cache 
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subsystem as a whole is quite active. Therefore, 
many decoupling capacitors should be placed 
around the 82385 cache subsystem. 


Low inductance capacitors and interconnects are
recommended for best high frequency electrical
performance. Inductance can be reduced by shortening
circuit board traces between the decoupling capacitors
and their respective devices as much as possible.
Capacitors specifically for PGA packages are
also commercially available, for the lowest possible
inductance.


6.2.3 Resistor Recommendations 


Because of the dual bus structure of the 82385
subsystem (386 DX Local Bus and 82385 Local Bus),
each signal for which a pullup is recommended
belongs to one of the two buses. The following
sections discuss the signals for each bus.


6.2.3.1 386 DX LOCAL BUS 


For typical designs, the pullup resistors shown in Ta- 
ble 6-1 are recommended. This table correlates to 
Chapter 7 of the 386 DX Data Sheet. However, par- 
ticular designs may have a need to differ from the 
listed values. Design analysis is recommended to 
determine specific requirements. 


6.2.3.2 82385 LOCAL BUS 


Pullup resistor recommendations for the 82385 Local
Bus signals are shown in Table 6-2. Design analysis
is necessary to determine whether deviations from
the typical values given are needed.


Table 6-1. Recommended Resistor Pullups to
Vcc (386 DX Local Bus)


Pin and Signal    Pullup Value    Purpose

ADS#              20 KΩ ±10%      Lightly Pull ADS#
PGA E13                           Negated for 386 DX
PQFP 123                          Hold States

LOCK#             20 KΩ ±10%      Lightly Pull LOCK#
PGA F13                           Negated for 386 DX
PQFP 118                          Hold States




Table 6-2. Recommended Resistor Pullups to
Vcc (82385 Local Bus)


Signal and Pin    Pullup Value    Purpose

BADS#             20 KΩ ±10%      Lightly Pull BADS#
PGA N9                            Negated for 82385
PQFP 89                           Hold States

BLOCK#            20 KΩ ±10%      Lightly Pull BLOCK#
PGA P9                            Negated for 82385
PQFP 86                           Hold States

MISS#             20 KΩ ±10%      Lightly Pull MISS#
PGA N8                            Negated for 82385
PQFP 85                           Hold States


6.3 82385 SIGNAL CONNECTIONS 


6.3.1 Configuration Inputs 


The 82385 configuration signals (M/S#, 2W/D#,
DEFOE#) must be connected (pulled up) to the
appropriate logic level for the system design. There is
also a reserved 82385 input which must be tied to
the appropriate level. Refer to Table 6-3 for the
signals and their required logic level.


Table 6-3. 82385 Configuration
Inputs Logic Levels


Pin and Signal    Logic Level    Purpose

M/S#              High           Master Mode Operation
PGA B13           Low            Slave Mode Operation
PQFP 129

2W/D#             High           2-Way Set Associative
PGA D12           Low            Direct Mapped
PQFP 127

Reserved          High           Must be tied to Vcc via
PGA L14                          a pull-up for proper
PQFP 102                         functionality

DEFOE#            N/A            Define Cache Output
PGA A14                          Enables. Allows use of
PQFP 128                         any SRAM.


NOTE:
The listed 82385 pins which need to be tied high should
use a pull-up resistor in the range of 5 KΩ to 20 KΩ.
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6.3.2 CLK2 and RESET 


The 82385 has two inputs to which the 386 DX
CLK2 signal must be connected. One is labeled
CLK2 (82385 PGA pin C13, PQFP lead 126) and the
other is labeled BCLK2 (82385 PGA pin L13, PQFP
lead 103). These two inputs must be tied together on
the printed circuit board.


The 82385 also has two reset inputs. RESET (82385
PGA pin D13, PQFP lead 125) and BRESET (82385
PGA pin K12, PQFP lead 104) must be connected
together on the printed circuit board.


6.4 UNUSED PIN REQUIREMENTS 


For reliable operation, ALWAYS connect unused in- 
puts to a valid logic level. As is the case with most 
other CMOS processes, a floating input will increase 
the current consumption of the component and give 
an indeterminate state to the component. 


6.5 CACHE SRAM REQUIREMENTS 


The 82385 offers the option of using SRAMs with or 
without an output enable pin. This is possible by in- 
serting a transceiver between the SRAMs and the 
386 DX local data bus and strapping DEFOE# to 
the appropriate logic level for a given system config- 
uration. This transceiver may also be desirable in a 
system which has a very heavily loaded 386 DX lo- 
cal data bus. The following sections discuss the 
SRAM requirements for all cache configurations. 




6.5.1 Cache Memory without
Transceivers


As discussed in Section 3.2, the 82385 presents all
of the control signals necessary to access the cache
memory. The SRAM chip selects, write enables, and
output enables are driven directly by the 82385.
Table 6-4 lists the required SRAM specifications.
These specifications allow for zero margins. They
should be used as guides for the actual system
design.


6.5.2 Cache Memory With
Transceivers


To implement an 82385 subsystem using cache
memory transceivers, COEA# or COEB# must be
used as output enable signals for the transceivers
and DEFOE# must be appropriately strapped for
proper COEx# functionality (since the cache SRAM
transceivers must be enabled for writes as well as
reads). DEFOE# must be tied high when using
cache SRAM transceivers. In a 2-way set associative
organization, COEA# enables the transceiver
for bank A and COEB# enables the bank B
transceiver. A direct mapped cache may use either
COEA# or COEB# to enable the transceiver. Table
6-5 lists the required SRAM specifications. These
specifications allow for zero margin. They should be
used as guides for the actual system design.


Table 6-4. SRAM Specs for Non-Buffered Cache Memory 


SRAM Spec Requirements 
Direct Mapped 


po 


Read Cycle Requirements 
Address Access (MAX) 
Chip Select Access (MAX) 
OE # to Data Valid (MAX) 
OE # to Data Float (MAX) 


Write Cycle Requirements 


Chip Select to End of Write (MIN) 30 
Address Valid to End of Write (MIN) 42 
Write Pulse Width (MIN) : 30 
Data Setup (MAX) — 
Data Hold (MIN) 4 


2-Way Set Associative 


25 33 20 25 33 


37 29 40 37 29 
25 20 30 25 20 
4 2 4 4 2 
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Table 6-5. SRAM Specs for Buffered Cache Memory 


| SRAM Spec Requirements 


Direct Mapped 
20 25 33 20 25 33 


Read Cycle Requirements
Address Access (MAX)
Chip Select Access (MAX)
OE# to Data Valid (MAX)
OE# to Data Float (MAX)


Write Cycle Requirements 

Chip Select to End of Write (MIN) 
Address Valid to End of Write (MIN) 
Write Pulse Width (MIN) 

Data Setup (MAX) 

Data Hold (MIN) 


7.0 SYSTEM TEST CONSIDERATIONS 


7.1 INTRODUCTION 


Power On Self Testing (POST) is performed by most 
systems after a reset. This chapter discusses the 
requirements for properly testing an 82385 based 
system after power up. 


7.2 MAIN MEMORY (DRAM) TESTING 


Most systems perform a memory test by writing a 
data pattern and then reading and comparing the 
data. This test may also be used to determine the 
total available memory within the system. Without 
properly taking into account the 82385 cache mem- 
ory, the memory test can give erroneous results. 
This will occur if the cache responds with read hits 
during the memory test routine. 


7.2.1 Memory Testing Routine 


In order to properly test main memory, the test routine
must not read from the same block consecutively.
For instance, if the test routine writes a data pattern
to the first 32 kbytes of memory (0000-7FFFH),
reads from the same block, writes a new pattern to
the same locations (0000-7FFFH), and reads the
new pattern, the second pattern tested would have
had data returned from the 82385 cache memory.
Therefore, it is recommended that the test routine
work with a memory block of at least 64 kbytes. This
will guarantee that no 32 kbyte block will be read
twice consecutively.
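

The effect of block size on the memory test can be demonstrated with a small simulation. The Python below is a simplified sketch and not a model of the 82385 directory: it assumes a 32 kbyte direct mapped cache tracked per 4-byte line, a write-through policy in which writes never change which addresses are cached, and read misses that always update the cache. All names are hypothetical.

```python
LINE_BYTES = 4
CACHE_BYTES = 32 * 1024
CACHE_LINES = CACHE_BYTES // LINE_BYTES


class DirectMappedModel:
    def __init__(self):
        self.tags = {}       # line index -> tag currently cached
        self.read_hits = 0

    def read(self, addr):
        line = addr // LINE_BYTES
        index, tag = line % CACHE_LINES, line // CACHE_LINES
        if self.tags.get(index) == tag:
            self.read_hits += 1      # data would come from the cache
        else:
            self.tags[index] = tag   # read miss updates the cache

    def write(self, addr):
        # Write-through: main memory is always written, and in this
        # model a write never changes which addresses are cached.
        pass


def read_hits_during_test(block_bytes, passes=2):
    """Run write-block/read-block passes and count cache read hits."""
    cache = DirectMappedModel()
    for _ in range(passes):
        for addr in range(0, block_bytes, LINE_BYTES):
            cache.write(addr)
        for addr in range(0, block_bytes, LINE_BYTES):
            cache.read(addr)
    return cache.read_hits


print(read_hits_during_test(32 * 1024))  # second pass is served by the cache
print(read_hits_during_test(64 * 1024))  # every read reaches main memory
```

With a 32 kbyte block, every read of the second pattern is a cache hit, so the DRAM is never re-read. With a 64 kbyte block, each half evicts the other and all reads go to main memory.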


2-Way Set Associative 


20 35 29 
27 48 36 
N/A N/A N/A 
N/A N/A N/A 


7.3 82385 CACHE MEMORY TESTING 


With the addition of SRAMs for the cache memory, it
may be desirable for the system to be able to test
the cache SRAMs during system diagnostics. This
requires the test routine to access only the cache
memory. The requirements for this routine are based
on where it resides within the memory map. This can
be broken into two areas: the routine residing in
cacheable memory space or the routine residing in
either non-cacheable memory or on the 386 DX
local bus (using the LBA# input).


7.3.1 Test Routine in the NCA# or 
LBA# Memory Map 


In this configuration, the test routine will never be
cached. The recommended method is code which
will access a single 32 kbyte block during the test.
Initially, a 32 kbyte read (assume 0000-7FFFH)
must be executed. This will fill the cache directory
with the address information which will be used in
the diagnostic procedure. Then, a 32 kbyte write to
the same address locations (0000-7FFFH) will load
the cache with the desired test pattern (due to write
hits). The comparison can be made by completing
another 32 kbyte read (same locations,
0000-7FFFH), which will be cache read hits. Subsequent
writes and reads to the same addresses will enable
various patterns to be tested.
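

The read, write, read-and-compare sequence above can be outlined as follows. This is a hedged sketch in Python for readability; a real diagnostic would be 386 DX code located in non-cacheable (NCA#) or local bus (LBA#) space. The accessor callables read32 and write32, the function name, and the test patterns are all hypothetical.

```python
BLOCK_BYTES = 32 * 1024


def cache_sram_test(read32, write32, patterns, base=0x0000):
    """Return a list of (address, pattern) pairs that failed to compare."""
    addresses = range(base, base + BLOCK_BYTES, 4)

    # Step 1: read the whole block so the directory holds its addresses.
    for addr in addresses:
        read32(addr)

    failures = []
    for pattern in patterns:
        # Step 2: write hits load the pattern into the cache SRAMs.
        for addr in addresses:
            write32(addr, pattern)
        # Step 3: read hits return the SRAM contents for comparison.
        for addr in addresses:
            if read32(addr) != pattern:
                failures.append((addr, pattern))
    return failures


# Stand-in for the hardware: a dictionary acting as fault-free memory.
memory = {}
result = cache_sram_test(lambda a: memory.get(a, 0),
                         memory.__setitem__,
                         [0x55555555, 0xAAAAAAAA])
print(len(result))
```

Alternating patterns such as 55555555H and AAAAAAAAH exercise both states of every data bit; the choice of patterns is left to the diagnostic designer.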
7.3.2 Test Routine in Cacheable
Memory


In this case, it must be understood that the diagnostic
routine must reside in the cache memory before
the actual data testing can begin. Otherwise, when
the 386 DX performs a code fetch, a location within
the cache memory which is to be tested will be
altered due to the read miss (code fetch) update.


The first task is to load the diagnostic routine into
the top of the cache memory. It must be known how
much memory is required for the code as the rest of
the cache memory will be tested as in the earlier
method. Once the diagnostics have been cached
(via read updates), the code will perform the same
type of read/write/read/compare as in the routine
explained in the above section. The difference is
that now the amount of cache memory to be tested
is 32 kbytes minus the length of the test routine.


7.4 82385 CACHE DIRECTORY 
TESTING 


Since the 82385 does not directly access the data
bus, it is not possible to easily complete a comparison
of the cache directory. (The 82385 can serially
transmit its directory contents. See Section 7.5.)
However, the cache memory tests described in
Section 7.3 will indicate if the directory is working
properly. Otherwise, the data comparison within the
diagnostics will show locations which fail.


There is a slight possibility that the cache memory
comparison could pass even if locations within the
directory gave false hit/miss results. This could
cause the comparison to always be performed to
main memory instead of the cache and give a proper
comparison to the 386 DX. The solution here is to
use the MISS# output of the 82385 as an indicator
to a diagnostic port which can be read by the 386
DX. It could also be used to flag an interrupt if a
failure occurs.


The implementation of these techniques in the
diagnostics will assure proper functionality of the 82385
subsystem.




7.5 SPECIAL FUNCTION PINS 


As mentioned in Chapter 3, there are three 82385 
pins which have reserved functions in addition to 
their normal operational functions. These pins are 
MISS#, WBS, and FLUSH. 


As discussed previously, the 82385 performs a
directory flush when the FLUSH input is held active for
at least 4 CLK (8 CLK2) cycles. However, the
FLUSH pin also serves as a diagnostic input to the
82385. The 82385 will enter a reserved mode if the
FLUSH pin is high at the falling edge of RESET.


If, during normal operation, the FLUSH input is active
for only one CLK (2 CLK2) cycle, the 82385
will enter another reserved mode. Therefore it must
be guaranteed that FLUSH is active for at least the 4
CLK (8 CLK2) cycle specification.
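

The minimum FLUSH pulse width follows directly from the CLK2 period. The Python sketch below is illustrative only; the function name is an assumption, and the 40 MHz figure in the usage line is simply an example CLK2 frequency, not a value taken from this section.

```python
MIN_FLUSH_CLK2_PERIODS = 8  # 4 CLK periods = 8 CLK2 periods


def min_flush_width_ns(clk2_mhz):
    """Shortest FLUSH active time (ns) that meets the 8 CLK2 requirement."""
    clk2_period_ns = 1000.0 / clk2_mhz
    return MIN_FLUSH_CLK2_PERIODS * clk2_period_ns


print(min_flush_width_ns(40.0))  # example CLK2 frequency of 40 MHz
```

Flush logic that derives its pulse from a counter clocked by CLK2 avoids the short pulses that would place the device in a reserved mode.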


WBS and MISS# serve as outputs in the 82385 re- 
served modes. 


8.0 MECHANICAL DATA 


8.1 INTRODUCTION 


This chapter discusses the physical package and its
connections in detail.


8.2 PIN ASSIGNMENT 


The 82385 pinout as viewed from the top side of the
component is shown by Figure 8-1. Its pinout as
viewed from the pin side of the component is shown
in Figure 8-2.


Vcc and Vss connections must be made to multiple
Vcc and Vss (GND) pins. Each Vcc and Vss must
be connected to the appropriate voltage level. The
circuit board should include Vcc and GND planes for
power distribution and all Vcc and Vss pins must be
connected to the appropriate plane.
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Figure 8-1. 82385 PGA Pinout—View from TOP Side 
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Figure 8-2. 82385 PGA Pinout—View from PIN Side 
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Figure 8-3. 82385 PQFP Pinout—View from TOP Side 
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Table 8-1. 82385 Pinout—Functional Grouping 


130 SA31 

131 SA30 

132 SA29 
SA28 
SA27 
SA26 
SA25 
SA24 
SA23 
SA22 
SA21 
SA20 
SA19 
SA18 
SA17 
SA16 
SA15 
SA14 
SA13 117 X16# 
SA12 NCA#
SA11 LBA#
SA10 READYI#
SA9 READYO#
SA8
SA7 FLUSH
SA6 WBS
SA5 MISS#
SA4
SA3 DEFOE#
SA2 2W/D#
SEN M/S#

DOE# 
LDSTB 


Vss
Vss
Vss
Vss
Vss
Vss
Vss
Vss
Vss
Vss


BADS#
BBE0#
BBE1#
BBE2#
BBE3#
BLOCK#


BNA#


CALEN
COEA#
COEB#
CWEA#
CWEB#
CS0#
CS1#
CS2#
CS3#


CT/R#


BRDYEN#
BREADY#

BACP
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BHOLD 
BHLDA 
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8.3 PACKAGE DIMENSIONS AND 
MOUNTING 


The 82385 package is a 132-pin ceramic Pin Grid 
Array (PGA). The pins are arranged 0.100 inch 
(2.5 mm) center-to-center, in a 14 x 14 matrix, three 
rows around (Figure 8-3). 


A wide variety of available sockets allow low inser- 
tion force or zero insertion force mounting. These 
come in a choice of terminals such as soldertail, sur- 
face mount, or wire wrap. 
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<——_________——_ 1.450 (36.802) 


550 (13.959) 
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’— 725 (18.401) 
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8.4 PACKAGE THERMAL 
SPECIFICATION 


The PGA case temperature should be measured at 
the center of the top surface opposite the pins, as in 
Figure 8-4. The case temperature may be measured 
in any environment to determine whether or not the 
82385 is within the specified operating range. 
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Figure 8-3.1. 132-Pin PGA Package Dimensions 
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Figure 8-3.2. Principal Dimensions and Datums 
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Figure 8-3.3. Molded Details © 
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Figure 8-3.4. Terminal Details 




[Drawing 290143-61: typical lead detail, dimensions in mm (inch).]


Figure 8-3.5. Typical Lead 
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Figure 8-3.6. Detail M 


PLASTIC QUAD FLAT PACK 


Table 8-2. Symbol List for Plastic Quad Flat Pack


Symbol    Description of Dimensions

A         Package height: distance from seating plane to
          highest point of body
A1        Standoff: distance from seating plane to base plane
D/E       Overall package dimension: lead tip to lead tip
D1/E1     Plastic body dimension


NOTES: 

1. All dimensions and tolerances conform to ANSI Y14.5M- 
1982. 

2. Datum plane -H- located at the mold parting line and 
coincident with the bottom of the lead where lead exits 
plastic body. 

3. Datums A-B and -D- to be determined where center 
leads exit plastic body at datum plane -H-. 

4. Controlling Dimension, Inch. 

5. Dimensions D1, D2, E1 and E2 are measured at the
mold parting line and do not include mold protrusion.
Allowable mold protrusion is 0.18 mm (0.007 in.) per side.

6. Pin 1 identifier is located within one of the two zones 
indicated. 

7. Measured at datum plane -H-. 

8. Measured at seating plane datum -C-. 
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MEASURE PGA CASE TEMPERATURE 
AT CENTER OF TOP SURFACE 


132-PIN PGA
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Figure 8-4, Measuring 82385 PGA Case Temperature 
Table 8-3. 82385 PGA Package Typical Thermal Characteristics. 


Thermal Resistance—°C/Watt
Airflow—ft/min (m/sec)

400    600    800

Parameter

θJC Junction-to-Case
(Case Measured as Figure 8-4)

θCA Case-to-Ambient
(with Omnidirectional Heatsink)

θCA Case-to-Ambient
(with Unidirectional Heatsink)

NOTES: 
1. Table 8-3 applies to 82385 PGA plugged into socket or soldered directly onto board. 
2. θJA = θJC + θCA.
3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)
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Table 8-4. 82385 132-Lead PQFP Package Typical Thermal Characteristics 
Thermal Resistance—°C/Watt

θJC Junction-to-Case
(Case Measured as Figure 8-4)

θCA Case-to-Ambient
(No Heatsink)

θCA Case-to-Ambient
(with Omnidirectional Heatsink)

TO BE DEFINED

θCA Case-to-Ambient
(with Unidirectional Heatsink)


NOTES: 
1. Table 8-4 applies to 82385 PQFP plugged into socket or soldered directly onto board. 
2. θJA = θJC + θCA.
3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)
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9.0 ELECTRICAL DATA 


9.1 INTRODUCTION 


This chapter presents the A.C. and D.C. specifications
for the 82385.


9.2 MAXIMUM RATINGS 


Storage Temperature ................. −65°C to +150°C
Case Temperature under Bias ......... −65°C to +110°C
Supply Voltage with Respect to VSS .. −0.5V to +6.5V
Voltage on Any Other Pin ............ −0.5V to VCC + 0.5V


NOTE:
Stresses above those listed may cause permanent damage to
the device. This is a stress rating only, and functional
operation at these or any other conditions above those listed
in the operational sections of this specification is not implied.

Exposure to absolute maximum rating conditions for extended
periods may affect device reliability. Although the 82385
contains protective circuitry to resist damage from static
electrical discharges, always take precautions against high
static voltages or electric fields.


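9.3 D.C. SPECIFICATIONS VCC = 5V ±5%; VSS = 0V
Table 9-1. D.C. Specifications

Symbol    Parameter

VIL       Input Low Voltage
VIH       Input High Voltage
VCL       CLK2, BCLK2 Input Low Voltage
VCH       CLK2, BCLK2 Input High Voltage
VOL       Output Low Voltage
VOH       Output High Voltage
ICC       Supply Current
ILI       Input Leakage Current
ILO       Output Leakage Current
CIN       Input Capacitance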

NOTES:
1. Minimum value is not 100% tested.
2. ICC is specified with inputs driven to CMOS levels. ICC may be higher if driven to TTL levels.
3. Not 100% tested. Test conditions fC = 1 MHz, Inputs = 0V, TCASE = Room.
4. 300 mA is the maximum ICC at 33 MHz.
   275 mA is the maximum ICC at 25 MHz.
   250 mA is the maximum ICC at 20 MHz.
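As a rough illustration of how the ICC limits in Note 4 translate into supply power, the sketch below multiplies each maximum ICC by the highest supply voltage allowed by the VCC = 5V ±5% condition. Treating this product as an upper bound on power is an assumption made for illustration; the function and constant names are hypothetical and do not come from the data sheet.

```python
# Rough upper bound on 82385 supply power from the Table 9-1 notes.
VCC_NOMINAL_V = 5.0
VCC_TOLERANCE = 0.05          # VCC = 5V +/-5%

# Maximum ICC in amperes, keyed by clock frequency in MHz (Note 4).
ICC_MAX_A = {33: 0.300, 25: 0.275, 20: 0.250}


def worst_case_power_w(freq_mhz):
    """Highest allowed VCC times the maximum ICC at that frequency."""
    vcc_max = VCC_NOMINAL_V * (1.0 + VCC_TOLERANCE)
    return vcc_max * ICC_MAX_A[freq_mhz]


for freq in sorted(ICC_MAX_A):
    print(f"{freq} MHz: {worst_case_power_w(freq):.3f} W")
```

The resulting figure can serve as the power dissipation term when estimating case or junction temperature from the thermal resistances in Section 8.4.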
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9.4 A.C. SPECIFICATIONS 


The A.C. specifications given in the following tables 
consist of output delays and input setup require- 
ments. The A.C. diagram’s purpose is to illustrate 
the clock edges from which the timing parameters 
are measured. The reader should not infer any other 
timing relationships from them. For specific informa- 
tion on timing relationships between signals, refer to 
the appropriate functional section. 


A.C. spec measurement is defined in Figure 9-1. Inputs must be
driven to the levels shown when A.C. specifications are measured.
82385 output delays are specified with minimum and maximum limits,
which are measured as shown. 82385 input setup and hold times are
specified as minimums and define the smallest acceptable sampling
window. Within the sampling window, a synchronous input signal
must be stable for correct 82385 operation.


9.4.1 Frequency Dependent Signals


The 82385 has signals whose output valid delays are dependent
on the clock frequency. These signals are marked in the A.C.
Specification Tables with a Note 1.


LEGEND:

A—Maximum output delay specification
B—Minimum output delay specification
C—Minimum input setup specification
D—Minimum input hold specification


NOTES:
1. Under rated loading, 82385 output (tr and tf) is typically < 4.0 ns from 0.8V to 2.0V.

2. Input waveforms have tr < 2.0 ns from 0.8V to 2.0V.


Figure 9-1. Drive Levels and Measurement Points for A.C. Specification 
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A.C. SPECIFICATION TABLES | 
Many of the A.C. timing parameters are frequency dependent. The frequency dependent A.C. timing parameters
are guaranteed only at the maximum specified operating frequency.

Table 9-2. 82385 A.C. Timing Specifications
VCC = 5.0V ±5%
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LK2, BCLK2 LowTime @0.8V. 

CLK2, BCLK2 Fall Time
CLK2, BCLK2 Rise Time
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LOCK # Setup Time 
BE(0-3)# Setup Time
A20 Setup Time
A2-A31, BE(0-3)#, LOCK# Hold Time
M/IO#, D/C# Setup Time
W/R# Setup Time
t9c ADS# Setup Time
t10 ADS#, D/C#, M/IO#, W/R# Hold Time
t11 READYI# Setup Time
t12 READYI# Hold Time
t13a1 NCA# Setup Time (See t55b2)
t13a2 NCA# Setup Time (See t55b3)
t13b LBA# Setup Time
t13c X16# Setup Time
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15 |RESET, BRESET Setup Time 

t16 RESET, BRESET Hold Time
t17 NA# Valid Delay
t18 READYO# Valid Delay
t19 BRDYEN# Valid Delay

Table 9-2. 82385 A.C. Timing Specifications (Continued) 
VCC = 5.0V ±5%


Parameter


t21a1 
t21a2 
t21a3 
t21b 


CALEN Falling, PHI1
CALEN Falling in T1P, PHI2
CALEN Rising Following CWTH Cycle
CALEN Pulse Width
CALEN Rising to CS# Falling
CWEx# Falling, PHI1 (CWTH)
CWEx# Falling, PHI2 (CRDM)
CWEx# Pulse Width
CWEx# Rising, PHI1 (CWTH)
CWEx# Rising, PHI2 (CRDM)
CS(0-3)# Rising
COEx# Falling to CS(0-3)# Falling
T/R# Valid Delay
t25a COEx# Falling (Direct)
t25b COEx# Falling (2-Way)
t25c1 COEx# Rising Delay @ TCASE = Min
t25c2 COEx# Rising Delay @ TCASE = Max
t25d CWEx# Falling to COEx# Falling or
CWEx# Rising to COEx#
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Table 9-2. 82385 A.C. Timing Specifications (Continued) 
VCC = 5.0V ±5%


Parameter


Min
BAOE# Valid Delay | 4 |
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ff 
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BT/R# Valid Delay | | 
DOE# Falling Delay
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4 | 30° 
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WBS ValidDelay | 4 | 87 | 4 | 
FLUSH Hold Time
NOTES:


1. Frequency dependent specification.
2. Used for cache data memory (SRAM) specifications.
3. Float times not 100% tested.
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4. This feature is tested only at 16 MHz. | e 
5. BLOCK# delay is either from BPHI1 or from 386 LOCK#. Refer to Figure 5-3K and 5-3L in the 82385 data sheet.
6. NCA# setup time is now specified to the rising edge of PHI2 in the state after 386 DX addresses become valid (either the
first T2 or the state after the first T2P).
7. BBE# Valid delay is a function of NCA# setup.
8. Not 100% tested.
9. t5 is measured from 0.8V to 3.7V.
t6 is measured from 3.7V to 0.8V.
This parameter is not 100% tested and is guaranteed by Intel test methodology.
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Figure 9-2. CLK2, BCLK2 Timing 


CL indicates all parasitic capacitances.


Figure 9-3. A.C. Test Load
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_ Cache Write Hit Cycle 


T1,1P = T2 
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- Cache Read Miss (Cache Update Cycle) 
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Cache Read Cycle 
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*@. This would be 21B if previous bus cycle was Cache Write Hit cycle. 
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System Bus Interface Parameters 
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*This would be 218 if previous cycle was Cache Write Hit. 
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OUTPUT DELAYS 
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System Bus Interface Parameters (Continued) 
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386 DX INTERFACE 
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Signal. Function 


APPENDIX A | 


82385 Signal Summary 
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CWEA#, CWEB# 
LOCAL DECODE 
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_ 386 DX Address Status’. 
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82385 INTERFACE : 


386 DX Local Bus Access 
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82385 Signal Summary (Continued) 


Tri-State 
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DATA/ADDR CONTROL 
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CONFIGURATION 


Define Cache Output Enable 


Snoop Address Bus 
Snoop Strobe 
SEN Snoop Enable 


ARBITRATION 
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RELATED DOCUMENTATION 


This Application Note should be used in conjunction
with the 386™ DX microprocessor Data Sheet (Order
Number 231630-007) and the 386™ DX Hardware
Reference Manual (Order Number 231732-004). A list
of related references is provided in the appendix for
getting more information on high speed design issues.


INTRODUCTION 


The 386™ DX Microprocessor is an advanced 32-bit
microprocessor designed using Intel’s CHMOS IV process
for applications which require very high performance.
It is optimized for multitasking operating systems.
The 32-bit register and data paths support 32-bit
address and data types, allowing up to four gigabytes of
physical memory and 64 terabytes of virtual memory to
be addressed. The integrated memory management and
protection architecture includes address translation
registers, advanced multitasking hardware and a
protection mechanism to support operating systems. In
addition, the 386 DX microprocessor allows the
simultaneous running of DOS with other operating systems.
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_ Instruction pipelining, on chip address translation and 
high bus bandwidth ensure short average instruction
execution times and high system throughput. To
facilitate high performance system hardware designs, the
386 DX microprocessor bus interface offers address
pipelining, dynamic data bus sizing and direct byte
enable signals for each byte of the data bus.


This Application Note is intended to show how to com- 
plete a successful design of a ’Core’ system using the 
386 DX-33, the 33 MHz clock version. A Core system 
is a minimum system configuration, in this case com- 
prising the CPU, the 82385 32-bit Cache controller, 
Dynamic and Static RAM and an I/O mechanism with 
which to communicate with the CPU. 


The Application Note examines the design techniques
necessary when executing a design at this frequency.
Many of the methods used at lower frequencies, such as
16 MHz and 20 MHz, are no longer valid at this higher
frequency. Phenomena whose effects are negligible at
the lower frequencies must be taken into account in the
design. The physical positioning of components relative
to each other plays a significant part in the success of
the design, since transmission line effects (reflection,
radiation) are no longer negligible.
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Figure 1-2. CLK2 Signal and Internal Processor Clock 
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SECTION II. HIGH SPEED SYSTEM 
DESIGN CONSIDERATIONS 


2.1 Overview Of High Speed Effects 


This section gives a brief overview of general issues
that apply to circuit design at both higher and lower
frequencies.


The CHMOS IV 386 DX CPU differs from previous 
HMOS microprocessors in that its power dissipation is 
primarily capacitive; there is almost no DC power dissi- 
pation. Power dissipation depends mostly on frequency. 
This fact is used in designs where power consumption is 
critical. 


Power dissipation can be divided into internal (logic)
power and I/O (bus) power. Internal power varies with
operating frequency and, to some extent, with wait states
and software. Internal power also increases with supply
voltage. Process variations in manufacturing affect
internal power, although to a lesser extent than with
the NMOS process.


I/O power, which accounts for roughly one-fifth of the
total power dissipation, varies with frequency and
voltage. It also depends on capacitive bus load. Capacitive
bus loadings for all output pins are specified in the 386
DX CPU data sheet. The 386 DX CPU output valid
delays will increase if these loadings are exceeded. The
addressing pattern of the software can affect I/O power
by changing the effective frequency at the address pins.
The variation in frequency at the data pins tends to be
smaller; thus varying data patterns should not cause a
significant change in power dissipation.


POWER AND GROUND PLANES 


Power and ground planes must be used in 386 DX CPU
systems to minimize noise. Power and ground lines
have inherent inductance and capacitance, and therefore
an impedance $Z = (L/C)^{1/2}$. The total characteristic
impedance for the power supply can be reduced by adding
more lines. This effect is illustrated in Figure 2-1,
which shows that two lines in parallel have half the
impedance of one. To reduce the impedance even further,
the user should add more lines. In the limit, an infinite
number of parallel lines, or a plane, results in the
lowest impedance. Planes also provide the best
distribution of power and ground.
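The halving described above can be sketched numerically. The snippet below is only an illustration: it assumes identical, ideal lines, and the per-line inductance and capacitance values are hypothetical, chosen so that a single line comes out at 50 ohms.

```python
import math


def line_impedance(l_henries, c_farads):
    """Characteristic impedance Z = sqrt(L/C) of a single supply line."""
    return math.sqrt(l_henries / c_farads)


def parallel_impedance(z_single, n_lines):
    """n identical lines in parallel present 1/n of the single-line impedance."""
    return z_single / n_lines


# Hypothetical per-line values: 10 nH and 4 pF.
z_one = line_impedance(10e-9, 4e-12)
for n in (1, 2, 4, 8):
    print(n, "line(s):", round(parallel_impedance(z_one, n), 2), "ohms")
```

As the line count grows the impedance keeps falling, which is the reasoning behind using a full plane.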
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Figure 2-1. Reducing Characteristic Impedance 


The 386 DX CPU has 20 Vcc pins and 21 Vss
(ground) pins. All power and ground pins must be
connected to a plane. Ideally, the 386 DX CPU is located
at the center of the board, to take full advantage of
these planes. Although the 386 DX CPU generally
demands less power than the 80286, the possibility of
power surges is increased due to higher frequency and
pin count. Peak-to-peak noise on Vcc relative to Vss
should be maintained at no more than 400 mV, and
preferably at no more than 200 mV.


DECOUPLING CAPACITORS 


The switching activity of one device can propagate to
other devices through the power supply. For example,
in the TTL NAND gate of Figure 2-2, both Q3 and Q4
transistors are on for a short time when the output is
switching. This increased load causes a negative spike
on Vcc and a positive spike on ground.
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In synchronous systems in which many gates switch 
simultaneously, the result is significant noise on the
power and ground lines.


Decoupling capacitors placed across the device between
Vcc and ground reduce voltage spikes by supplying the
extra current needed during switching. These capacitors
should be placed close to their devices because the
inductance of connection lines negates their effect.


When selecting decoupling capacitors, the user should
provide 0.01 microfarads for each device and 0.1
microfarads for every 20 gates. Radio-frequency capacitors
must be used; they should be distributed evenly
over the board to be most effective. In addition, the
board should be decoupled from the external supply
line with a 2.2 microfarad capacitor.

Figure 2-3. Decoupling Chip Capacitors


Chip capacitors (surface-mount) are preferable because
they exhibit lower inductance and require less total
board space. They should be connected as in Figure 2-3.
Leaded capacitors can also be used if the leads are kept
as short as possible. Six leaded capacitors are required
to match the effectiveness of one chip capacitor, but
because only a limited number can fit around the 386
DX, the configuration in Figure 2-4 results.


Figure 2-4. Decoupling Leaded Capacitors 
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HIGH FREQUENCY DESIGN CONSIDERATIONS. 


At high signal frequencies, the transmission line
properties of signal paths in a circuit must be considered.
Reflections, interference, and noise become significant
in comparison to the high-frequency signals. They can
cause false signal transitions, data errors, and input
voltage level violations. These errors can be transient
and therefore difficult to debug. In this section, some
high-frequency design issues are discussed. Their effects
and ways to minimize them are introduced in the next
section.


REFLECTION AND LINE TERMINATION 


Input voltage level violations are usually due to voltage
spikes that raise input voltage levels above the
maximum limit (overshoot) or drop them below the minimum
limit (undershoot). These voltage levels can cause excess
current on input gates that results in permanent damage to
the device. Even if no damage occurs, most devices are
not guaranteed to function as specified if input voltage
levels are exceeded.


Signal lines are terminated to minimize signal
reflections and prevent overshoot and undershoot. If the
round-trip signal path delay is greater than the rise time
or fall time of the signal, terminate the line. If the line is
not terminated, the signal reaches its high or low level
before reflections have time to dissipate, and overshoot
and undershoot occur. Several termination techniques
are used in different applications; these
will be discussed in the next section.
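The round-trip rule above lends itself to a quick check. The following is a minimal sketch, assuming a propagation delay expressed in ns/ft (about 2 ns/ft is typical for the board traces discussed in this note); the function name and the example trace lengths and edge time are hypothetical.

```python
def round_trip_exceeds_edge(trace_length_in, tpd_ns_per_ft, edge_time_ns):
    """True when the round-trip delay is greater than the rise or fall time."""
    one_way_ns = (trace_length_in / 12.0) * tpd_ns_per_ft
    return 2.0 * one_way_ns > edge_time_ns


# Hypothetical 4 ns edge on traces of 6 and 15 inches at 2 ns/ft.
print(round_trip_exceeds_edge(6.0, 2.0, 4.0))    # 2 ns round trip
print(round_trip_exceeds_edge(15.0, 2.0, 4.0))   # 5 ns round trip
```

Under this rule the 6-inch trace would be left unterminated and the 15-inch trace would be terminated.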


INTERFERENCE 


Interference is the result of electrical activity in one
conductor causing transient voltages to appear in
another conductor. It increases with frequency and
closeness of the two conductors.


There are two types of interference to consider in high
frequency circuits: electromagnetic interference (EMI)
and electrostatic interference (ESI).


EMI (also called crosstalk) is caused by the magnetic
field that exists around any current carrying conductor.
The magnetic flux from one conductor can induce
current in another conductor, resulting in transient
voltage. Several precautions can minimize EMI.


Running a ground line between two adjacent lines
wherever they traverse a long section of the circuit
board. The ground line should be grounded at both
ends.


Running a ground line between the lines of an address
bus or a data bus if either of the following conditions
exists:

- The bus is on an external layer of the board.

- The bus is on an internal layer but not sandwiched
  between power and ground planes that are at most
  10 mils away.


Avoiding closed loops in signal paths (see Figure 2-5).
Closed loops cause excessive current and create
inductive noise, especially in the circuitry enclosed by a loop.
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Figure 2-5. Avoid Cidsad:loop Signal Paths 


ESI is caused by the capacitive coupling of two adjacent
conductors. The conductors act as the plates of a
capacitor; a charge built up on one induces the opposite
charge on the other.


The following steps reduce ESI:


Separating signal lines so that capacitive coupling
becomes negligible.


Running a ground line between two lines to cancel the
electrostatic fields.


LATCHUP 


Latchup is a condition in a CMOS circuit in which
Vcc becomes shorted to Vss. Intel’s CHMOS IV process
is immune to latchup under normal operating
conditions. Latchup can be triggered when the voltage
limits on I/O pins are exceeded, causing internal PN
junctions to become forward biased. The following
guidelines help prevent latchup:


Observing the maximum rating for input voltage on
I/O pins.


Never applying power to a 386 DX CPU pin or a
device connected to a 386 DX CPU pin before
applying power to the 386 DX CPU itself.


Preventing overshoot and undershoot on I/O pins by
adding line termination and by designing to reduce
noise and reflection on signal lines.
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THERMAL CHARACTERISTICS 


The thermal specification for the 386 DX CPU defines
the maximum case temperature. This section describes
how to ensure that a 386 DX CPU system meets this
specification.


Thermal specifications for the 386 DX CPU are
designed to guarantee a tolerable temperature at the
surface of the 386 DX CPU chip. This temperature (called
the junction temperature) can be determined from
external measurements using the known thermal
characteristics of the package. Two equations for calculating
junction temperature are as follows:

$$T_J = T_A + (\theta_{JA} \cdot P_D)$$

and

$$T_J = T_C + (\theta_{JC} \cdot P_D)$$

where:
$T_J$ = Junction Temperature
$\theta_{JA}$ = Junction-to-ambient temperature coefficient
$T_C$ = Case Temperature
$T_A$ = Ambient Temperature
$\theta_{JC}$ = Junction-to-case temperature coefficient
$P_D$ = Power Dissipation


Case temperature calculations offer several advantages 
over ambient temperature calculations. 


Case temperature is easier to measure accurately than 
ambient temperature because the measurement is local- 
ized to a single point (top center of the package). 


The worst-case junction temperature ($T_J$) is lower when
calculated with case temperature for the following
reasons:

- The junction-to-case thermal coefficient ($\theta_{JC}$) is
  lower than the junction-to-ambient thermal coefficient
  ($\theta_{JA}$); therefore, calculated junction temperature
  varies less with power dissipation ($P_D$).

- $\theta_{JC}$ is not affected by airflow in the system;
  $\theta_{JA}$ varies with air flow.


With the case-temperature specification, the designer
can either set the ambient temperature or use fans to
control case temperature. Finned heat sinks or
conductive cooling may also be used in environments where
the use of fans is precluded. To approximate the case
temperature for various environments, the two
equations above should be combined by setting the junction
temperature equal for both, resulting in this equation:

$$T_A = T_C - ((\theta_{JA} - \theta_{JC}) \cdot P_D)$$


The current data sheet should be consulted to
determine the values of $\theta_{JA}$ (for the system’s air flow) and
ambient temperature that will yield the desired case
temperature. Whatever the conditions are, the case
temperature is easy to verify.
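The relationships in this section can be checked with a few lines of arithmetic. The sketch below is illustrative only: the thermal coefficients, power dissipation and case temperature are hypothetical placeholders, not data sheet values, and the function names are invented for the example.

```python
def junction_from_case(t_case_c, theta_jc, p_d_w):
    """Junction temperature from case temperature: TJ = TC + theta_JC * PD."""
    return t_case_c + theta_jc * p_d_w


def junction_from_ambient(t_amb_c, theta_ja, p_d_w):
    """Junction temperature from ambient temperature: TJ = TA + theta_JA * PD."""
    return t_amb_c + theta_ja * p_d_w


def ambient_for_case(t_case_c, theta_ja, theta_jc, p_d_w):
    """Ambient temperature that yields a given case temperature:
    TA = TC - (theta_JA - theta_JC) * PD."""
    return t_case_c - (theta_ja - theta_jc) * p_d_w


# Hypothetical values: theta_JA = 17 C/W, theta_JC = 2 C/W, PD = 2 W, TC = 85 C.
t_amb = ambient_for_case(85.0, 17.0, 2.0, 2.0)
print("ambient:", t_amb)
print("TJ via case:", junction_from_case(85.0, 2.0, 2.0))
print("TJ via ambient:", junction_from_ambient(t_amb, 17.0, 2.0))
```

Both routes give the same junction temperature, which is the condition used to derive the combined equation.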


2.2 Transmission Line Effects 


As a general rule, any interconnection is considered a
transmission line when the time required for the signal
to travel the length of the interconnection is greater
than one-eighth of the signal rise time (True, K. M.,
“Reflection: Computations and Waveforms, The
Interface Handbook”, Fairchild Corp, Mountain View, CA,
1975, Ch. 3). As frequencies increase, designers must
account for the negative effects associated with
transmission lines. The sections that follow describe
these effects and provide some suggestions for
minimizing their negative effect on the system.


Before describing each effect, it is important to know
how to characterize a trace on different types of
transmission lines. This includes knowing the characteristic
impedance of a trace, $Z_0$, and the propagation delay for
a given trace, $t_{pd}$. These parameters are used to
determine which effects must be accounted for and to
select component values used in minimizing the effects.


TRANSMISSION LINE TYPES


Although many types of transmission lines (conduc- 
tors) exist, those most commonly used on the printed 
circuit boards are microstrip lines, strip lines, printed 
circuit traces, side-by-side conductors and flat conduc- 
tors. 


MICRO STRIP LINES 


The micro strip trace consists of a signal plane that is
separated from a ground plane by a dielectric as shown
in Figure 2-6. G-10 fiber-glass epoxy, which is most
common, has an $\varepsilon_r = 5$, where $\varepsilon_r$ is the
dielectric constant of the insulation. Let:


w = the width of the signal line (inches)
t = the thickness of copper
h = the height of dielectric for controlled
    impedance (inches)
The characteristic impedance, $Z_0$, is a function of
dielectric constant and the geometry of the board. This is
given by:

$$Z_0 = \frac{87}{\sqrt{\varepsilon_r + 1.41}} \ln\left(\frac{5.98\,h}{0.8\,w + t}\right)\ \Omega$$

where $\varepsilon_r$ is the relative dielectric constant of the board
material.


The propagation delay ($t_{pd}$) associated with the trace is
a function of the dielectric only.

$$t_{pd} = 1.017\,\sqrt{0.475\,\varepsilon_r + 0.67}\ \text{ns/ft}$$
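The two micro strip expressions can be evaluated directly. The sketch below assumes the standard form of the impedance formula, with the dielectric height h in the numerator of the logarithm; the trace dimensions in the example (10 mil width and height, 1.4 mil copper) are hypothetical, while the dielectric constant of 5 is the G-10 value quoted above.

```python
import math


def microstrip_z0(e_r, h_in, w_in, t_in):
    """Micro strip characteristic impedance in ohms.

    Z0 = 87 / sqrt(e_r + 1.41) * ln(5.98 h / (0.8 w + t))
    """
    return 87.0 / math.sqrt(e_r + 1.41) * math.log(5.98 * h_in / (0.8 * w_in + t_in))


def microstrip_tpd(e_r):
    """Micro strip propagation delay in ns/ft: 1.017 * sqrt(0.475 e_r + 0.67)."""
    return 1.017 * math.sqrt(0.475 * e_r + 0.67)


# Hypothetical geometry on G-10 (e_r = 5): w = h = 0.010 in, t = 0.0014 in.
print("Z0  =", round(microstrip_z0(5.0, 0.010, 0.010, 0.0014), 1), "ohms")
print("tpd =", round(microstrip_tpd(5.0), 3), "ns/ft")
```

Note that the delay depends only on the dielectric, so changing the trace width alters the impedance but not the delay per foot.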


STRIP LINES 


A strip line is a strip conductor centered in a dielectric
medium between two voltage planes. The characteristic
impedance is given by:

$$Z_0 = \frac{60}{\sqrt{\varepsilon_r}} \ln\left(\frac{5.98\,b}{0.8\,w + t}\right)\ \Omega$$
where b = distance between the planes for the
controlled impedance as shown in Figure 2.10.


The propagation delay is given by:

$$t_{pd} = 1.017\,\sqrt{\varepsilon_r}\ \text{ns/ft}$$


Typical values of the characteristic impedance and
propagation delay of these types of lines are:

$Z_0 = 50\ \Omega$
$t_{pd} = 2$ ns/ft (or 6 in/ns)
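As a quick check of the strip line delay expression against the typical value quoted above, the sketch below evaluates it for a dielectric constant of 5 and converts the result into a delay for a trace of given length. The function names and the 9-inch example trace are hypothetical.

```python
import math


def stripline_tpd(e_r):
    """Strip line propagation delay in ns/ft: 1.017 * sqrt(e_r)."""
    return 1.017 * math.sqrt(e_r)


def trace_delay_ns(length_in, tpd_ns_per_ft):
    """One-way delay of a trace whose length is given in inches."""
    return (length_in / 12.0) * tpd_ns_per_ft


tpd = stripline_tpd(5.0)                      # G-10 style dielectric, e_r = 5
print("tpd =", round(tpd, 2), "ns/ft")
print("9 in trace:", round(trace_delay_ns(9.0, tpd), 2), "ns")
```

The computed delay for this dielectric is somewhat above 2 ns/ft, in line with the rounded typical figure.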


2.3 Reflection 


The first effect is reflection. As the name indicates, it is
the reflection of a signal as it propagates down the
trace. The reflection results from a mismatch in
impedance. The impedance of a transmission line is a function
of the geometry of the line, its distance from the ground
plane, and the loads along the line. Any discontinuity in
the impedance will cause reflections.

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Figure 2-7. Strip Lines 
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Impedance mismatch occurs between the transmission 
line characteristic impedance and the input or output
impedance of the devices that are connected to the line.
The result is that the signals are reflected back and
forth on the line. These reflections can attenuate or
reinforce the signal depending upon the phase
relationships. The results of these reflections include overshoot,
undershoot, ringing and other undesirable effects.


At lower edge rates, the effects of these reflections are 
not severe. However at higher rates, the rise time of the 
signal is short with respect to the propagation delay. 
Thus it can cause problems as shown in Figure 2-8. 


Overshoot occurs when the voltage level exceeds the 
maximum (upper) limit of the output voltage, while un- 
dershoot occurs when the level passes below the mini- 
mum (lower) limit. These conditions can cause excess 
current on the input gates which results in permanent 
damage to the device. 


The amount of reflection voltage can be easily
calculated. Figure 2-9 shows a system exhibiting reflections.

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The magnitude of a reflection is usually represented in 
terms of a reflection coefficient. This is illustrated in
the following equations:

$$\Gamma = \frac{V_r}{V_i} = \frac{\text{Reflected voltage}}{\text{Incident voltage}}$$

$$\Gamma_{load} = \frac{Z_{load} - Z_0}{Z_{load} + Z_0}$$

$$\Gamma_{source} = \frac{Z_{source} - Z_0}{Z_{source} + Z_0}$$


The reflected voltage $V_r$ is given by $V_i$, the voltage
incident at the point of the reflection, and the reflection
coefficient.
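A short numeric sketch of these definitions follows. The 50 ohm line and the termination values are hypothetical examples, and the function names are invented for illustration.

```python
def reflection_coefficient(z_term, z0):
    """Gamma = (Z_term - Z0) / (Z_term + Z0) at a source or load."""
    return (z_term - z0) / (z_term + z0)


def reflected_voltage(v_incident, gamma):
    """Reflected voltage Vr = Gamma * Vi."""
    return gamma * v_incident


Z0 = 50.0  # hypothetical trace impedance in ohms
for z_load in (50.0, 150.0, 10.0):
    gamma = reflection_coefficient(z_load, Z0)
    print(z_load, "ohm load: gamma =", round(gamma, 3),
          "Vr =", round(reflected_voltage(3.0, gamma), 3), "V for a 3 V incident wave")
```

A matched load gives no reflection, a load above the line impedance reflects with the same polarity as the incident wave, and a load below it reflects with inverted polarity.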


The model transmission line can now be completed. In
Figure 2-9, the voltage seen at point A is given by the
following equation, where $Z_S$ is the source impedance:

$$V_A = V_S \cdot \frac{Z_0}{Z_0 + Z_S}$$


This voltage $V_A$ enters the transmission line at “A” and
appears at “B” delayed by $t_{pd}$:

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Figure 2-8. Overshoot and Undershoot Effects 
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Figure 2-9. Loaded Transmission Line | 
$$V_A(t - x/v)\,H(t - x/v)$$


where x = distance along the transmission line from
point “A”, v is the propagation velocity along the line,
and H(t) is the unit step function. The waveform
encounters the load $Z_L$, and this may cause
reflection. The reflected wave enters the transmission line
at “B” and appears at point “A” after time delay ($t_{pd}$):

$$V_{r1} = \Gamma_{load} \cdot V_B$$


This phenomenon continues infinitely, but it is
negligible after 3 or 4 reflections. Hence:

$$V_{r2} = \Gamma_{source} \cdot V_{r1}$$


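Each reflected waveform is treated as a separate source
that is independent of the reflection coefficient at that
point and the incident waveform. Thus the waveform
at any point on the transmission line and at any
given time is as follows:

$$
\begin{aligned}
V(x,t) = \frac{Z_0}{Z_0 + Z_S} \Big\{ & V_S\left(t - \tfrac{x}{v}\right) H\left(t - \tfrac{x}{v}\right) \\
& + \Gamma_{load} \left[ V_S\left(t - \tfrac{2L - x}{v}\right) H\left(t - \tfrac{2L - x}{v}\right) \right] \\
& + \Gamma_{load}\,\Gamma_{source} \left[ V_S\left(t - \tfrac{2L + x}{v}\right) H\left(t - \tfrac{2L + x}{v}\right) \right] \\
& + \Gamma_{load}^2\,\Gamma_{source} \left[ V_S\left(t - \tfrac{4L - x}{v}\right) H\left(t - \tfrac{4L - x}{v}\right) \right] \\
& + \Gamma_{load}^2\,\Gamma_{source}^2 \left[ V_S\left(t - \tfrac{4L + x}{v}\right) H\left(t - \tfrac{4L + x}{v}\right) \right] \\
& + \cdots \Big\}
\end{aligned}
$$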


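Each reflection is added to the total voltage through the
unit step function H(t). With $t_{pd}$ taken as the delay per
unit length ($1/v$), the above equation can be rewritten as
follows:

$$
\begin{aligned}
V(x,t) = \frac{Z_0}{Z_0 + Z_S} \Big\{ & V_S(t - t_{pd}\,x)\, H(t - t_{pd}\,x) \\
& + \Gamma_{load} \left[ V_S(t - t_{pd}(2L - x))\, H(t - t_{pd}(2L - x)) \right] \\
& + \Gamma_{load}\,\Gamma_{source} \left[ V_S(t - t_{pd}(2L + x))\, H(t - t_{pd}(2L + x)) \right] + \cdots \Big\}
\end{aligned}
$$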
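To see how the series behaves, the sketch below evaluates it at the load end of the line (x = L) for an ideal step from the source, accumulating one term pair per round trip. It is an illustration only: it assumes a lossless line and purely resistive source and load impedances, and the function name and example values are hypothetical.

```python
def load_voltage_steps(v_s, z_s, z0, z_load, n_arrivals=4):
    """Voltage at the load after each wavefront arrival, for a step input.

    Returns (arrival_time_in_units_of_line_delay, load_voltage) pairs.
    Assumes a lossless line with resistive source and load impedances.
    """
    gamma_load = (z_load - z0) / (z_load + z0)
    gamma_source = (z_s - z0) / (z_s + z0)
    incident = v_s * z0 / (z0 + z_s)      # V_A launched into the line
    total = 0.0
    steps = []
    for k in range(n_arrivals):
        # Each arrival adds the incident wave plus its reflection at the load.
        total += incident * (1.0 + gamma_load)
        steps.append((2 * k + 1, total))
        # The wave returns to the source, reflects, and travels back to the load.
        incident *= gamma_load * gamma_source
    return steps


# Hypothetical case: 5 V step, 10 ohm source, 50 ohm line, 150 ohm load.
for arrival, volts in load_voltage_steps(5.0, 10.0, 50.0, 150.0):
    print(f"t = {arrival} x line delay: {volts:.3f} V")
```

In this example the first arrival overshoots the final level and later arrivals ring around it with shrinking amplitude, which is consistent with the statement that the contribution becomes negligible after a few reflections.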


Impedance discontinuity problems are managed by
imposing limits and control during the routing phase of
the design. Design rules must be observed to control
trace geometry, including specification of the trace
width and spacing for each layer. This is very
important because it ensures the traces are smooth and
constant without sharp turns.


HOW TO MINIMIZE 


There are several techniques which can be employed to
further minimize the effects caused by an impedance
mismatch during the layout process:


1. Impedance Matching

2. Daisy Chaining

3. Avoid 90° Corners

4. Minimize the Number of Vias
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IMPEDANCE MATCHING | 


Impedance matching is the process of matching the impedance
of the source or load to the impedance of the trace. This
matching is accomplished using a technique called
termination. Termination makes the effective source or load
impedance, as seen by the trace, approximately equal to the
characteristic impedance of the trace. Before terminating a
line, one must determine whether termination is required.
This is done by a simple calculation: if the propagation
delay down a trace from source to destination is greater
than or equal to one-third of the signal's rise time,
termination is needed (i.e., Tpd ≥ 1/3 tr). The rise time is
the 0%-100% rise time specified for the source. If this
value is specified for 10%-90% or 20%-80%, it must be scaled
by multiplying the specified value by 1.25 or 1.67,
respectively. The propagation delay is calculated by
multiplying the trace propagation delay per unit length,
tpd, described earlier, by the trace length.
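
The termination test above reduces to a few lines of arithmetic. The sketch below assumes the trace delay is given in ns per inch and the length in inches; those units, and the function and argument names, are illustrative choices rather than anything fixed by this note.

```python
# Scale factors from the text: convert a specified rise time to 0%-100%.
RISE_TIME_SCALE = {"0-100": 1.0, "10-90": 1.25, "20-80": 1.67}


def needs_termination(trace_length_in, t_pd_ns_per_in, rise_time_ns, spec="0-100"):
    """Return True when trace delay >= one-third of the 0%-100% rise time."""
    full_rise_ns = rise_time_ns * RISE_TIME_SCALE[spec]
    trace_delay_ns = trace_length_in * t_pd_ns_per_in
    return trace_delay_ns >= full_rise_ns / 3.0
```

For example, `needs_termination(6.0, 0.15, 2.0, "10-90")` compares a 0.9 ns trace delay with one-third of a 2.5 ns scaled rise time; the 0.15 ns per inch figure here is only a placeholder.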


Once it is determined that termination is needed, use the
equation described earlier to calculate the trace's
characteristic impedance. The specification sheets for the
load can be consulted to determine the load impedance, ZL.
These values are needed to select the component values used
to terminate.


The next task is selecting the type of termination to use.
This section examines four different techniques and points
out their advantages and disadvantages. Figure 2-10 shows
the four types of termination and the corresponding
component values.


Parallel termination, shown in Figure 2-10(a), is a good
technique to maintain the waveform. The waveform at the load
is an almost perfect image of the waveform at the source. In
addition, there is no added propagation delay associated
with this technique. The disadvantage is that it requires a
fair amount of additional power, and it is not suggested for
characteristic impedances of less than 100 ohms because of
the large D.C. current required.


Thevenin termination, shown in Figure 2-10(a), is another
option. This technique also requires a large amount of
power, but does not have the restrictions on characteristic
impedance. It is very good at removing overshoot and
undershoot while not adding any additional delay. Another
advantage is that the trace can be biased toward Vcc or GND
simply by selecting the appropriate resistor values. This
can help maintain fast edges on important signal
transitions.
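
The biasing idea can be made concrete with a short calculation. This is a sketch under the common assumption (not stated in this note) that the two Thevenin resistors should have a parallel combination equal to Zo while their divider ratio sets the idle bias voltage; the names `r_pullup`, `r_pulldown`, and `v_bias` are illustrative.

```python
def thevenin_resistors(z0, v_cc, v_bias):
    """Pick pull-up and pull-down values for a Thevenin terminator.

    Assumes r_pullup || r_pulldown == z0 and that the divider
    v_cc * r_pulldown / (r_pullup + r_pulldown) equals v_bias.
    """
    if not 0 < v_bias < v_cc:
        raise ValueError("v_bias must lie strictly between 0 and v_cc")
    r_pullup = z0 * v_cc / v_bias              # resistor to Vcc
    r_pulldown = z0 * v_cc / (v_cc - v_bias)   # resistor to GND
    return r_pullup, r_pulldown
```

Choosing `v_bias` above or below mid-rail shifts the idle level toward Vcc or GND. As the text notes for all terminations, such values are a starting point to be adjusted on the actual board.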


Name      Advantages                        Disadvantages             Systems
--------  --------------------------------  ------------------------  ------------------------
Parallel  Waveform at receiver is almost    High power dissipation    Bipolar/Advanced CMOS
          perfect image of input            Zo ≥ 100 Ω, else D.C.
          No added Tpd                      current limit

Thevenin  Good overshoot and undershoot     High power dissipation    Bipolar or Bipolar/CMOS
          suppression                                                 systems
          No added Tpd

Figure 2-10(a). Termination Techniques


Name      Advantages                        Disadvantages             Systems
--------  --------------------------------  ------------------------  ------------------------
Series    Low power consumption             Added Tpd                 CMOS-CMOS systems
          Easy to adjust signal amplitude
          to match switching threshold

A.C.      Low-medium power dissipation      Two added components      High-speed CMOS families
          (capacitor blocks D.C. coupling
          of signal)
          No added delays
          R = Zo, C = 200 pF-500 pF

Figure 2-10(b). Termination Techniques




Series termination, shown in Figure 2-10(b), is a very easy
technique for matching impedance. It requires only one
resistor and very little additional power. In addition, the
resistor value can be selected to provide constructive or
destructive reflections and thus alter the signal amplitude
to match the switching threshold. The major disadvantage of
this technique is the added delay it introduces.


The fourth technique is A.C. termination, shown in Figure
2-10(b). It requires a small amount of additional power
(less than parallel termination, because of the introduction
of the capacitor) and adds no extra delay to the path. The
major disadvantage is that it requires two extra components.


After examining the system's needs and selecting a
termination technique, the impedance values determined
earlier, Zo and ZL, can be used to determine the component
values needed to implement the termination. These values
should be seen as a starting point and may be altered to
remove a specific problem experienced on a signal or to bias
signals in an appropriate fashion.


DAISY CHAINING


Another technique for minimizing reflections is to
daisy-chain signals, as shown in Figure 2-11. This means
running a single trace from the source and distributing the
loads along this trace. The alternative is to run multiple
traces from the source to each load. Each trace will have
reflections of its own, and these will be transmitted down
the other traces once they have returned to the source. To
manage such a system, separate termination would be required
for each branch. To eliminate these multiple terminators
from T-connections, high-frequency designs are routed as
daisy chains.


Because each gate presents its own impedance load along the
chain, these loads should be distributed evenly, at regular
intervals, along the length of the chain. The impedance
along the chain then changes in a series of steps and is
easier to match, and the overall speed of the line is faster
and more predictable.


90 DEGREE ANGLES 


Eliminating 90° angles also minimizes reflections. It is
much more desirable to use 45° or 135° angles, as shown in
Figure 2-12.


Figure 2-11. Daisy Chaining

Figure 2-12. Avoiding 90 Degree Angles


VIAS (FEED THROUGH CONNECTIONS)


Another impedance source that degrades high frequen- 
cy circuit performance is the via. Expert layout tech- 
niques can reduce vias to avoid reflection sites on 
PCBs. 


Following these guidelines will not guarantee elimina- 
tion of all reflections, but they will minimize the num- 
ber and size. 


2.4 Cross Talk 


Cross talk is another negative effect of transmission lines.
It is a problem at high frequencies because, as the
operating frequency increases, the signal wavelength becomes
comparable to the length of the interconnections on the PC
board. In general, interference such as cross talk occurs
when electrical activity in one conductor causes a transient
voltage to appear in another conductor. The main factors
that increase interference in any circuit are:


1. Variation of current and voltage in the lines causes
frequency interference. This interference increases with
increasing frequency.


2. Coupling occurs when conductors are in close proximity.


Cross talk is the phenomenon of a signal in one trace
producing a similar signal in an adjacent trace. It may not
be a carbon copy of the original signal; it may only be
occasional noise that corrupts the integrity of the second
signal. The easiest way to minimize crosstalk is to
eliminate, or at least minimize, the number of parallel
traces. Parallel traces can be on a single layer or on
adjacent signal layers.


There are three ways that parallel traces can couple and
thereby produce a signal, or at least influence the signal,
on a second trace. These methods of coupling are inductive,
radiative, and capacitive. In inductive coupling the two
traces act as inductors: the field produced by a signal in
one trace induces a current in the second trace. Radiative
coupling occurs when the two parallel traces act as a
dipole antenna. One radiates a signal and the other receives
it, thus corrupting the signal already present on the trace.
The final method is capacitive coupling. Two parallel traces
separated by a dielectric act as a capacitor. If both traces
are in a high state and one transitions to low, the
capacitor will try to maintain the high and thus cause a
slow transition time on the second trace. These effects can
be minimized by reducing the number of parallel traces.


AP-442 


HOW TO MINIMIZE 


When laying out a board for a high-speed 386 DX based
system, several guidelines should be followed to minimize
crosstalk. Some of them are as follows:


1. To reduce crosstalk, it is necessary to minimize the
common impedance paths.


2. Run a ground line between two adjacent lines. The lines
should be grounded at both ends.


3. Separate the address and data busses by a ground line.
This technique may, however, be expensive due to the large
number of address and data lines.


4. Remove closed-loop signal paths, which create inductive
noise.


5. Capacitive coupling can be reduced by reducing the number
of parallel traces. Parallel traces can be minimized by
ensuring that signals on adjacent signal layers run
orthogonal (perpendicular) to each other. Ground planes or
traces can be inserted to provide shielding. A ground plane
between signal layers eliminates any coupling that could
occur. On a single layer, a ground trace can be run between
traces to prevent coupling.


In some instances it is necessary to run traces parallel to
each other. In these cases, try to make the parallel run as
short as possible and choose signals for which the
transition time is not as critical, so that the coupling
effects do not produce problems. In addition, the coupling
can be minimized by increasing the spacing between parallel
traces.


2.5 Skew 


Skew is another effect of transmission lines. This is very
important in a synchronous system. Long traces add
propagation delay. A longer trace, or a load placed further
down a trace, will experience more delay than a short trace
or loads very close to the source. This must be taken into
account when doing the worst-case timing analysis. In a
system where events must occur synchronously with a clock
signal, it is important to make sure the signal is available
to all inputs a sufficient amount of time prior to the
corresponding clock edge. This is one of the considerations
that must be accounted for when performing the component
placement.


These guidelines have always been recommended for board
design; however, they are much more important at higher
frequencies. At slower frequencies designers could
occasionally ignore these practices and not experience
difficulties. This is not the case at higher frequencies.


2.6 DC Loading


To maintain proper logic levels, all digital signal outputs
have a maximum load that they are capable of driving. DC
loading is the constant current required by an input in
either the high or the low state. It limits the ability of a
device driving the bus to maintain proper logic levels. For
a 386 DX based system, a careful analysis must be performed
to ensure that in a worst-case situation no loading limits
are exceeded. Even a bus that is loaded only slightly beyond
its worst-case limit might cause problems if a batch of
parts whose input loading is close to the maximum is
encountered. Proper logic levels will then fail to be
maintained and unreliable operation may result. Marginal
loading problems are particularly insidious, since the
effect is often erratic operation and non-repetitive errors
that are extremely difficult to track down. For both the
high and low logic levels, the currents required by all the
inputs and the leakage currents of all outputs (drivers) on
the bus must be added together. This sum must be less than
the output capability of the weakest driver. Since the
386 DX is a CHMOS device having negligible DC loading, the
main contributors to DC loading will be the TTL devices.
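
The worst-case check described above is a sum and a comparison, repeated once for the high state and once for the low state. A minimal sketch follows; the function name and the milliamp units are assumptions made for illustration, and real values must come from the relevant data sheets.

```python
def dc_loading_ok(input_currents_ma, leakage_currents_ma, driver_capabilities_ma):
    """Check one logic level (high or low) of a shared bus signal.

    input_currents_ma:      worst-case current required by each input
    leakage_currents_ma:    worst-case leakage of each (inactive) output
    driver_capabilities_ma: output drive capability of every possible driver
    """
    required_ma = sum(input_currents_ma) + sum(leakage_currents_ma)
    weakest_driver_ma = min(driver_capabilities_ma)
    return required_ma < weakest_driver_ma
```

Running the check with worst-case rather than typical currents is what guards against the batch-to-batch variation mentioned in the text.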


2.7 AC Loading 


The AC or capacitive loading is caused by the input
capacitance of each device and limits the speed at which a
device driving a bus signal can change the state from high
to low or low to high. Designers of microprocessor systems
have traditionally calculated the load capacitance of their
systems by determining the number of devices attached to a
signal and their individual capacitive loading, plus the
amount of trace capacitance. Typically, the trace
capacitance was a set "lumped" number of pF (i.e., 2 pF to
3 pF per inch), when it was thought of at all. This lumped
method is a general rule of thumb which generates a good
first-pass approximation. For low frequency designs, the
lumped method works since system and component margins are
large enough to cover any minor differences due to the
approximation.
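
The lumped method amounts to the following sum. This sketch uses 2.5 pF per inch as a default only because it sits in the middle of the 2 pF to 3 pF range quoted above; the function and parameter names are illustrative.

```python
def lumped_load_pf(device_caps_pf, trace_length_in, trace_pf_per_in=2.5):
    """First-pass capacitive load on one signal using the lumped model.

    device_caps_pf:  input capacitance of each device on the signal
    trace_length_in: total trace length in inches
    trace_pf_per_in: assumed trace capacitance per inch (rule of thumb)
    """
    return sum(device_caps_pf) + trace_length_in * trace_pf_per_in
```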


For high frequency designs, the component and system 
margins are no longer available to the designer. With 
less than 1 ns of margin, even the amount of trace ca- 
pacitance can make a circuit path critical. 


A more accurate calculation of capacitive loading can be
derived by modeling the device loads and system traces as a
series of transmission lines. Transmission line theory
provides a more accurate picture of system loading in high
frequency systems. In addition, it allows new factors, such
as inductance and the effect of reflections upon the quality
of the signal waveform, to be taken into consideration.


2.8 Derating Curve and Its Effects


A derating curve is a graph that plots the output buffer
delay against the capacitive load. The curve is used to
analyze a signal delay without necessitating a simulation
every time the processor's loading changes. This graph
assumes the lumped capacitance model to calculate the total
capacitance. The delay in the graph should be added to the
specified AC timing value for the device that is driving the
load. The derating curve is different for different devices
because each device has different output buffers.


A derating curve is generated by tying the chip's output
buffers to a range of capacitors. The voltage and resistance
values chosen for the output buffers are those at the
highest specified temperature and are rising (worst-case)
values. The capacitor values are centered on the load at
which the AC timing values for the chip are specified; for
33 MHz and above, this is 50 pF. The AC timing
specifications are measured for a signal reaching 1.5 V. A
curve is then drawn from the range of time and capacitance
values, with 50 pF representing the average, at nominal or
zero derating. These curves are valid only for the 50 pF to
150 pF load range. Beyond this range the output buffers are
not characterized. The derating curves for the 386 DX are
shown in Figure 2-13. These curves use the lumped
capacitance model for circuit capacitance measurements and
must be modified slightly when doing worst-case calculations
that involve transmission line effects.


Figure 2-13. Typical Output Valid Delay Versus Load
Capacitance at Maximum Operating Temperature (CL = 120 pF)


2.9 High Speed Clock Circuits


For performance at high frequencies, the clock signal (CLK2)
for the 386 DX CPU must be free of noise and within the
specifications listed in the 386 DX CPU data sheet.
Achieving the proper clock routing around a 33 MHz printed
circuit board is delicate because a myriad of problems, some
of them subtle, can arise if design guidelines are not
followed. For example, fast clock edges cause reflections
from high impedance terminations. These reflections can
cause significant signal degradation in systems operating at
33 MHz clock rates. This section covers some design
guidelines which should be observed to properly lay out the
clock lines for efficient 386 DX operation.


• Since the rise/fall time of the clock signal is typically
in the range of 2-4 ns, the reflections at this speed could
result in undesirable noise and unacceptable signal
degradation. The degree of reflection depends on the
impedance of the traces of the clock connections. These
reflections can be minimized by terminating the CLK2 output
with proper terminations and by keeping the length of the
traces as short as possible. The preferred method is to
connect all of the loads via a single trace, as shown in
Figure 2-14, thus avoiding the extra stubs associated with
each load. The loads should be as close to one another as
possible. Multiple clock sources should be used for
distributed loads.


A less desirable method is the star connection layout, in
which the clock traces branch as close to the loads as
possible (Figure 2-15). In this layout, the stubs should be
kept as short as possible. The maximum allowable length of
the traces depends upon the frequency and the total fanout,
but the lengths of all the traces in the star connection
should be equal. Lengths of less than one inch are
recommended. In this method the CLK2 signal is terminated by
a series resistor. The resistor value is calculated by
measuring the total capacitive load on the CLK2 signal and
referring to Figure 2-16. If the total capacitive load is
less than 80 pF, the user should add capacitors to make up
the difference. Because of the high frequency of CLK2, the
terminating resistor must have low inductance; carbon
resistors are recommended.


Use an oscilloscope to compare the CLK2 waveform 
with those in Figure 2-17. 


Figure 2-14. Clock Routing

Figure 2-15. Star Connection

Figure 2-16. CLK2 Series Termination

Figure 2-17. CLK2 Waveforms


SECTION III. DESIGN EXAMPLE


At higher processor speeds the window of time available to
perform specific tasks becomes very small. This window can
be equated to multiples of the CLK2 period. Within this time
signals must be supplied from a source and reach a
destination in time to meet any setup requirements. At
16 MHz the CLK2 period is 31 ns. At 33 MHz it shrinks to
half this value, 15 ns. The longer time allowed the use of
slower logic families and the delays associated with longer
traces. As the window decreases, system designers have to
take more care in the selection of logic families and in the
choices made for component placement and signal routing on
PCBs. This section attempts to list the signal paths whose
worst-case timing analysis results in very small margins and
which therefore require closer attention from designers to
guarantee that all A.C. timing specifications are met.
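
The shrinking window can be put in numbers. The sketch below assumes, as described elsewhere in this note, that CLK2 runs at twice the processor frequency; the function names are illustrative.

```python
def clk2_period_ns(cpu_mhz):
    """CLK2 period in ns, taking CLK2 as twice the processor frequency."""
    return 1000.0 / (2.0 * cpu_mhz)


def timing_window_ns(cpu_mhz, n_clk2_periods):
    """Time available for a path that must complete in n CLK2 periods."""
    return n_clk2_periods * clk2_period_ns(cpu_mhz)
```

Under this assumption a 16 MHz part gives a 31.25 ns CLK2 period and a 33.33 MHz part about 15 ns, matching the rounded figures in the text.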


This section also includes a sample design based on the
33 MHz version of the 386 DX. It should not be taken as a
recommended design. The circuit is used only to highlight
the design considerations for high speed systems.


3.1 System Architecture


Figure 3-1 shows the system block diagram. It has four major
subsystems.


1) CPU subsystem 

2) DRAM subsystem 

3) Cache subsystem 

4) ROM and I/O subsystem 


The system has 1 megabyte of Page-Mode DRAMs (60 ns RAS
access time), 128 kilobytes of EPROMs (200 ns access time),
an 8259A-2, and an 82510. The cache subsystem is optional.
Schematics and PAL codes are given in Appendices A and B,
respectively.


3.2 CPU Subsystem 


The CPU subsystem consists of the 386 DX microprocessor,
clock and reset circuitry, and bus control logic. A clean
and proper clock is very important in designs at high
frequencies.


RESET STATE MACHINE 


This state machine is used to generate three control
signals, namely RESET, REFREQ and CLK. The CLK signal runs
at half the frequency of the CPU clock, CLK2, and is used
mainly in the I/O and EPROM subsystem.

RESET is generated through the input from the RESET
triggering circuitry (as shown in the CPU schematic). The
minimum RESET setup and hold times for operation at 33 MHz
are 5 ns and 2 ns, respectively.


A 61.44 kHz clock is used to produce a synchronous refresh
request (REFREQ) signal for the DRAM controller, which
employs a transparent, distributed DRAM refresh technique
that allows the processor and cache to run while the refresh
cycle is in progress.


3.3 DRAM Subsystem 


A non-interleaved DRAM system is used in the sample board,
which simplifies the design. Since the board provides
caching, the performance of the DRAM subsystem is outweighed
by the simplicity and economy of the design. It employs a
transparent, distributed DRAM refresh technique which allows
the processor and cache to run while the refresh cycle is in
progress. It uses the 3-state capability of the 16R8-7 and
the 74ACT258 to multiplex the refresh address. A further
consideration is the choice of DRAM devices. If one uses a
memory device such as the AAA2801 (which supports
CAS#-before-RAS# refresh and provides an internal refresh
counter), further simplifications can be made in both the
circuitry and the control logic.


DRAM CONTROL STATE MACHINE 


The state machine is implemented with three 16R8- 
type E-speed PALs (see page 4 of the schematics). 
E-speed PALs must be used since the CLK2 frequency, 
66.67 MHz, is higher than the maximum clock frequen- 
cy of the D-speed PALs. 


In order to generate DRAM control signals with the smallest
delay from the CLK2 edges, all state machines are
implemented as Moore machines. The state machines'
flip-flops generate most of the DRAM control signals
directly. This is an expensive design approach in terms of
hardware, but it allows signal timings and skews to be fine
tuned.


DRAM CYCLES—NO CACHE CONFIGURATION 


Pages C-1 through C-4 show examples of DRAM cycles. In order
to hide the DRAM page hit-or-miss decision time, the DRAM
controller always tries to put the 386 DX in pipelined mode.
The first read cycle requires only two wait states since
RAS# has been precharged (see page C-1). The second cycle is
a pipelined, page-hit read cycle, which is the best case,
and takes only two clock cycles. The third cycle is a
pipelined, page-hit write cycle. This cycle requires one
wait state. DRAMs capture data at the falling edge of CAS#
during Early Write cycles. The 386 DX drives valid write
data at the rising edge in the middle of T1P (edge C) with a
maximum propagation delay of 24 ns (t12 max). This means
that CAS# is generated after the rising edge in the middle
of the second T2P (edge A). CAS# is, therefore, generated at
the end of the RAS# hold time with respect to CAS# (if the
next cycle is a page miss, RAS# will go inactive at the end
of the current write cycle), and so on.


The fifth cycle is a page miss, which is actually detected
at the end of the fourth cycle (page C-2). Since the DRAM
controller must wait for the minimum RAS# precharge time,
the fifth cycle requires three wait states. The sixth cycle
is also a page miss. This cycle, however, requires only two
wait states because the miss was detected early enough in
the previous cycle to have RAS# precharged by the end of the
T1P. If the seventh cycle is another page miss, it will
require three wait states.


The eighth cycle is ended with T2i. Consequently, the ninth
cycle must wait for the minimum RAS# precharge time and
requires three wait states.


A DRAM refresh cycle is shown on page C-4. The DRAM address
multiplexer output is disabled, and the refresh address
counter output is enabled. The cycle is a RAS#-only refresh
cycle, in which only RAS# is asserted with a proper refresh
address. After the refresh cycle is completed, a read cycle
which has been suspended due to the refresh is resumed.


STATE DIAGRAMS 


Pages B-1 through B-11 show state diagrams of the DRAM
controller. The precharge state machine on page B-2 measures
the required RAS# precharge time and CAS#-to-RAS# precharge
time. The CAS#-READY# state machine on page B-2 implements a
pin strap option of having or not having the 82385. For the
no-cache configuration, the Cache variable must be forced
low.


TIMING CALCULATIONS 


Timing equations are described on pages D-1 and D-2. 
Their corresponding results are given on pages D-3 
through D-7. 


Capacitive load on the 386 DX address bus was assumed to be
less than 85 pF. Capacitive load on the DRAM address bus was
calculated to be less than 22 pF.


3.4 CACHE Subsystem 


At 33 MHz, DRAM speeds are not fast enough to design zero
wait state memory systems. A cache can be used to take
advantage of the higher performance available from the
higher speed 386 DX microprocessors. The cache takes
advantage of the faster SRAM while keeping system costs down
by using the cheaper but slower DRAMs.


Details of the cache subsystem are shown in Figures 3-2 and
3-3. The 82385 address and data busses are interfaced to the
386 DX address and data busses via 74AS574s and 74AS646s.
Static RAMs (20 ns access time) are used for the cache
memory.

Figure 3-2. Block Diagram of Cache Subsystem

Figure 3-3. Address Valid Delay for Cache Subsystem

In selecting SRAM there are several types one can choose
from. Some SRAMs require a latch for the address and a
transceiver for the data. Others have an output enable
signal, OE#, and incorporate the transceiver on chip. The
third type is called integrated SRAM; these contain both the
latch and the transceiver on chip. In every case, there are
two timing paths that dictate the speed selection within
each type. Figure 3-4 shows a typical system configuration
using each type.


Figure 3-4(b). SRAM with OE# Control

Figure 3-4(c). Integrated SRAM


The critical times for the SRAM are the SRAM OE#-to-data
delay and the SRAM address-to-data delay. The following
analysis applies to SRAMs with an OE# signal, as shown in
Figure 3-4(b). First examine the path of OE# to data. This
path must be completed within 2 CLK2 periods. The COE#
signal from the 385 Cache Controller must be valid and the
SRAM must drive data onto the data bus so that the data
setup time of the 386 DX CPU is met.


2 × CLK2 period - t25 (82385 COE# valid delay, max)
- SRAM access time (OE# to data) - t21 (386 DX data setup) = 0


Using the specified values from the data sheets reveals that
the SRAM must have an OE#-to-data delay of 10 ns or less.
The other path is for the address to become available and
data to reach the 386 DX CPU. This path has 4 CLK2 periods.
The 385 Cache Controller must supply the CALEN signal to
pass the address to the SRAM, and then the SRAM must drive
the data onto the data bus so that the data setup time is
met at the 386 DX CPU.


4 × CLK2 period - t24 (82385 CALEN valid delay, max)
- tPD (x373 latch) - SRAM access time (address to data)
- t21 (386 DX data setup) = 0


Once again, using the data sheets, the access time can be
determined. Depending on the type of transparent latch, the
SRAM needs an address-to-data access time of 20 ns or 25 ns.
If an F series 373 is used, the faster 20 ns SRAM must be
used, but if an FCT373a or PCT373a is used, the 25 ns SRAM
is sufficient.
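
Both SRAM paths follow the same pattern: a fixed number of CLK2 periods, less every known delay, leaves the time available to the one component still to be chosen. The sketch below captures that pattern; the function name is illustrative, and the numbers in the usage note are placeholders rather than data-sheet values.

```python
def remaining_budget_ns(n_clk2_periods, clk2_period_ns, known_delays_ns):
    """Time left for the unknown element of a timing path.

    n_clk2_periods:  CLK2 periods allotted to the whole path
    clk2_period_ns:  CLK2 period (15 ns at 33 MHz per the text)
    known_delays_ns: valid delays, gate delays and setup times already fixed
    """
    return n_clk2_periods * clk2_period_ns - sum(known_delays_ns)
```

For instance, `remaining_budget_ns(2, 15.0, [15.0, 5.0])` leaves 10 ns for a two-period path with a hypothetical 15 ns valid delay and 5 ns setup time. A negative result means the path cannot be met with the parts assumed. Trace and capacitive-loading delays would be added to `known_delays_ns` in a full analysis.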


The A20 path is another path with a small margin. The reason
is the AND gate that many designers insert to provide 1 MB
wraparound of addresses in real mode. Figure 3-5 shows the
circuit block diagram. A20 must leave the 386 DX and reach
the 385 Cache Controller within 2 CLK2 periods.

2 × CLK2 period - 386 DX address valid delay (max)
- AND gate propagation delay - 82385 address setup = 0


Figure 3-5. Critical Timing A20


To meet this timing, the propagation delay of the AND gate
must be less than 6 ns. This dictates the use of a 74AS08
gate or a faster device.


Analysis of the LOCK# path also shows a small margin. The
reason is the OR gate that many designers insert to disable
the LOCK# signal to the 385 Cache Controller. This allows
locked accesses to be cached. Figure 3-6 shows the circuit
block diagram. LOCK# must leave the 386 DX and reach the
385 Cache Controller within 2 CLK2 periods.

2 × CLK2 period - 386 DX LOCK# valid delay (max)
- OR gate propagation delay - 82385 LOCK# setup = 0


Figure 3-6. Critical Timing LOCK#

To meet this timing, the propagation delay of the OR gate
must be less than 6 ns. This dictates the use of a 74AS32
gate or a faster device.


The final path examined here is the NA# path. Recently
designers have chosen to use an I/O port and an OR gate to
disable pipelining selectively. Figure 3-7 shows the circuit
block diagram. NA# must leave the 386 DX and reach the
385 Cache Controller within 2 CLK2 periods.


Using the specified values in the appropriate data sheets
shows that the propagation delay of the OR gate must be no
greater than 5.8 ns. This dictates the use of a 74AS32 gate
or a faster device.

This list is not meant to be exhaustive. It is merely meant
to highlight a few of the critical timings. Each designer
should perform a thorough timing analysis of the system
being designed to verify that all timing requirements are
met.


In addition to the specified timing parameters in the data
sheets, designers should account for propagation delays
introduced by the trace and by capacitive loading. The
propagation delay added by the trace is explained in the
section on transmission line effects, which supplies an
equation to determine the amount of delay.


2 × CLK2 period - 386 DX NA# valid delay (max)
- OR gate propagation delay - 82385 NA# setup = 0

Figure 3-7. Critical Timing NA#

Another factor that becomes more important at higher
frequencies is loading. DC loading and especially capacitive
loading must be considered during the design stage. If the
board is to be assembled and tested in stages, then the DC
loads should be considered for all configurations of the
board. Most termination techniques require additional
current. If a board has a marginal loading situation, one is
limited in one's choice of termination techniques. If a
capacitive loading problem exists, the timing situation can
become extremely difficult at higher frequencies. If timing
is critical, do not exceed the capacitive load at which a
device was tested. If a device is overloaded, derating must
be taken into consideration.


Capacitive loading also introduces a delay on signals. Many
components, including the 386 DX, include a capacitive
derating curve in the data sheet. To use the curve in the
386 DX data sheet, the capacitive load must be calculated.
This is done by summing the input capacitances of all
devices driven by a given output from the 386 DX
microprocessor. Find this value on the X-axis of the
derating curve in the data sheet and move up until the
derating curve is intersected. Then move at a right angle to
the left until intersecting the Y-axis. The value found
there is expressed as the nominal value plus or minus some
offset. The nominal value is the value found in the data
sheet. Add the offset from the curve to this nominal value
to get the resulting delay corresponding to the capacitive
loading in the system. Note: the trace capacitance was not
included in this calculation. It is accounted for in the
trace propagation delay mentioned earlier.
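
Reading the curve by hand corresponds to a table lookup with linear interpolation. The sketch below assumes the curve has been transcribed as (load in pF, offset in ns) points; the function name and any points passed to it are hypothetical and must be replaced with values read from the actual data sheet curve.

```python
def derated_delay_ns(nominal_ns, load_pf, curve_points):
    """Add the derating offset for load_pf to a nominal AC timing value.

    curve_points: (load_pF, offset_ns) pairs sorted by load, transcribed
    from a data sheet derating curve. Loads outside the characterized
    range are rejected rather than extrapolated.
    """
    lowest, highest = curve_points[0][0], curve_points[-1][0]
    if not lowest <= load_pf <= highest:
        raise ValueError("load is outside the characterized range")
    for (c0, d0), (c1, d1) in zip(curve_points, curve_points[1:]):
        if c0 <= load_pf <= c1:
            # Linear interpolation between neighbouring curve points.
            offset_ns = d0 + (d1 - d0) * (load_pf - c0) / (c1 - c0)
            return nominal_ns + offset_ns
```

Rejecting out-of-range loads mirrors the statement that the output buffers are not characterized beyond the published curve.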


DRAM CYCLES WITH 82385 ENABLED 


When the 82385 is enabled (the CACHE variable of the state
machine on page B-2 is forced high), the DRAM controller
inserts one extra wait state in all read cycles. This extra
time is needed to allow a cache update cycle to occur after
each cache read miss cycle. During a cache update cycle, the
read data from the DRAMs must propagate through the 74AS646
and the 74F245 (optional) and must be ready for an SRAM
write cycle with enough setup time.


Timing diagrams on pages C-5 through C-9 show cache 
and DRAM cycles. 


TIMING CALCULATIONS 


Timing equations are found on pages D-8 and D-9. 
Only tCAS, tRAC, tCAC, tAA, tPC, and tCAP are 
different in this configuration. Actual values for 
DRAM timings are found on page D-10. 


3.5 I/O-EPROM Subsystem


A block diagram of the I/O-EPROM subsystem is
shown in Figure 3.8. This subsystem has separate
address and data busses. The address bus is 14 bits wide,
and the data bus is 16 bits wide.


The bus controller is designed with B-speed PALs
which are clocked by the CLK# signal (Figure 3.8).
This scheme raises a few unique design issues.


As shown in Figure 3.10, ADS# is now an asynchronous
signal for the state machine. The state machine cannot
reliably capture a valid ADS# unless the signal is
re-synchronized. To guarantee recognition of a valid
ADS#, two D flip-flops clocked by CLK# provide a
synchronous ADS# (or Latched ADS#) which is in phase
with the state machine.
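
The re-synchronization idea can be illustrated with a small behavioral model. This is a sketch only, assuming two cascaded D flip-flops sampled on the same clock edge; the class name and the sample sequence are hypothetical, and the model ignores setup, hold and metastability effects that the real circuit exists to handle.

```python
class TwoStageSynchronizer:
    """Behavioral model of two cascaded D flip-flops on one clock."""

    def __init__(self, idle_level=1):
        # ADS# is active low, so both stages start at the inactive level.
        self.stage1 = idle_level
        self.stage2 = idle_level

    def clock(self, async_in):
        """Advance one sampling clock edge and return the synchronized output."""
        self.stage2 = self.stage1   # second flip-flop takes the old first stage
        self.stage1 = async_in      # first flip-flop samples the async input
        return self.stage2


sync = TwoStageSynchronizer()
ads_n_samples = [1, 0, 0, 1, 1]     # hypothetical ADS# level at each clock edge
latched_ads_n = [sync.clock(v) for v in ads_n_samples]
```

In this model the synchronized output follows the input two sampling edges later, which is the latency a state machine using Latched ADS# would have to allow for.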


The second issue is the asynchronous nature of the state
machine output signals. Because the state machine runs
almost asynchronously to CLK2 (B-speed PALs also have a
long clock-to-output propagation delay), signals
generated by the state machine must be re-synchronized
before they are returned to the 386 DX. Signals that go to
I/O devices and EPROMs need no re-synchronization
since these devices are asynchronous. The signals which
require re-synchronization are BS16# and DEN#. Each
rising edge of DEN# is synchronized to CLK2 by a
J-K flip-flop as shown in Figure 3.9. This is important
to avoid bus contention after an I/O or EPROM read
cycle. BS16# is synchronized to CLK2 by D flip-flops.


EPROM and I/O cycle timings are shown on pages C-10
through C-13. The worst case is a write cycle to the
82510, which may require as many as 14 wait states.
Figure 3-9. Control Logic for I/O-EPROM Subsystem
APPENDIX B
STATE DIAGRAMS AND PAL CODES


[State diagram: RAS#, NA# generator (DRAM 1). States are encoded as
[RAS#, NA#]; the transitions are listed in state_diagram [RAS~, NA~]
of module PAGE_MODE_DRAM_CTRL_1.]

240725-51
MEMCS# = M/IO# • ADS# • [A31-0 ∈ {1FFFFFFF..00000000}]
LMEMCS# = MEMCS# + mreq


[State diagram: Precharge generator (DRAM 1). States are encoded as
[precharge, a]: (0, 0), (0, 1), (1, 1), (1, 0); the transitions are
listed in state_diagram [precharge, a] of module PAGE_MODE_DRAM_CTRL_1.]
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CAS #, READY # 


[State diagram: CAS#, READY# generator (DRAM 2). Each state shows the
CAS# and DRAMRDY# output levels; the transitions are listed in
state_diagram cstate of module PAGE_MODE_DRAM_CTRL_2.]
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CAL Generator 


[State diagrams: CAL generator and Refresh generator (DRAM 1). Refresh
states are encoded as [refresh, wait#]; the transitions are listed in
state_diagram [CAL] and state_diagram [refresh, wait~] of module
PAGE_MODE_DRAM_CTRL_1.]
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MUXOE #, REF # Generator 


[State diagrams: MUXOE#, REF# generator (DRAM 2) and T2X#, T1P# generator
(DRAM 3). Each state shows its output levels; the transitions are listed
in state_diagram muxstate of module PAGE_MODE_DRAM_CTRL_2 and in
state_diagram [T2X~, T1P~] of the DRAM 3 listing.]
[State diagrams (DRAM 3), figures 240725-58 through 240725-62: WE#, DEN#,
DTR, lwr and mreq generators. The transitions are listed in the
corresponding state_diagram sections of the DRAM 3 listing.]
module RESET_GEN flag '-r3'
title 'RESET GENERATION LOGIC - INTEL CORPORATION'
RESET_PAL device 'P16R8';
x = .X.; "ABEL don't care symbol
c = .C.; "ABEL clocking input symbol
" Inputs
CLK2 pin 1; "CLK2
RESTRIG pin 2; "signal from reset circuitry
CLK_61 pin 9; "61.44KHz clock
" Outputs
REFREQ pin 12; "REFREQ, sync 61.44KHz clock
RFQTMP pin 13; "temporary stage in sync of 61.44KHz clk
CLK~ pin 16; "CLK#
CLK pin 17; "CLK = CLK2 / 2
RESTMP pin 18; "temporary stage in generating RESET
RESET pin 19; "RESET
equations
CLK := (!CLK # (!RESTMP & RESET));
CLK~ := CLK;


RESTMP := RESTRIG;
RESET := RESTMP;
RFQTMP := CLK_61;
REFREQ := RFQTMP;


test_vectors 


([CLK2, CLK_61, RESTRIG, CLK, CLK~, RESTMP, RESET, RFQTMP, REFREQ] ->
 [CLK, CLK~, RESTMP, RESET, RFQTMP, REFREQ])


" clk generation 


PAL Codes: RESET 
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" restmp gen 


" 61.44KHz clk 
x, 
1, 
x, 
end RESET_GEN;
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ABEL(tm) 3.10 - Document Generator. 14-Feb-90 09:53 AM 
RESET GENERATION LOGIC - INTEL CORPORATION 
Equations for Module RESET_GEN


Device RESET_PAL


- Reduced Equations:
!CLK := (CLK & !RESET # CLK & RESTMP);
!CLK~ := (!CLK);
!RESTMP := (!RESTRIG);
!RESET := (!RESTMP);
!RFQTMP := (!CLK_61);
!REFREQ := (!RFQTMP);
240725-D4


PAL Codes: RESET (Continued) 
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ABEL™ 3.10—Document Generator 14-Feb-90 09:53 AM 
RESET GENERATION LOGIC - INTEL CORPORATION
Chip diagram for Module RESET_GEN


Device RESET_PAL


[Chip diagram: P16R8 pinout. Legible pins: 13 RFQTMP, 12 REFREQ, 11 REFREQ_E.]
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module ADDR_DEC flag ’-r3’ 
title 'ADDRESS DECODE LOGIC - INTEL CORPORATION'
ADDR_PAL device 'P16L8';


x = .X.; "ABEL don't care symbol
c = .C.; "ABEL clocking input symbol


" Inputs


ADS~ pin 1; "ADS#
M_IO~ pin 2; "M/IO#
A31 pin 3; "Addr bit 31
A30 pin 4; "Addr bit 30
A29 pin 5; "Addr bit 29
A6 pin 9; "Addr bit 6
mreq pin 11; "Latched memory chip select


" Outputs


MEMCS~ pin 18; "Memory chip select
_59CS~ pin 15; "8259A chip select
_510CS~ pin 14; "82510 chip select
EPROM~ pin 13; "EPROM chip select
LMEMCS~ pin 12; "Latched/unlatched memory chip select


equations


!MEMCS~ = !ADS~ & M_IO~ & !A31 & !A30 & !A29;
!LMEMCS~ = (!ADS~ & M_IO~ & !A31 & !A30 & !A29) # mreq;
!_59CS~ = !M_IO~ & !A6;
!_510CS~ = !M_IO~ & A6;
!EPROM~ = M_IO~ & A31 & A30 & A29;


test_vectors


([ADS~, M_IO~, A31, A30, A29, A6, mreq, MEMCS~] ->
 [MEMCS~, LMEMCS~, _59CS~, _510CS~, EPROM~])


[1, x, x, x, x, x, 0, 1] -> [1, 1, x, x, x];
[1, x, x, x, x, x, 1, 1] -> [1, 0, x, x, x]; "LMEMCS~
[0, 1, 0, 0, 0, x, x, x] -> [0, x, 1, 1, 1];
end ADDR_DEC;
240725-93
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ABEL(tm) 3.10 - Document Generator 14-Feb-90 09:50 AM 
ADDRESS DECODE LOGIC - INTEL CORPORATION
Equations for Module ADDR_DEC


Device ADDR_PAL


- Reduced Equations:


!MEMCS~ = (!A29 & !A30 & !A31 & !ADS~ & M_IO~);
!LMEMCS~ = (mreq # !A29 & !A30 & !A31 & !ADS~ & M_IO~);
!_59CS~ = (!A6 & !M_IO~);
!_510CS~ = (A6 & !M_IO~);
!EPROM~ = (A29 & A30 & A31 & M_IO~);


PAL Codes: Address Decoder (Continued) 
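
The address decoder equations can be checked against a few sample cycles with a short Python model. This is a sketch of the reduced equations as read from the listing, not a replacement for the ABEL source; the function name and the sample addresses are hypothetical. All outputs are active low, so 0 means selected.

```python
def decode(ads_n, m_io, address, mreq=0):
    """Model the ADDR_DEC reduced equations for one 32-bit address."""
    a31 = (address >> 31) & 1
    a30 = (address >> 30) & 1
    a29 = (address >> 29) & 1
    a6 = (address >> 6) & 1

    # !MEMCS~ = !A29 & !A30 & !A31 & !ADS~ & M_IO~
    memcs_active = (not ads_n) and m_io and not (a31 or a30 or a29)
    return {
        "MEMCS~": 0 if memcs_active else 1,
        "LMEMCS~": 0 if (memcs_active or mreq) else 1,   # mreq holds the select
        "_59CS~": 0 if (not m_io and not a6) else 1,     # I/O cycle, A6 low
        "_510CS~": 0 if (not m_io and a6) else 1,        # I/O cycle, A6 high
        "EPROM~": 0 if (m_io and a31 and a30 and a29) else 1,
    }


dram_cycle = decode(ads_n=0, m_io=1, address=0x00001000)
eprom_cycle = decode(ads_n=0, m_io=1, address=0xFFFF0000)
io_cycle = decode(ads_n=0, m_io=0, address=0x00000040)
held_cycle = decode(ads_n=1, m_io=1, address=0x00000000, mreq=1)
```

The last call shows why LMEMCS~ exists alongside MEMCS~: once ADS# goes inactive, MEMCS~ deasserts, while LMEMCS~ stays asserted for as long as mreq is High.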
ABEL(tm) 3.10 - Document Generator 14-Feb-90 09:50 AM
ADDRESS DECODE LOGIC - INTEL CORPORATION
Chip diagram for Module ADDR_DEC


Device ADDR_PAL


[Chip diagram: P16L8 pinout. Legible pins: 15 _59CS~, 14 _510CS~, 13 EPROM~, 12 LMEMCS~.]
module PAGE_MODE_DRAM_CTRL_1 flag '-r3'
title 'PAGE MODE DRAM CONTROLLER - PAL 1, INTEL CORPORATION'
PAGE1 device 'P16R8';


x = .X.; " ABEL 'don't care' symbol
c = .C.; " ABEL 'clocking input' symbol


" Inputs


CLK2 pin 1; "80386 CLK2
CLK pin 2; "Processor Clock
MEMCS~ pin 3; "Memory Chip Select
LMEMCS~ pin 4; "Latched/Unlatched Memory Chip Select
HIT~ pin 5; "DRAM Page Hit Signal
CAS~ pin 6; "Column Address Strobe
DRAMRDY~ pin 7; "DRAM Ready Signal
refreq pin 8; "Refresh Request Signal
RESET pin 9; "System Reset


" Outputs


RAS~ pin 12; "Row Address Strobe
NA~ pin 13; "Next Address Signal
precharge pin 14; "RAS Precharge Signal
a pin 15;
wait~ pin 16; "delays RAS~ until refresh address is valid
CAL pin 17; "Column Address Latch
refresh pin 18; "Refresh Signal (active once refresh is acknowledged.)
unused pin 19;


state_diagram [RAS~, NA~]


state [1, 1]: if precharge then [1, 1] else
              if (CLK & refresh & wait~) then [0, 1] else
              if (CLK & !LMEMCS~ & !refresh) then [0, 0] else [1, 1];
state [0, 0]: if RESET then [1, 1] else
              if CAS~ then [0, 0] else
              if (CLK & !MEMCS~ & HIT~ #
                  CLK & MEMCS~ & !DRAMRDY~ #
                  CLK & refresh & !DRAMRDY~) then [1, 1] else [0, 0];
state [0, 1]: if RESET then [1, 1] else
              if (CLK & !refresh) then [1, 1] else [0, 1];
state [1, 0]: goto [1, 1];


state_diagram [precharge, a]


state [0, 0]: if (!RAS~) then [0, 1] else [0, 0];
state [0, 1]: if (RESET) then [0, 0] else
              if (RAS~) then [1, 1] else [0, 1];
state [1, 1]: goto [1, 0];
state [1, 0]: if (CAS~) then [0, 0] else [1, 0];
240725-94 
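
The [precharge, a] state machine can be stepped in software to see how long RAS precharge is held after RAS~ goes inactive. The sketch below follows the transitions as read from the listing; the function name and the input sequence are hypothetical, and one list entry stands for one clock edge of the PAL.

```python
def next_precharge_state(state, ras_n, cas_n, reset=0):
    """Return the next (precharge, a) state for one PAL clock edge."""
    if state == (0, 0):
        return (0, 1) if not ras_n else (0, 0)    # wait for RAS~ to go active
    if state == (0, 1):
        if reset:
            return (0, 0)
        return (1, 1) if ras_n else (0, 1)        # RAS~ released: start precharge
    if state == (1, 1):
        return (1, 0)                             # unconditional second step
    return (0, 0) if cas_n else (1, 0)            # state (1, 0): wait for CAS~ high


# Hypothetical (RAS~, CAS~) levels on successive clock edges.
inputs = [(0, 1), (0, 0), (1, 1), (1, 1), (1, 1)]
state = (0, 0)
trace = []
for ras_n, cas_n in inputs:
    state = next_precharge_state(state, ras_n, cas_n)
    trace.append(state)

precharge_clocks = sum(s[0] for s in trace)
```

With this input sequence the precharge output is High for two clock edges. Since the [RAS~, NA~] machine stays in state [1, 1] while precharge is High, a new RAS~ cycle cannot start during that time.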
state_diagram [CAL]


state [1]: if (!NA~ & CAS~) then [0] else [1];
state [0]: if (RESET) then [1] else
           if (!CAS~) then [1] else [0];


state_diagram [refresh, wait~]


state [0, 0]: if (CLK & refreq) then [1, 0] else [0, 0];
state [1, 0]: if (RESET) then [0, 0] else
              if (CLK & MEMCS~) then [1, 1] else [1, 0];
state [1, 1]: if (RESET) then [0, 0] else
              if (CLK & NA~ & !RAS~) then [0, 1] else [1, 1];
state [0, 1]: if (RESET) then [0, 0] else
              if (CLK & !refreq) then [0, 0] else [0, 1];


test_vectors 


([CLK2, CLK, MEMCS~, LMEMCS~, HIT~, CAS~, DRAMRDY~, refreq, RESET] ->
 [RAS~, NA~, precharge, CAL, refresh])


15. Xy Ky Mos Ke 1g Me 1} -> [1, 15° Xs. ly 0]; 
[Cy yy Ke Ka By Ty - Ky FE) => Te ly e101; 
[c, 1, 1, 1, x, 1, 1, 0, 0] -> [1, 1, x, 1, 0]; "Ti, phase 1
[c, 0, 1, 1, x, 1, 1, 0, 0] -> [1, 1, x, 1, 0]; " phase 2
[c, 1, 1, 1, x, 1, 1, 0, 0] -> [1, 1, x, 1, 0]; "T1, Read, Non-Pipelined
[c, 0, 0, 0, x, 1, 1, 0, 0] -> [1, 1, 0, 1, 0];
[c, 1, 0, 0, x, 1, 1, 0, 0] -> [0, 0, 0, 1, 0]; "T2
[c, 0, 1, 0, x, 1, 1, 0, 0] -> [0, 0, 0, 0, 0];
[c, 1, 1, 0, x, 1, 1, 0, 0] -> [0, 0, 0, 0, 0]; "T2P
[c, 0, 0, 0, x, 0, 1, 0, 0] -> [0, 0, 0, 1, 0]; " Page Hit
[c, 1, 0, 0, 0, 0, 1, 0, 0] -> [0, 0, 0, 1, 0]; "T2P
[c, 0, 0, 0, 0, 0, 0, 0, 0] -> [0, 0, 0, 1, 0];
[c, 1, 0, 0, 0, 0, 0, 0, 0] -> [0, 0, 0, 1, 0]; "T1P, Read, Pipelined
[c, 0, 1, 0, 0, 1, 1, 0, 0] -> [0, 0, 0, 0, 0];
[c, 1, 1, 0, 0, 0, 1, 0, 0] -> [0, 0, 0, 1, 0]; "T2P
[c, 0, 0, 0, 0, 0, 0, 0, 0] -> [0, 0, 0, 1, 0];
[c, 1, 0, 0, 0, 0, 0, 0, 0] -> [0, 0, 0, 1, 0]; "T1P, Write
[c, 0, 1, 0, 0, 1, 1, 0, 0] -> [0, 0, 0, 0, 0];
[c, 1, 1, 0, 0, 1, 1, 0, 0] -> [0, 0, 0, 0, 0]; "T2P
[c, 0, 0, 0, 0, 1, 1, 0, 0] -> [0, 0, 0, 0, 0];
[c, 1, 0, 0, 0, 1, 1, 0, 0] -> [0, 0, 0, 0, 0]; "T2P
[c, 0, 0, 0, 0, 0, 0, 0, 0] -> [0, 0, 0, 1, 0];
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ABEL({tm) 3.10 - Document Generator 15-Feb-90 05:47 PM 
PAGE MODE DRAM CONTROLLER - PAL 1, INTEL CORPORATION
Equations for Module PAGE_MODE_DRAM_CTRL_1


Device PAGE1


- Reduced Equations:


!RAS~ := (NA~ & !RAS~ & !RESET & refresh
        # DRAMRDY~ & !HIT~ & !NA~ & !RAS~ & !RESET
        # DRAMRDY~ & MEMCS~ & !NA~ & !RAS~ & !RESET
        # !HIT~ & !MEMCS~ & !NA~ & !RAS~ & !RESET & !refresh
        # !CLK & !RAS~ & !RESET
        # CAS~ & !NA~ & !RAS~ & !RESET
        # CLK & !LMEMCS~ & NA~ & RAS~ & !precharge & !refresh
        # CLK & NA~ & RAS~ & !precharge & refresh & wait~);


!NA~ := (DRAMRDY~ & !HIT~ & !NA~ & !RAS~ & !RESET
       # DRAMRDY~ & MEMCS~ & !NA~ & !RAS~ & !RESET
       # !HIT~ & !MEMCS~ & !NA~ & !RAS~ & !RESET & !refresh
       # !CLK & !NA~ & !RAS~ & !RESET
       # CAS~ & !NA~ & !RAS~ & !RESET
       # CLK & !LMEMCS~ & NA~ & RAS~ & !precharge & !refresh);


!precharge := (CAS~ & !a
             # !RAS~ & !precharge
             # RESET & !precharge
             # !a & !precharge);


!a := (precharge # RESET & a # RAS~ & !a);
!CAL := (!CAL & CAS~ & !RESET # CAL & CAS~ & !NA~);


!refresh := (!refresh & wait~
           # CLK & NA~ & !RAS~ & wait~
           # RESET & refresh
           # !refreq & !refresh
           # !CLK & !refresh);


!wait~ := (CLK & !refreq & !refresh
         # !MEMCS~ & !wait~
         # !CLK & !wait~
         # RESET
         # !refresh & !wait~);
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ABEL™ 3.10—Document Generator 15-Feb-90 05:47 PM 
PAGE MODE DRAM CONTROLLER - PAL 1, INTEL CORPORATION
Chip diagram for Module PAGE_MODE_DRAM_CTRL_1


Device PAGE1
P16R8


[Chip diagram: P16R8 pinout. Legible pins: 3 MEMCS~, 4 LMEMCS~, 7 DRAMRDY~, 8 refreq, 11 RAS~_E.]
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ABEL(tm) 3.10 : Document Generator 1§-Feb-90 06:16 PM 
PAGE MODE DRAM CONTROLLER - PAL 2, INTEL CORPORATION
Equations for Module PAGE_MODE_DRAM_CTRL_2


Device PAGE2


- Reduced Equations:


!CAS~ := (CAS~ & CLK & DRAMRDY~ & !RESET & !a & !b
        # !CACHE & DRAMRDY~ & !RESET & a & !b & !lwr
        # DRAMRDY~ & !RAS~ & !RESET & a & !b & !lwr
        # !CAS~ & !CLK & !RESET & a & b
        # !CAS~ & DRAMRDY~ & !RESET & a
        # CAS~ & CLK & DRAMRDY~ & !MUXOE~ & !RAS~ & a & b);


!DRAMRDY~ := (CAS~ & CLK & DRAMRDY~ & !RESET & !a & !b
            # !CAS~ & !CLK & !DRAMRDY~ & !RESET & a & b
            # !CAS~ & CLK & DRAMRDY~ & !RESET & a & !b
            # !CAS~ & CLK & DRAMRDY~ & RESET & a & lwr
            # !CACHE & !CAS~ & CLK & DRAMRDY~ & !RESET & a);


!a := (CAS~ & !CLK & DRAMRDY~ & !RESET & !a & !b
     # CAS~ & CLK & DRAMRDY~ & !RAS~ & !RESET & a & !b & lwr);


!b := (CAS~ & !CLK & DRAMRDY~ & !RESET & !a & !b
     # CAS~ & DRAMRDY~ & RAS~ & !RESET & a & !b
     # CACHE & CAS~ & DRAMRDY~ & !RESET & a & !b
     # CAS~ & DRAMRDY~ & !RESET & a & !b & lwr
# !CAS~ & CLK & !DRAMRDY~ & !MEMCS~ & !RAS~ & !RESET &a&b & 
Bight 
# !CAS~ & ICLK & DRAMRDY~ & !RESET & a & !b 
# CACHE & !CAS~ & CLK & DRAMRDY~ & !RESET & a & b & !Iwr); 


IMUXOE~ := (!MUXOE~ & !REF~ 
# REF~ & Ir 
# MUXOE~ & RESET 
# DRAMRDY~ & !MUXOE~ & !RAS~ 
# !MEMCS~ & !MUXOE~ & RAS~ . 
# !MUXOE~ & !refresh 
# ICLK & !MUXOE~); 


!REF~ := (MUXOE~ & !RESET & r); 


ty := (MUXOE~ & !REF~ & !RESET & !r 
# CLK & MUXOE~ & IRAS~ & IREF~ & RESET); 
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PAGE MODE DRAM CONTROLLER—PAL 2, INTEL CORPORATION 
Chip diagram for Module PAGE__MODE__DRAM__CTRL__2 


Device PAGE2 
P16R8 
module PAGE_MODE_DRAM_CTRL_2 flag '-r3'
title 'PAGE MODE DRAM CONTROLLER - PAL 2, INTEL CORPORATION'
PAGE2 device 'P16R8';

x = .X.; " ABEL 'don't care' symbol
c = .C.; " ABEL 'clocking input' symbol
" Inputs
CLK2 pin 1; "80386 CLK2
CLK pin 2; "Processor Clock
RAS~ pin 3; "Row Address Strobe
MEMCS~ pin 4; "Memory Chip Select
HIT~ pin 5; "DRAM Page Hit Signal (unused)
CACHE pin 6; "Hi when 385 is used; otherwise, Low
lwr pin 7; "Latched Write/Read
refresh pin 8; "Refresh Signal
RESET pin 9; "System Reset
" Outputs
CAS~ pin 12; "Column Address Strobe
DRAMRDY~ pin 13; "DRAM Ready
a pin 14;
b pin 15;
unused pin 16;
MUXOE~ pin 17; "DRAM Address Multiplexer Output Enable
REF~ pin 18; "Enables refresh counter instead of MUX
r pin 19;
cstate = [CAS~, DRAMRDY~, a, b];
idle       = [1, 1, 1, 1]; "Idle
start      = [0, 1, 1, 1]; "CAS~ Active
wait       = [0, 1, 1, 0]; "CAS~ Active, Wait State
active     = [0, 0, 1, 1]; "CAS~ and DRAMRDY~ Active
inactive_1 = [1, 1, 1, 0]; "Page Hit, CAS~ and DRAMRDY~ Inactive
inactive_2 = [1, 1, 0, 0]; "Page Hit, CAS~ and DRAMRDY~ Inactive
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[MUXOE~, REF~, rj]; 
oe | , 1]; "Multiplexer Outputs Enabled 


muxstate 
enabled = [ 


oO tl 
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disabled 1 ; "Multiplexer Outputs Disabled 
disabled 2 ; "Refresh Address Enabled 
disabled 3 ; "Wait for RAS# 

disabled 4 ; "Refresh Address Disabled 


illegal y 
illegal x 


state_diagram cstate


state idle:       if (CLK & !RAS~ & !MUXOE~) then start else idle;
state start:      if RESET then idle else
                  if (CLK & !CACHE # CLK & lwr) then active else
                  if CLK then wait else start;
state wait:       if RESET then idle else
                  if CLK then active else wait;
state active:     if RESET then idle else
                  if (CLK & !MEMCS~ & RAS~ #
                      CLK & MEMCS~ #
                      CLK & refresh) then idle else
                  if (CLK & !MEMCS~ & !RAS~) then inactive_1
                  else active;
state inactive_1: if RESET then idle else
                  if (CLK & !RAS~ & lwr) then inactive_2 else
                  if (!RAS~ & !lwr & CACHE) then start else
                  if (!lwr & !CACHE) then wait else
                  inactive_1;
state inactive_2: if RESET then idle else
                  if CLK then active else inactive_2;
state i : goto idle; 
state i : goto idle; 
state i ; goto idle; 
state i : goto idle; 
state j : goto idle; 
state ij : goto idle; 
state i : goto idle; 
state i : goto idle; 
state j i: goto idle; 
state j _j: goto idle; 


state_diagram muxstate


state enabled:    if (CLK & refresh & RAS~ & MEMCS~ #
                      CLK & refresh & !RAS~ & !DRAMRDY~) then
                  disabled_1 else enabled;
state disabled_1: if (RESET) then enabled else disabled_2;
state disabled_2: if (RESET) then enabled else
                  if (CLK & !RAS~) then disabled_3 else disabled_2;
state disabled_3: if (RESET) then enabled else disabled_4;
state disabled_4: goto enabled;
state illegal_z:  goto enabled;
state illegal_y:  goto enabled;
state illegal_x:  goto enabled;
module PAGE_MODE_DRAM_CTRL_3 flag '-r3'
title 'PAGE MODE DRAM CONTROLLER - PAL 3, INTEL CORPORATION'
PAGE3 device 'P16R8';


x = .X.; " ABEL 'don't care' symbol
c = .C.; " ABEL 'clocking input' symbol


" Inputs


CLK2 pin 1; "80386 CLK2
CLK pin 2; "Processor Clock
ADS~ pin 3; "Address Strobe
MEMCS~ pin 4; "Memory Chip Select
WR pin 5; "Write/Read
READY~ pin 6; "System Ready
DRAMRDY~ pin 7; "DRAM Ready
unused1 pin 8;
RESET pin 9; "System Reset


" Outputs


T2X~ pin 12; "active during T2, T2p, and T2i
T1P~ pin 13; "active during T1p
WE~ pin 14; "DRAM Write Enable
DEN~ pin 15; "DRAM Data Bus Transceiver Enable
DTR pin 16; "DRAM Data Bus Transceiver R/W# Direction signal
lwr pin 17; "Latched Write/Read
mreq pin 18; "Latched Memory Chip Select
unused2 pin 19;


state_diagram [T2X~, T1P~]


state [1, 1]: if (CLK & !ADS~) then [0, 1] else [1, 1];
state [0, 1]: if RESET then [1, 1] else
              if (CLK & !ADS~ & !READY~) then [1, 0] else
              if (CLK & ADS~ & !READY~) then [1, 1] else [0, 1];
state [1, 0]: if RESET then [1, 1] else
              if (CLK) then [0, 1] else [1, 0];
state [0, 0]: goto [1, 1];


state_diagram [WE~]


state [1]: if (CLK & !MEMCS~ & WR & T2X~ #
               lwr & !T1P~) then [0] else [1];
state [0]: if (RESET) then [1] else
           if (CLK & !READY~) then [1] else [0];


state_diagram [DEN~]


state [1]: if (CLK & !MEMCS~ & !WR & T2X~ #
               mreq & !T2X~ #
               CLK & mreq & !T1P~) then [0] else [1];
240725-A4 
state [0]: if RESET then [1] else
           if (CLK & !READY~) then [1] else [0];


state_diagram [DTR]


state [1]: if (CLK & !MEMCS~ & WR & T2X~ #
               lwr & !T1P~) then [0] else [1];
state [0]: if (RESET) then [1] else
           if (!CLK & DEN~ & !lwr) then [1] else [0];
state_diagram [lwr]
state [0]: if (CLK & !MEMCS~ & WR) then [1] else [0];
state [1]: if (RESET) then [0] else
           if (!READY~ & MEMCS~ #
               !READY~ & !WR) then [0] else [1];
state_diagram [mreq]
state [0]: if (CLK & !MEMCS~) then [1] else [0];
state [1]: if RESET then [0] else
           if (!READY~ & MEMCS~) then [0] else [1];
test_vectors 


([CLK2, CLK, ADS~, WR, MEMCS~, READY~, RESET] ->
 [T2X~, T1P~, DEN~, lwr, WE~, DTR, mreq])

[c, x, x, x, x, x, 1] -> [1, 1, 1, 0, 1, 1, x];
[c, x, x, x, x, x, 1] -> [1, 1, 1, 0, 1, 1, 0];
[c, 1, 1, x, 1, 1, 0] -> [1, 1, 1, 0, 1, 1, 0];
[c, 0, 1, x, 1, 1, 0] -> [1, 1, 1, 0, 1, 1, 0]; "Ti
[c, 1, 1, x, 1, 1, 0] -> [1, 1, 1, 0, 1, 1, 0];
[c, 0, 0, 0, 0, 1, 0] -> [1, 1, 1, 0, 1, 1, 0]; "T1
[c, 1, 0, 0, 0, 1, 0] -> [0, 1, 0, 0, 1, 1, 1];
[c, 0, 1, 0, 1, 1, 0] -> [0, 1, 0, 0, 1, 1, 1]; "T2
[c, 1, 1, 0, 1, 1, 0] -> [0, 1, 0, 0, 1, 1, 1];
[c, 0, 0, 0, 0, 1, 0] -> [0, 1, 0, 0, 1, 1, 1]; "T2
[c, 1, 0, 0, 0, 1, 0] -> [0, 1, 0, 0, 1, 1, 1];
[c, 0, 0, 0, 0, 0, 0] -> [0, 1, 0, 0, 1, 1, 1]; "T2P
[c, 1, 0, 0, 0, 0, 0] -> [1, 0, 1, 0, 1, 1, 1];
[c, 0, 1, 0, 1, 1, 0] -> [1, 0, 1, 0, 1, 1, 1]; "T1P
[c, 1, 1, 0, 1, 1, 0] -> [0, 1, 0, 0, 1, 1, 1];
[c, 0, 0, 1, 0, 0, 0] -> [0, 1, 0, 0, 1, 1, 1]; "T2P
{c, 1, 0, 1, 0, 0, 0) -> A ae oe Poe ee 1}; 
[{c, 0, | ee ae 0] -> {l, nods 1, 0, 0, 1); "TIP 
{c, 1, i 1, 1, 1, 0} a2 [0, > 0, 1, 0, 0, 1); 
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ABEL(tm) 3.10 - Document Generator 14-Feb-90 09:54 AM 
PAGE MODE DRAM CONTROLLER - PAL 3, INTEL CORPORATION
Equations for Module PAGE_MODE_DRAM_CTRL_3


Device PAGE3


- Reduced Equations:


!T2X~ := (CLK & !RESET & !T1P~ & T2X~
        # READY~ & !RESET & T1P~ & !T2X~
        # !CLK & !RESET & T1P~ & !T2X~
        # !ADS~ & CLK & T1P~ & T2X~);


!T1P~ := (!CLK & !RESET & !T1P~ & T2X~
        # !ADS~ & CLK & !READY~ & !RESET & T1P~ & !T2X~);


!WE~ := (READY~ & !RESET & !WE~
       # !CLK & !RESET & !WE~
       # !T1P~ & WE~ & lwr
       # CLK & !MEMCS~ & T2X~ & WE~ & WR);


!DEN~ := (!DEN~ & READY~ & !RESET
        # !CLK & !DEN~ & !RESET
        # CLK & DEN~ & !T1P~ & mreq
        # DEN~ & !T2X~ & mreq
        # CLK & DEN~ & !MEMCS~ & T2X~ & !WR);


!DTR := (!DTR & !RESET & lwr
       # !DEN~ & !DTR & !RESET
       # CLK & !DTR & !RESET
       # DTR & !T1P~ & lwr
       # CLK & DTR & !MEMCS~ & T2X~ & WR);


!lwr := (!READY~ & !WR
       # MEMCS~ & !READY~
       # RESET & lwr
       # !WR & !lwr
       # MEMCS~ & !lwr
       # !CLK & !lwr);


!mreq := (MEMCS~ & !READY~
        # RESET & mreq
        # MEMCS~ & !mreq
        # !CLK & !mreq);
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ABEL™ 3.10—Document Generator | 14-Feb-90 09:54 AM 
PAGE MODE DRAM CONTROLLER - PAL 3, INTEL CORPORATION
Chip diagram for Module PAGE_MODE_DRAM_CTRL_3


Device PAGE3
P16R8


[Chip diagram: P16R8 pinout. Legible pin: 11 T2X~_E.]


240725-67 


PAL Codes: DRAM 3 (Continued) 




module PAGE_MODE_DRAM_CTRL_4 flag '-r3'


title 'PAGE MODE DRAM CONTROLLER - PAL 4, INTEL CORPORATION'
PAGE4 device 'P16R8';
x = .X.; " ABEL 'don't care' symbol
c = .C.; " ABEL 'clocking input' symbol
" Inputs
CLOCK pin 1;
D0 pin 2;
D1 pin 3;
D2 pin 4;
D3 pin 5;
D4 pin 6;
D5 pin 7;
D6 pin 8;
D7 pin 9;
OE pin 11;
" Outputs
A0 pin 12;
A1 pin 13;
A2 pin 14;
A3 pin 15;
A4 pin 16;
A5 pin 17;
A6 pin 18;
A7 pin 19;


addr = [A7..A0];
equations
addr := addr + 1;


end PAGE_MODE_DRAM_CTRL_4;
240725-A9 


PAL Codes: DRAM 4 
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Equations for Module PAGE MODE DRAM CTRL 4 


Device PAGE4


- Reduced Equations:


!A7 := (A0 & A1 & A2 & A3 & A4 & A5 & A6 & A7
      # !A0 & !A7
      # !A1 & !A7
      # !A2 & !A7
      # !A3 & !A7
      # !A4 & !A7
      # !A5 & !A7
      # !A6 & !A7);


!A6 := (A0 & A1 & A2 & A3 & A4 & A5 & A6
      # !A0 & !A6
      # !A1 & !A6
      # !A2 & !A6
      # !A3 & !A6
      # !A4 & !A6
      # !A5 & !A6);


!A5 := (A0 & A1 & A2 & A3 & A4 & A5
      # !A0 & !A5
      # !A1 & !A5
      # !A2 & !A5
      # !A3 & !A5
      # !A4 & !A5);


!A4 := (A0 & A1 & A2 & A3 & A4
      # !A0 & !A4
      # !A1 & !A4
      # !A2 & !A4
      # !A3 & !A4);

!A3 := (A0 & A1 & A2 & A3 # !A0 & !A3 # !A1 & !A3 # !A2 & !A3);
!A2 := (A0 & A1 & A2 # !A0 & !A2 # !A1 & !A2);

!A1 := (A0 & A1 # !A0 & !A1);


!A0 := (A0);


240725-B0 


PAL Codes: DRAM 4 (Continued) 
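
The reduced equations for PAGE4 can be checked against a plain 8-bit increment. The sketch below assumes the pattern visible in the listing, in which each bit An is driven Low on the next clock when either An and all lower bits are High, or An is Low and at least one lower bit is Low; the function name is hypothetical.

```python
def next_count_from_equations(value, width=8):
    """Apply the reduced toggle equations bit by bit and return the next count."""
    bits = [(value >> i) & 1 for i in range(width)]
    nxt = 0
    for i in range(width):
        lower_all_ones = all(bits[:i])   # True for bit 0, which has no lower bits
        # !An := An & (all lower bits) # !An & !(all lower bits)
        next_is_low = (bits[i] and lower_all_ones) or (
            not bits[i] and not lower_all_ones
        )
        if not next_is_low:
            nxt |= 1 << i
    return nxt


# Compare the equations with modulo-256 addition for every possible count.
matches_increment = all(
    next_count_from_equations(v) == (v + 1) % 256 for v in range(256)
)
```

This matches the source equation for the module, in which the registered address simply advances by one on each clock, as a refresh address counter needs to.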
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Chip diagram for Module PAGE__MODE__DRAM__CTRL__4 . 


Device PAGE4 
P16R8 


240725-68 


end of module PAGE__MODE__DRAM__CTRL__4 
PAL Codes: DRAM 4 (Continued) 
module IO_CTRL_1 flag '-r3'
title 'IO BUS CONTROLLER - PAL 1, INTEL CORPORATION'


IO1 device 'P16R4';

x = .X.; " ABEL 'don't care' symbol
c = .C.; " ABEL 'clocking input' symbol
" Inputs
CLK pin 1; "Processor Clock
RESET pin 2; "System Reset
MRDC~ pin 3; "Memory (EPROM) Read Command
IORC~ pin 4; "I/O Read Command
IOWC~ pin 5; "I/O Write Command
INTA~ pin 6; "Interrupt Acknowledge
DEN~ pin 7; "I/O Bus Data Transceiver Enable
IORDY~ pin 8; "I/O-EPROM Ready
L510CS~ pin 9; "82510 Chip Select
OEN~ pin 11; "PAL Output Enable
L59CS~ pin 12; "8259A-2 Chip Select
LEPROM~ pin 13; "EPROM Chip Select
unused_0 pin 18;
unused_1 pin 19;


" Outputs
delay pin 14;
s2 pin 15;
s1 pin 16;
s0 pin 17;
dstate = [delay, s2, s1, s0];
idle    = [1, 1, 1, 1];
start   = [1, 1, 1, 0];
wait_14 = [1, 0, 1, 0];
wait_13 = [1, 0, 1, 1];
wait_12 = [1, 0, 0, 0];
wait_11 = [1, 1, 0, 0];
wait_10 = [1, 1, 0, 1];
active  = [0, 1, 1, 1];


state_diagram dstate


state idle:    if (!DEN~ & !MRDC~ # !DEN~ & !IORC~ #
                   !DEN~ & !IOWC~ # !DEN~ & !INTA~) then start
               else idle;
state start:   if (!L510CS~ & !IOWC~) then wait_14 else
               if (!L510CS~ & !IORC~) then wait_13 else
               if (!L59CS~ & !IOWC~) then wait_11 else
               if (!LEPROM~ # !L59CS~ & !IORC~ # !INTA~) then wait_10;
state wait_14: goto wait_13;


240725-B1


state wait_13: goto wait_12;
state wait_12: goto wait_11;
state wait_11: goto wait_10;
state wait_10: goto active;


state active:  if !IORDY~ then idle else active;


end IO_CTRL_1;


240725-B2


PAL Codes: IO-1
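
The way the dstate machine stretches a cycle can be sketched in Python. The model below follows the branch order of the start state and the chain of goto transitions as read from the listing; the function names are hypothetical, and the result is the number of PAL clock edges between the start state and the active state, not a count of 386 DX wait states.

```python
# Order in which the delay states are visited on the way to 'active'.
CHAIN = ["wait_14", "wait_13", "wait_12", "wait_11", "wait_10", "active"]


def first_wait_state(l510cs_n, l59cs_n, leprom_n, iorc_n, iowc_n, inta_n):
    """Return the state entered from 'start' (all inputs are active low)."""
    if not l510cs_n and not iowc_n:
        return "wait_14"                      # 82510 write: longest delay
    if not l510cs_n and not iorc_n:
        return "wait_13"                      # 82510 read
    if not l59cs_n and not iowc_n:
        return "wait_11"                      # 8259A write
    if not leprom_n or (not l59cs_n and not iorc_n) or not inta_n:
        return "wait_10"                      # EPROM, 8259A read, or INTA
    return None                               # no branch of the listing applies


def clocks_from_start_to_active(entry_state):
    """Count clock edges: one to leave 'start', then one per goto in the chain."""
    return 1 + CHAIN.index("active") - CHAIN.index(entry_state)


write_510 = first_wait_state(0, 1, 1, 1, 0, 1)
read_eprom = first_wait_state(1, 1, 0, 1, 1, 1)
```

The 82510 write enters the chain at its first state and so takes the most clock edges to reach active, which agrees with the text naming that cycle as the worst case.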


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Equations for Module I0 CTRL_1 


Device IO1


- Reduced Equations:


!delay := (IORDY~ & !delay & s0 & s1 & s2 # delay & s0 & !s1 & s2);


!s2 := (delay & s1 & !s2
      # !IORC~ & !L510CS~ & delay & !s0 & s1
      # !IOWC~ & !L510CS~ & delay & !s0 & s1);


!s1 := (delay & !s0 & !s1
      # delay & s0 & s1 & !s2
      # !INTA~ & IORC~ & IOWC~ & delay & !s0 & s2
      # IORC~ & IOWC~ & !LEPROM~ & delay & !s0 & s2
      # !IORC~ & L510CS~ & !L59CS~ & delay & !s0 & s2
      # !INTA~ & L510CS~ & delay & !s0 & s2
      # L510CS~ & !LEPROM~ & delay & !s0 & s2
      # !IOWC~ & L510CS~ & !L59CS~ & delay & !s0 & s2);


!s0 := (delay & !s0 & !s1 & !s2
      # delay & s0 & s1 & !s2
      # !IOWC~ & !L59CS~ & delay & !s0 & s1 & s2
      # !IOWC~ & !L510CS~ & delay & !s0 & s1 & s2
      # !DEN~ & !INTA~ & delay & s0 & s1
      # !DEN~ & !IOWC~ & delay & s0 & s1
      # !DEN~ & !IORC~ & delay & s0 & s1
      # !DEN~ & !MRDC~ & delay & s0 & s1);


240725-B3
PAL Codes: IO-1 (Continued)
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Chip diagram for Module IO__CTRL__1 


Device IO1


    CLK      1
    RESET    2     19  unused_1
    MRDC~    3     18  unused_0
    IORC~    4     17  s0
    IOWC~    5     16  s1
    INTA~    6     15  s2
    DEN~     7     14  delay
    IORDY~   8     13  LEPROM~
    L510CS~  9     12  L59CS~
                   11  OEN~


end of module IO__CTRL_ 1 
PAL Codes: 10-1 (Continued) 
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15-Feb-90 06:40 PM 


module IO_CTRL_2 flag '-r3'
title 'IO BUS CONTROLLER - PAL 2, INTEL CORPORATION'


IO2 device 'P16R6';
x = .X.; " ABEL 'don't care' symbol
c = .C.; " ABEL 'clocking input' symbol
" Inputs

CLK pin 1; "Processor Clock
RESET pin 2; "System Reset
LMIO pin 3; "Latched M/IO#
LDC pin 4; "Latched D/C#
LWR pin 5; "Latched W/R#
LALE pin 6; "Latched ALE
L510CS~ pin 7; "82510 Chip Select
L59CS~ pin 8; "8259A-2 Chip Select
LEPROM~ pin 9; "EPROM Chip Select
OEN~ pin 11; "PAL Output Enable
rdy~ pin 12; "I/O-EPROM Ready (n-1)
rdy510~ pin 19; "I/O-EPROM Ready (n-2)

" Outputs

recovery pin 13; "I/O Recovery Time
s1 pin 14;
s0 pin 15;
IORC~ pin 16; "I/O Read Command
IOWC~ pin 17; "I/O Write Command
MRDC~ pin 18; "Memory (EPROM) Read Command
rstate = [recovery, s1, s0];
idle       = [0, 1, 0];
active     = [0, 1, 1];
inactive_0 = [1, 1, 1];
inactive_1 = [1, 0, 1];
inactive_2 = [1, 0, 0];
inactive_3 = [1, 1, 0];
illegal_a  = [0, 0, 0];
illegal_b  = [0, 0, 1];


state_diagram rstate


state idle:       if (!IORC~ # !IOWC~) then active else idle;
state active:     if (IORC~ # IOWC~) then inactive_0 else active;
state inactive_0: goto inactive_1;
state inactive_1: goto inactive_2;
state inactive_2: goto inactive_3;
state inactive_3: goto idle;
state illegal_a:  goto idle;
state illegal_b:  goto idle;


state_diagram [IOWC~]
state [1]: if (!recovery & !LMIO & LDC & LWR & (!L510CS~ # !L59CS~))
           then [0] else [1];
state [0]: if RESET then [1] else
           if (!L510CS~ & !rdy510~ # !rdy~) then [1] else [0];


state_diagram [IORC~]
state [1]: if (!recovery & !LMIO & LDC & !LWR & (!L510CS~ # !L59CS~))
           then [0] else [1];
state [0]: if RESET then [1] else
           if !rdy~ then [1] else [0];


state_diagram [MRDC~]
state [1]: if (LALE & LMIO & !LWR & !LEPROM~) then [0] else [1];
state [0]: if RESET then [1] else
           if !rdy~ then [1] else [0];
end IO_CTRL_2;


PAL Codes: IO-2
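
The rstate machine that generates the I/O recovery time can be traced with a few lines of Python. The sketch follows the transitions as they appear in the listing and in the reduced equations, including the active-state test written as IORC~ # IOWC~; the function name and the command sequence are hypothetical.

```python
def next_rstate(state, iorc_n, iowc_n):
    """Return the next rstate for one clock edge (commands are active low)."""
    if state == "idle":
        return "active" if (not iorc_n or not iowc_n) else "idle"
    if state == "active":
        # As written in the listing: leave when either command is High.
        return "inactive_0" if (iorc_n or iowc_n) else "active"
    chain = {
        "inactive_0": "inactive_1",
        "inactive_1": "inactive_2",
        "inactive_2": "inactive_3",
    }
    return chain.get(state, "idle")   # inactive_3 and illegal states go to idle


# Hypothetical I/O write: IOWC~ Low for three clock edges, then released.
commands = [(1, 0)] * 3 + [(1, 1)] * 5
state = "idle"
trace = []
for iorc_n, iowc_n in commands:
    state = next_rstate(state, iorc_n, iowc_n)
    trace.append(state)

recovery_clocks = sum(1 for s in trace if s.startswith("inactive"))
```

In this trace the recovery output is High for the four inactive states. Because the IORC~ and IOWC~ equations both require recovery to be Low, a new I/O command cannot start until the chain has returned to idle.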
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10 BUS CONTROLLER - PAL 2, INTEL CORPORATION 
Equations for Module IO_CTRL_2


Device IO2


- Reduced Equations:
!recovery := (!recovery & !s1 # !IORC~ & !IOWC~ & !recovery # !s0 & s1);


!s1 := (recovery & s0);


!s0 := (recovery & !s0 # !s1 # IORC~ & IOWC~ & !s0);


!IOWC~ := (!IOWC~ & !RESET & rdy510~ & rdy~
         # !IOWC~ & L510CS~ & !RESET & rdy~
         # IOWC~ & !L59CS~ & LDC & !LMIO & LWR & !recovery
         # IOWC~ & !L510CS~ & LDC & !LMIO & LWR & !recovery);


!IORC~ := (!IORC~ & !RESET & rdy~
         # IORC~ & !L59CS~ & LDC & !LMIO & !LWR & !recovery
         # IORC~ & !L510CS~ & LDC & !LMIO & !LWR & !recovery);


!MRDC~ := (!MRDC~ & !RESET & rdy~
         # LALE & !LEPROM~ & LMIO & !LWR & MRDC~);


PAL Codes: IO-2 (Continued)
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IO BUS CONTROLLER—PAL 2, INTEL CORPORATION 
Chip diagram for Module IO__CTRL__2 


Device IO2 


19 Jrdy510~ 
18 [J MRDC~ 
17 J 1Owc~ 
16 JIORC~ 


L510CS~ (37 
LS59CS~ LJ8 132 J recovery 
LEPROM~ (J 9 | 
. mate 


240725-70 
end of module IO__CTRL__2 | 


PAL Codes: 10-2 (Continued) 
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module IO_CTRL_3 flag ’-r3’ 
title 'IO BUS CONTROLLER - PAL 2, INTEL CORPORATION'
IO3 device 'P16R6';


x = .X.; " ABEL 'don't care' symbol
c = .C.; " ABEL 'clocking input' symbol


" Inputs


CLK pin ; "Processor Clock
RESET pin ; "System Reset
LMIO pin ; "Latched M/IO#
LDC pin ; "Latched D/C#
LWR pin ; "Latched W/R#
LALE pin ; "Latched ALE
L510CS~ pin ; "82510 Chip Select
L59CS~ pin ; "8259A-2 Chip Select
LEPROM~ pin ; "EPROM Chip Select
OEN~ pin ; "PAL Output Enable
rdy~ pin ; "I/O-EPROM Ready (n-1)
IORDY~ pin ; "I/O-EPROM Ready


" Outputs


INTA~ pin ; "Interrupt Acknowledge
st0 pin ;
DEN~ pin ; "I/O Bus Transceiver Enable
st1 pin ;
DTR pin ; "I/O Bus Transceiver Direction
st2 pin ;


state_diagram [INTA~, st0]


state [1, 1]: if (!LMIO & !LDC & !LWR & LALE) then [1, 0] else [1, 1];
state [1, 0]: if RESET then [1, 1] else
              if !LALE then [0, 0] else [1, 0];
state [0, 0]: if RESET then [1, 1] else
              if !rdy~ then [1, 1] else [0, 0];
state [0, 1]: goto [1, 1];


state_diagram [DEN~, st1]
state [1, 1]: if LALE & (!LEPROM~ # !L510CS~ # !L59CS~) then [1, 0] else
              if !INTA~ then [0, 0] else [1, 1];
state [1, 0]: if RESET then [1, 1] else
              if !LALE then [0, 0] else [1, 0];
state [0, 0]: if RESET then [1, 1] else
              if !rdy~ then [1, 1] else [0, 0];
state [0, 1]: goto [1, 1];


state_diagram [DTR, st2]


state [1, 1]: if LALE & (!LEPROM~ # !L510CS~ # !L59CS~) & LWR then [0, 1]
              else [1, 1];
state [0, 1]: if RESET then [1, 1] else
              if !IORDY~ then [0, 0] else [0, 1];
state [0, 0]: goto [1, 1];
state [1, 0]: goto [1, 1];


end IO_CTRL_3;


PAL Codes: IO-3
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10 BUS CONTROLLER - PAL 2, INTEL CORPORATION 
Equations for Module I0 CTRL 3 


Device 103 


- Reduced Equations: 


LINTA~ := (HINTA~ & IRESET & rdy~ & !st0 
# INTA~ & !LALE & IRESET & !st0); 


IstO := (!RESET & rdy~ & !st0 
# INTA~ & !RESET & !stO 
# INTA~ & LALE & !LDC & !LMIO & !LWR & stO); 


IDEN~ : = (WDEN- & !RESET & rdy~ & !stl 
# DEN~ & !LALE & !RESET & !stl 
# DEN~ & !INTA~ & L5IOCS~ & L59CS~ & LEPROM~ & stl 
# DEN~ & !INTA~ & !LALE & stl); 


Ist] := (!RESET & rdy~ & !stl 
# DEN~ & !RESET & !stl 
# DEN~ & !INTA~ & stl 
# DEN~ & !L59CS~ & LALE & stl 
# DEN~ & !L510CS~ & LALE & stl 
# DEN~ & LALE & !LEPROM~ & stl); 


1DTR := (IDTR & !RESET & st2 
# OTR & !L59CS~ & LALE & LWR & st2 
# DTR & !L510CS~ & LALE & LWR & st2 
# OTR & LALE & !LEPROM~ & LWR & st2); 


Ist2 := (1DTR & !IORDY~ & !RESET & st2); 
240725-B9 


PAL Codes: 10-3 (Continued) 
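
The [INTA~, st0] state diagram and its reduced equations can be cross-checked with a small software model. The sketch below is illustrative only and is not part of the PAL source: the function names are hypothetical, signals are modelled as 0/1 integers with the same active-low polarity as the listing, and each call represents one CLK edge.

```python
from itertools import product

def inta_next(state, lmio, ldc, lwr, lale, reset, rdy_n):
    """Next [INTA~, st0] value according to the state diagram."""
    if state == (1, 1):                      # idle
        start = (not lmio) and (not ldc) and (not lwr) and lale
        return (1, 0) if start else (1, 1)
    if state == (1, 0):                      # cycle decoded, wait for LALE to fall
        if reset:
            return (1, 1)
        return (0, 0) if not lale else (1, 0)
    if state == (0, 0):                      # INTA~ asserted until rdy~ goes low
        if reset or not rdy_n:
            return (1, 1)
        return (0, 0)
    return (1, 1)                            # unused encoding [0, 1]

def inta_next_reduced(state, lmio, ldc, lwr, lale, reset, rdy_n):
    """Next [INTA~, st0] value according to the reduced equations."""
    inta_n, st0 = state
    inta_low = (((not inta_n) and (not reset) and rdy_n and (not st0))
                or (inta_n and (not lale) and (not reset) and (not st0)))
    st0_low = (((not reset) and rdy_n and (not st0))
               or (inta_n and (not reset) and (not st0))
               or (inta_n and lale and (not ldc) and (not lmio)
                   and (not lwr) and st0))
    return (0 if inta_low else 1, 0 if st0_low else 1)

def machines_agree():
    """Compare both models for every state and every input combination."""
    states = [(1, 1), (1, 0), (0, 0), (0, 1)]
    return all(inta_next(s, *ins) == inta_next_reduced(s, *ins)
               for s in states
               for ins in product((0, 1), repeat=6))

def trace_inta_cycle():
    """Walk one interrupt-acknowledge cycle, one tuple of inputs per clock."""
    # (lmio, ldc, lwr, lale, reset, rdy_n)
    steps = [(0, 0, 0, 1, 0, 1),   # INTA cycle decoded while LALE is high
             (0, 0, 0, 0, 0, 1),   # LALE falls: INTA~ is asserted
             (0, 0, 0, 0, 0, 1),   # rdy~ still high: hold
             (0, 0, 0, 0, 0, 0)]   # rdy~ low: return to idle
    state = (1, 1)
    visited = [state]
    for ins in steps:
        state = inta_next(state, *ins)
        visited.append(state)
    return visited
```

Calling `machines_agree()` compares the two descriptions exhaustively, and `trace_inta_cycle()` lists the states visited during a single acknowledge cycle.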
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IO BUS CONTROLLER—PAL 2, INTEL CORPORATION 
Chip diagram for Module IO__CTRL_3 


Device IO3 


[Chip diagram: only these pin labels are legible]
L510CS~   pin 7
L59CS~    pin 8
LEPROM~   pin 9


end of module IO__CTRL__3
PAL Codes: IO-3 (Continued)
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Intel | AP-442 


module IO_CTRL_4 flag '-r3'

title 'IO BUS CONTROLLER - PAL 2, INTEL CORPORATION'

IO4 device 'P16R6';

x = .X.;             " ABEL 'don't care' symbol
c = .C.;             " ABEL 'clocking input' symbol

" Inputs

CLK       pin 1;     "Processor Clock
RESET     pin 2;     "System Reset
LMIO      pin 3;     "Latched M/IO#
LDC       pin 4;     "Latched D/C#
LWR       pin 5;     "Latched W/R#
LALE      pin 6;     "Latched ALE
delay     pin 7;     "Delay Signal for Wait State Generation
unused_0  pin 8;
unused_1  pin 9;
OEN~      pin 11;    "PAL Output Enable
unused_3  pin 12;
unused_4  pin 19;

" Outputs

IORDY~    pin 13;    "I/O-EPROM Ready
rdy~      pin 14;    "I/O-EPROM Ready (n-1)
rdy510~   pin 15;    "I/O-EPROM Ready (n-2)
nc_0      pin 16;
nc_1      pin 17;
nc_2      pin 18;

rstate    = [IORDY~, rdy~, rdy510~];
idle      = [1, 1, 1];
rdy2      = [1, 1, 0];
rdy1      = [1, 0, 1];
rdy0      = [0, 1, 1];
illegal_a = [1, 0, 0];
illegal_b = [0, 0, 0];
illegal_c = [0, 0, 1];
illegal_d = [0, 1, 0];

state_diagram rstate

state idle: if (LMIO & !LDC & LWR & LALE) then rdy1 else
            if !delay then rdy2 else idle;
state rdy2: if RESET then idle else rdy1;
state rdy1: if RESET then idle else
            if !LALE then rdy0 else rdy1;
state rdy0: goto idle;

state illegal_a: goto idle;
state illegal_b: goto idle;
state illegal_c: goto idle;
state illegal_d: goto idle;

end IO_CTRL_4;
240725-C1


PAL Codes: IO-4

IO BUS CONTROLLER - PAL 2, INTEL CORPORATION
Equations for Module IO_CTRL_4

Device IO4

- Reduced Equations:

!IORDY~  := (IORDY~ & !LALE & !RESET & rdy510~ & !rdy~);

!rdy~    := (IORDY~ & LALE & !RESET & rdy510~ & !rdy~
           # IORDY~ & !RESET & !rdy510~ & rdy~
           # IORDY~ & LALE & !LDC & LMIO & LWR & rdy510~ & rdy~);

!rdy510~ := (IORDY~ & !LALE & !delay & rdy510~ & rdy~
           # IORDY~ & !LWR & !delay & rdy510~ & rdy~
           # IORDY~ & LDC & !delay & rdy510~ & rdy~
           # IORDY~ & !LMIO & !delay & rdy510~ & rdy~);


PAL Codes: IO-4 (Continued)
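
The ready generator steps through idle, rdy2, rdy1 and rdy0, asserting IORDY~ only in rdy0. A minimal Python sketch of that sequencing follows, assuming one function call per CLK edge; the function and constant names are hypothetical and exist only for this illustration. The second function restates the reduced equations so the two forms can be compared.

```python
from itertools import product

# State encodings [IORDY~, rdy~, rdy510~] as declared in the listing.
IDLE, RDY2, RDY1, RDY0 = (1, 1, 1), (1, 1, 0), (1, 0, 1), (0, 1, 1)

def rstate_next(state, lmio, ldc, lwr, lale, delay, reset):
    """Next rstate value according to the state diagram."""
    if state == IDLE:
        if lmio and (not ldc) and lwr and lale:
            return RDY1
        return RDY2 if not delay else IDLE
    if state == RDY2:
        return IDLE if reset else RDY1
    if state == RDY1:
        if reset:
            return IDLE
        return RDY0 if not lale else RDY1
    return IDLE                     # rdy0 and the four illegal encodings

def rstate_next_reduced(state, lmio, ldc, lwr, lale, delay, reset):
    """Next rstate value according to the reduced equations."""
    iordy, rdy, rdy510 = state
    iordy_low = iordy and (not lale) and (not reset) and rdy510 and (not rdy)
    rdy_low = ((iordy and lale and (not reset) and rdy510 and (not rdy))
               or (iordy and (not reset) and (not rdy510) and rdy)
               or (iordy and lale and (not ldc) and lmio and lwr
                   and rdy510 and rdy))
    rdy510_low = (iordy and (not delay) and rdy510 and rdy
                  and ((not lale) or (not lwr) or ldc or (not lmio)))
    return tuple(0 if low else 1 for low in (iordy_low, rdy_low, rdy510_low))

def rstate_models_agree():
    """Exhaustively compare both models over all 8 encodings and 64 inputs."""
    return all(rstate_next(s, *ins) == rstate_next_reduced(s, *ins)
               for s in product((0, 1), repeat=3)
               for ins in product((0, 1), repeat=6))

def trace_delayed_cycle():
    """A cycle started by delay going low, with LALE already low."""
    # (lmio, ldc, lwr, lale, delay, reset)
    steps = [(0, 0, 0, 0, 0, 0),   # idle -> rdy2
             (0, 0, 0, 0, 1, 0),   # rdy2 -> rdy1
             (0, 0, 0, 0, 1, 0),   # rdy1 -> rdy0 (IORDY~ asserted)
             (0, 0, 0, 0, 1, 0)]   # rdy0 -> idle
    state = IDLE
    visited = [state]
    for ins in steps:
        state = rstate_next(state, *ins)
        visited.append(state)
    return visited
```

`trace_delayed_cycle()` shows IORDY~ going low three clocks after the delay input is sampled low, which is the (n-2), (n-1), n naming used in the output comments.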
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IO BUS CONTROLLER—PAL 2, INTEL CORPORATION 
Chip diagram for Module IO__CTRL__4 


Device IO4


[Chip diagram: only these pin labels are legible]
unused_0  pin 8      pin 13  IORDY~
unused_1  pin 9      pin 12  unused_3


end of module IO__CTRL__4
PAL Codes: IO-4 (Continued)
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module LADDR_DEC flag ‘-r3’ 

title 'LOCAL_DECODE_LOGIC - INTEL CORPORATION'
LADDR_PAL device 'P16L8';

x = .X.;           "ABEL don't care symbol
c = .C.;           "ABEL clocking input symbol
h = 1;             "logic 1
l = 0;             "logic 0

" Inputs
ADS~     pin ;     "ADS#
M_IO~    pin ;     "M/IO#
A31      pin ;     "Addr bit 31
A30      pin 4;    "Addr bit 30
A29      pin ;     "Addr bit 29

" Outputs
X16~     pin ;     "indicates a 16-bit access
LBA~     pin ;     "local bus access
NCA~     pin ;     "non-cache access

equations

!X16~ = !ADS~ & M_IO~ & A31 & A30 & A29;
LBA~  = h;
NCA~  = h;

end LADDR_DEC;
240725-C3

LOCAL_DECODE_LOGIC - INTEL CORPORATION
Equations for Module LADDR_DEC

Device LADDR_PAL

- Reduced Equations:

!X16~ = (A29 & A30 & A31 & !ADS~ & M_IO~);
!LBA~ = (0);
!NCA~ = (0);
240725-C4


PAL Codes: Local Decoder
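
As a worked illustration of the decode equation: X16~ goes active only while ADS# is low during a memory cycle whose three top address bits are all ones, that is, for addresses at or above E0000000H. The Python helper below is a sketch with a hypothetical name, not part of the PAL source.

```python
def x16_asserted(address: int, ads_n: int, m_io: int) -> bool:
    """True when the decoder would drive X16~ low (16-bit access).

    address -- 32-bit physical address
    ads_n   -- level of ADS# (0 = active)
    m_io    -- level of M/IO# (1 = memory cycle)
    """
    a31 = (address >> 31) & 1
    a30 = (address >> 30) & 1
    a29 = (address >> 29) & 1
    return bool(ads_n == 0 and m_io == 1 and a31 and a30 and a29)
```

For example, `x16_asserted(0xFFFF0000, 0, 1)` is true, while the same address during an I/O cycle (`m_io = 0`) is not decoded.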
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LOCAL__DECODE__LOGIC—INTEL CORPORATION | 
Chip diagram for Module LADDR__DEC 


Device LADDR__PAL 


240725-73 


end of module LADDR__DEC
PAL Codes: Local Decoder (Continued)
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module READY flag ’-r3’ 

title 'READY LOGIC - INTEL CORPORATION'
RDY device 'P16L8';

" Inputs
DRAMRDY~  pin 1;     "DRAM READY#
IORDY~    pin 2;     "IO/EPROM READY#
RDYEN~    pin ;      "RDYEN# of 82385
RDY385~   pin 4;     "READYO# of 82385
RDY387~   pin 5;     "READYO# of 82387
CACHE     pin 6;     "High if cache exists; otherwise, Low

" Outputs
READY~    pin 12;    "READY# for 80386
BREADY~   pin 13;    "BREADY# for 82385

equations

!BREADY~ = !DRAMRDY~ # !IORDY~;

!READY~  = (CACHE & !RDY385~) # !RDY387~ #
           (CACHE & !RDYEN~ & (!DRAMRDY~ # !IORDY~) #
           !CACHE & (!DRAMRDY~ # !IORDY~));

end READY;
240725-C5


ABEL(tm) 3.10 - Document Generator 15-Feb-90 07:02 PM
READY LOGIC - INTEL CORPORATION
Equations for Module READY

Device RDY

- Reduced Equations:

!BREADY~ = (!IORDY~ # !DRAMRDY~);

!READY~  = (!CACHE & !IORDY~
          # !CACHE & !DRAMRDY~
          # !IORDY~ & !RDYEN~
          # !DRAMRDY~ & !RDYEN~
          # !RDY387~
          # CACHE & !RDY385~);
240725-C6
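
The reduced READY~ equation drops the CACHE qualifier from the RDYEN~ terms, which is legal because the uncached case is already covered by the !CACHE terms. A truth-table comparison makes that visible. The Python below is an illustrative check with hypothetical function names; every argument is the 0/1 level of the corresponding pin.

```python
from itertools import product

def ready_low_source(dramrdy_n, iordy_n, rdyen_n, rdy385_n, rdy387_n, cache):
    """READY~ active (low) according to the source equation."""
    mem_or_io_ready = (not dramrdy_n) or (not iordy_n)
    return bool((cache and not rdy385_n)
                or (not rdy387_n)
                or (cache and (not rdyen_n) and mem_or_io_ready)
                or ((not cache) and mem_or_io_ready))

def ready_low_reduced(dramrdy_n, iordy_n, rdyen_n, rdy385_n, rdy387_n, cache):
    """READY~ active (low) according to the reduced equation."""
    return bool(((not cache) and (not iordy_n))
                or ((not cache) and (not dramrdy_n))
                or ((not iordy_n) and (not rdyen_n))
                or ((not dramrdy_n) and (not rdyen_n))
                or (not rdy387_n)
                or (cache and not rdy385_n))

def ready_equations_agree():
    """Compare both forms over all 64 input combinations."""
    return all(ready_low_source(*pins) == ready_low_reduced(*pins)
               for pins in product((0, 1), repeat=6))
```

With CACHE low, the model returns READY~ as soon as either DRAMRDY~ or IORDY~ is low, regardless of RDYEN~, matching the comment on the CACHE input.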
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ABEL™ 3,10—Document Generator | 
READY__LOGIC—INTEL CORPORATION 
Chip diagram for Module READY 


Device RDY 


[Chip diagram: only these pin labels are legible]
DRAMRDY~  pin 1
IORDY~    pin 2
RDY385~   pin 4
RDY387~   pin 5


end of module READY 
PAL Codes: Ready (Continued)

[Timing diagram: DRAM Cycle (R/W Hit/Miss), Cache = Low.
Cycles shown: Read Cycle, Pipelined, DRAM Page Miss; Read Cycle, Pipelined, DRAM Page Hit; Write Cycle, Pipelined, DRAM Page Hit.
Signals: ADS#, Addr., W/R#, HIT#, T2X#, T1P#, RAS#, CAS#, DRAMRDY#, NA#, CAL, DEN#, Data, WE#, DT/R#.]
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SNOILVNOSA ONIWIL 
J XIGNAddV 


[Timing diagram 240725-76: DRAM Cycle (Page Miss).
Cycles shown: Read Cycle, Pipelined, DRAM Page Hit; Read Cycle, Pipelined, DRAM Page Miss; Write Cycle, Pipelined, DRAM Page Miss.
Signals: CLK2, CLK, ADS#, Addr., W/R#, HIT#, T2X#, T1P#, RAS#, CAS#, DRAMRDY#, NA#, CAL, DEN#, Data, WE#, DT/R#.]

[Timing diagram 240725-77: DRAM Cycle.
Cycles shown: Read Cycle, Pipelined, DRAM Page Hit; Read Cycle, Pipelined, DRAM Page Hit; Write Cycle, Non-Pipelined, DRAM Page Miss; Read Cycle, Pipelined, DRAM Page Hit.
Signals: CLK2, CLK, ADS#, Addr., W/R#, HIT#, T2X#, T1P#, RAS#, CAS#, DRAMRDY#, NA#, CAL, DEN#, Data, WE#, DT/R#.]

[Timing diagram: DRAM Refresh Cycle.
Cycles shown: Read Cycle, Pipelined, DRAM Page Hit; Read Cycle, Pipelined, DRAM Refresh.
Signals: CLK2, CLK, ADS#, Addr., W/R#, HIT#, T2X#, T1P#, RAS#, CAS#, DRAMRDY#, NA#, CAL, DEN#, Data, WE#, DT/R#, MUXOE#, REF#.]

[Timing diagrams: Cache Cycle and Cache Cycle (Continued), five sheets, one of them numbered 240725-81. Each sheet shows 386 and 385 bus states.
Cycle titles, in order of appearance:
  Read, Cache Miss; Read, Cache Miss
  Read, Cache Miss; Read, Cache Hit; Read, Cache Hit
  Write, Cache Hit; Write, Cache Hit; Write, Cache Miss; Read, Cache Hit
  Read, Cache Hit; Read, Cache Miss; Write, Cache Miss
  Write, Cache Miss
Signals: CLK2, CLK, ADS#, Addr., W/R#, NA#, CALEN, CS#, CWE#, COE#, Data, CT/R#, BADS#, BACP, BAOE#, BAddr., BW/R#, HIT#, RAS#, CAS#, BNA#, BREADY#, RDY387#, DEN#, Data, WE#, DT/R#, LDSTB, DOE#, BT/R#.]

[Timing diagrams: EPROM and I/O Cycles and EPROM and I/O Cycles (Continued), including sheets 240725-89, 240725-90 and 240725-91.
Signals: CLK2, CLK, ADS#, CLK#, Addr., ALE, LAddr., EPROM#, MRDC#, 59CS#, 510CS#, IORC#, IOWC#, INTA#, DEN#, DT/R#, IORDY#, NA#, BS16#, Data.]

APPENDIX D
TIMING EQUATIONS


EQUATIONS FOR DRAM TIMINGS (NO CACHE CONFIGURATION):


Read and Write Cycles (Common Parameters):


tRC: Random Read or Write Cycle Time
CLK2 x 10


tRP: RAS# Precharge Time
CLK2 x 4


tRAS: RAS# Pulse Width
CLK2 x 4


A random DRAM cycle may have a RAS# pulse
which is only four CLK2 periods wide. This is the case
if the cycle is followed by Idle cycles (DRAMs not
selected or Ti's) or a DRAM page miss.


tCAS (Read): CAS# Pulse Width
CLK2 x 3


CAS# pulses can be as narrow as three CLK2
cycles during Page Mode read cycles.


tCAS (Write): CAS# Pulse Width
CLK2 x 2


CAS# pulses can be as narrow as two CLK2
cycles during Page Mode write cycles.


tASC: Column Address Setup Time
min (CLK2 x 2 + AS32.tphl.min - Delay.max -
ACT258.StoZ.tpl.max - ACT258.Cap.Derating,
CLK2 x 3 + AS32.tphl.min - t6.max - 386.Cap.Derating -
AS373.DtoO.tpd.max - ACT258.ItoZ.tpl.max -
ACT258.Cap.Derating)


The Column Address becomes valid as RAS#
switches from High to Low or as the 386 address
becomes valid while RAS# is already Low (i.e., Page
Mode, Pipelined cycles).


tCAH: Column Address Hold Time
CLK2 + AS373.GtoO.tpd.min + ACT258.ItoZ.tpl.min -
AS32.tphl.max


The CAL (Column Address Latch) signal is activated
one CLK2 period after the active-going edge of
CAS#.


tAR: Column Address Hold Time to RAS#
CLK2 x 3 + AS373.GtoO.tpd.min +
ACT258.ItoZ.tpl.min - RAS.Delay.max


tRCD: RAS# to CAS# Delay Time
CLK2 x 2 + AS32.tphl.min - RAS.Delay.max


tRAD: RAS# to Column Address Delay Time
(min) ACT258.StoZ.tphl.min + Delay.min -
RAS.Delay.max
(max) ACT258.StoZ.tphl.max + Delay.max +
ACT258.Cap.Derating - RAS.Delay.min


tRSH: RAS# Hold Time
CLK2 x 2 - AS32.tphl.max + RAS.Delay.min


The worst case occurs when a DRAM Page miss
or Idle is detected at the end of the current DRAM
Page miss cycle.


tCSH: CAS# Hold Time
CLK2 x 6 + AS32.tplh.min - RAS.Delay.max


tCRP: CAS# to RAS# Precharge Time
CLK2 x 2 + RAS.Delay.min - AS32.tplh.max


This is guaranteed by the DRAM control state
machine.


tASR: Row Address Setup Time
CLK2 x 2 - t6.max - 386.Cap.Derating -
ACT258.ItoZ.max - ACT258.Cap.Derating +
H124.tpd.min + H125.tpd.min + PAL.tco.min +
RAS.Delay.min


tRAH: Row Address Hold Time
ACT258.StoZ.tphl.min + Delay.min - RAS.Delay.max


tT: Transition Time (Rise and Fall)


tREF: Refresh Period


tREF2: Refresh Period
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Read Cycles: 


tRAC: Access Time
CLK2 x 6 - H124.tpd.max - H125.tpd.max -
PAL.tco.max - t21.min - F245.max - RAS.Delay.max


tCAC: Access Time from CAS#
CLK2 x 3 - H124.tpd.max - H125.tpd.max -
PAL.tco.max - AS32.tphl.max - t21.min - F245.max


tAA: Access Time from Address
CLK2 x 6 - t6.max - 386.Cap.Derating -
AS373.DtoO.max - ACT258.ItoZ.tp.max -
ACT258.Cap.Derating - t21.min - F245.max


tRCS: Read Command Setup Time
CLK2 + AS32.tphl.min


tRCH: Read Command Hold Time to CAS#
CLK2 - AS32.tplh.max


tRRH: Read Command Hold Time to RAS#
CLK2 - RAS.Delay.max


tOFF: Output Buffer Turn-off Time
CLK2 x 2 + F245.tzh.min


Write Cycles:


tWCS: Write Command Setup Time
CLK2 x 3 + AS32.tphl.min


tWCH: Write Command Hold Time
CLK2 x 2 - AS32.tplh.max


tWCR: Write Command Hold Time to RAS#
CLK2 x 6 - RAS.Delay.max


tWP: Write Command Pulse Width
CLK2 x 5


tRWL: Write Command to RAS# Lead Time
CLK2 x 5 + RAS.Delay.min


tCWL: Write Command to CAS# Lead Time
CLK2 x 5


tDS: Data-in Setup Time
CLK2 x 3 + H124.tp.min + H125.tp.min +
AS32.tphl.min - t12.max - F245.tp.max


tDH: Data-in Hold Time
CLK2 x 2 + F245.tpz.min - AS32.tphl.max


tDHR: Data-in Hold Time to RAS#
CLK2 x 6 + F245.tpz.max + RAS.Delay.min


Page Mode Cycles:


tPC: Page Mode Cycle Time
CLK2 x 4


tRAPC: Page Mode RAS# Pulse Width
CLK2 x 4


tRSW: RAS# to Second WE# Delay Time
CLK2 x 7 - RAS.Delay.max


tCP: CAS# Precharge Time
CLK2


tWI: Write Invalid Time
CLK2


tCAP: Access Time from Column Precharge Time
CLK2 x 4 - H124.tp.max - H125.tp.max -
PAL.tco.max - t21.min - F245.max
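
Several of these equations depend only on the CLK2 period and the 74AS32 delays, so they can be evaluated directly. The Python sketch below does so under stated assumptions: CLK2 = 15.00 ns (the minimum CLK2 period of the 80386-33), 74AS32 delays of 1.00 ns minimum and 5.80 ns maximum as listed in the OR gate specification of this appendix, and a RAS# delay of 0 ns. The RAS# delay is an assumption, chosen because it reproduces the values in the timing requirements table (for example tCSH = 91.00 ns). The dictionary keys and function name are hypothetical.

```python
CLK2 = 15.00          # ns, minimum CLK2 period of the 80386-33
AS32_MIN = 1.00       # ns, 74AS32 tplh/tphl minimum
AS32_MAX = 5.80       # ns, 74AS32 tplh/tphl maximum
RAS_DELAY_MIN = 0.00  # ns, assumed (see lead-in)
RAS_DELAY_MAX = 0.00  # ns, assumed (see lead-in)

def no_cache_timings():
    """Evaluate the equations that need only CLK2, 74AS32 and RAS# delay."""
    t = {
        "tRC":  CLK2 * 10,
        "tRP":  CLK2 * 4,
        "tRAS": CLK2 * 4,
        "tRCD": CLK2 * 2 + AS32_MIN - RAS_DELAY_MAX,
        "tRSH": CLK2 * 2 - AS32_MAX + RAS_DELAY_MIN,
        "tCSH": CLK2 * 6 + AS32_MIN - RAS_DELAY_MAX,
        "tCRP": CLK2 * 2 + RAS_DELAY_MIN - AS32_MAX,
        "tRCS": CLK2 + AS32_MIN,
        "tRCH": CLK2 - AS32_MAX,
        "tRRH": CLK2 - RAS_DELAY_MAX,
        "tWCS": CLK2 * 3 + AS32_MIN,
        "tWCH": CLK2 * 2 - AS32_MAX,
        "tWCR": CLK2 * 6 - RAS_DELAY_MAX,
        "tWP":  CLK2 * 5,
        "tRWL": CLK2 * 5 + RAS_DELAY_MIN,
        "tCWL": CLK2 * 5,
        "tPC":  CLK2 * 4,
        "tRSW": CLK2 * 7 - RAS_DELAY_MAX,
    }
    # Round to the two decimals used in the tables.
    return {name: round(value, 2) for name, value in t.items()}

if __name__ == "__main__":
    for name, value in no_cache_timings().items():
        print(f"{name:5s} {value:7.2f} ns")
```

The remaining equations need the translator, PAL, multiplexer and transceiver delays, which are not all legible in the tables, so they are left out of the sketch.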


80386 A.C. SPECIFICATIONS

80386-33
Symbol  Parameter                                  Minimum   Maximum
        Operating Frequency                           8.00     33.33
t1      CLK2 Period                                  15.00     62.50
t2a     CLK2 High Time                                6.25
t2b     CLK2 High Time                                4.50
t3a     CLK2 Low Time                                 6.25
t3b     CLK2 Low Time                                 4.50
t4      CLK2 Fall Time                                          4.00
t5      CLK2 Rise Time                                          4.00
t6      A2-A31 Valid Delay                            4.00     15.00
t7      A2-A31 Float Delay                            4.00     20.00
t8      BE0#-BE3#, LOCK# Valid Delay                  4.00     15.00
t9      BE0#-BE3#, LOCK# Float Delay                  4.00     20.00
t10     W/R#, M/IO#, D/C#, ADS# Valid Delay           4.00     15.00
t11     W/R#, M/IO#, D/C#, ADS# Float Delay           4.00     25.00
t12     D0-D31 Write Data Valid Delay                 5.00     24.00
t13     D0-D31 Float Delay                            4.00     17.00
t14     HLDA Valid Delay                              4.00     20.00
t15     NA# Setup Time                                5.00
t16     NA# Hold Time                                 3.00
t17     BS16# Setup Time                              5.00
t18     BS16# Hold Time                               3.00
t19     READY# Setup Time                             7.00
t20     READY# Hold Time                              4.00
t21     D0-D31 Read Setup Time                        5.00
t22     D0-D31 Read Hold Time                         3.00
t23     HOLD Setup Time                              11.00
t24     HOLD Hold Time                                3.00
t25     RESET Setup Time                              8.00
t26     RESET Hold Time                               3.00
t27     NMI, INTR Setup Time                          5.00
t28     NMI, INTR Hold Time                           5.00
t29     PEREQ, ERROR#, BUSY# Setup Time               5.00
t30     PEREQ, ERROR#, BUSY# Hold Time                4.00


PAL SPECIFICATIONS

Symbol  Parameter                                  Minimum   Maximum
ts      Input or Feedback Setup Time                  7.00
tco     Clock to Output                               3.00      6.50


ROW ADDRESS LATCH SPECIFICATIONS
74FCT843B (IDT)

50 pF
Symbol  Parameter                                  Minimum   Maximum
tplh    Dn to On Propagation Delay                    3.00      6.50
tphl                                                  3.00      6.50
tplh    G to On Propagation Delay                     6.00      8.00
tphl                                                  4.00      8.00
ts      Setup Time                                    2.00
th      Hold Time                                     3.00
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Timings for No Cache Configuration 
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ROW ADDRESS COMPARATOR SPECIFICATIONS 
74FCT521B (Performance)


Minimum Maximum 


An or Bn to Q Propagation Delay 


I to Q Propagation Delay 


DRAM ADDRESS MULTIPLEXER SPECIFICATIONS 
74ACT258


Parameter 
S to Zn Propagation Delay 
E# to Zn Propagation Delay 


In to Zn Propagation Delay 




DATA TRANSCEIVER SPECIFICATIONS 
74F245


Minimum Maximum
An to Bn or Bn to An Propagation Delay 
Output Enable Time 


Output Disable Time 


COLUMN ADDRESS LATCH SPECIFICATIONS 
74AS573 


Parameter 
Dn to On Propagation Delay 
G to On Propagation Delay 


Setup Time 
Hold Time 




RAS# DELAY 


Minimum Maximum 
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Timings for No Cache Configuration (Continued) 


OR SPECIFICATIONS
74AS32

Symbol  Parameter                                  Minimum   Maximum
tplh    Propagation Delay                             1.00      5.80
tphl                                                  1.00      5.80


DRAM TIMING REQUIREMENTS

                                                   For 80386-33       Timing Margin (NMB 2801-06)
Symbol  Parameter                                 Minimum   Maximum   Minimum   Maximum
Read and Write Cycles (Common Parameters):
tRC     Random Read or Write Cycle Time            150.00               29.00
tRP     RAS# Precharge Time                         60.00                5.00
tRAS    RAS# Pulse Width                            60.00                0.00
tCAS    CAS# Pulse Width (Read)                     45.00               34.00
tCAS    CAS# Pulse Width (Write)                    30.00               25.00
tASC    Column Address Setup Time                    9.70                9.70
tCAH    Column Address Hold Time                    14.20                8.20
tAR     Column Address Hold Time to RAS#            50.00               10.00
tRCD    RAS# to CAS# Delay Time                     31.00               25.00     14.00
tRAD    RAS# to Column Address Delay Time            5.00     21.30      1.00      6.70
tRSH    RAS# Hold Time                              24.20                9.20
tCSH    CAS# Hold Time                              91.00               51.00
tCRP    CAS# to RAS# Precharge Time                 24.20               21.20
tASR    Row Address Setup Time                       5.45                3.45
tRAH    Row Address Hold Time                        5.00                3.00
tT      Transition Time (Rise and Fall)
tREF    Refresh Period
tREF2   Refresh Period
Read Cycles:
tRAC    Access Time                                           68.25                8.25
tCAC    Access Time from CAS#                                 17.45                6.45
tAA     Access Time from Address                              41.20                9.20
tRCS    Read Command Setup Time                     16.00               16.00
tRCH    Read Command Hold Time to CAS#               9.20                9.20
tRRH    Read Command Hold Time to RAS#              15.00               15.00
tOFF    Output Buffer Turn-off Time                           33.00               16.00
Write Cycles:
tWCS    Write Command Setup Time                    46.00               46.00
tWCH    Write Command Hold Time                     24.20               19.20
tWCR    Write Command Hold Time to RAS#             90.00               50.00
tWP     Write Command Pulse Width                   75.00               70.00
tRWL    Write Command to RAS# Lead Time             75.00               62.00
tCWL    Write Command to CAS# Lead Time             75.00               70.00
tDS     Data-in Setup Time                          17.75               17.75
tDH     Data-in Hold Time                           26.20               21.20
tDHR    Data-in Hold Time to RAS#                   97.50               57.50
Page Mode Cycles:
tPC     Page Mode Cycle Time                        60.00               23.00
tRAPC   Page Mode RAS# Pulse Width                  60.00
tRSW    RAS# to Second WE# Delay Time              105.00
tCP     CAS# Precharge Time                         15.00               10.00
tWI     Write Invalid Time                          15.00
tCAP    Access Time from Column Precharge Time                38.25                4.25
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Timings for No Cache Configuration (Continued) 
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ADDRESS DECODER REQUIREMENTS 


For 80386-33 
Minimum Maximum 


Available Propagation Delay 
ROW ADDRESS COMPARATOR REQUIREMENTS 


For 80386-33 
Minimum Maximum 


Available Propagation Delay 


NA# SETUP TIME 
Minimum Maximum 


Available NA# Setup Time 


QUAD TTL TO 10KH-ECL TRANSLATOR
MC10H124 


Propagation Delay 




QUAD 10KH-ECL to TTL TRANSLATOR 
MC10H125 


Symbol Parameter 


tpd Propagation Delay 




DELAY ELEMENT 


Propagation Delay 4.00 


Timings for No Cache Configuration (Continued)

DRAM SPECIFICATIONS

         NMB 2801-06          VITELIC V53C256 (70 ns)
Symbol   Minimum   Maximum    Minimum   Maximum
tRC      121.00  130.00
tRP      55.00  50.00
tRAS     60.00  100000  70.00  75000.00
tCAS     11.00  15, 20  75000.00
tCAS     5.00
tASC     0.00  0.00
tCAH     6.00  15.00
tAR      40.00  55.00
tRCD     6.00  45.00  25.00  55.00
tRAD     4.00  28.00  20.00  35.00
tRSH     15.00  15, 25
tCSH     40.00  70.00
tCRP     3.00  15.00
tASR     2.00  0.00
tRAH     2.00  15.00
tT       3.00  25.00
tREF
tREF2
tRAC     60.00  70.00
tCAC     11.00  15.00
tAA      32.00  35.00
tRCS     0.00  0.00
tRCH     0.00  5.00
tRRH     0.00  5.00
tOFF     17.00  0.00  15.00
tWCS     0.00  0.00
tWCH     5.00  15.00
tWCR     40.00  55.00
tWP      5.00  15.00
tRWL     13.00  20.00
tCWL     5.00  20.00
tDS      0.00  0.00
tDH      5.00  15.00
tDHR     40.00  55.00
tPC      37.00  50.00
tRAPC
tRSW
tCP      5.00  15.00
tWI
tCAP     34.00  45.00
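
The timing margins quoted for the NMB 2801-06 follow from subtracting the DRAM's requirement from what the 80386-33 design provides (or, for access times, subtracting the DRAM's worst-case access time from the time the design allows). The Python sketch below reproduces a few of those margins from values printed in the tables; the dictionary and function names are hypothetical and only a subset of parameters is shown.

```python
# Values in ns, copied from the requirement and specification tables.
PROVIDED_MIN = {"tRC": 150.00, "tRP": 60.00, "tRAS": 60.00,
                "tCAH": 14.20, "tCSH": 91.00, "tCRP": 24.20}
NMB_REQUIRED_MIN = {"tRC": 121.00, "tRP": 55.00, "tRAS": 60.00,
                    "tCAH": 6.00, "tCSH": 40.00, "tCRP": 3.00}

ALLOWED_MAX = {"tRAC": 68.25, "tCAC": 17.45, "tAA": 41.20, "tCAP": 38.25}
NMB_ACCESS_MAX = {"tRAC": 60.00, "tCAC": 11.00, "tAA": 32.00, "tCAP": 34.00}

def minimum_margins():
    """Margin for parameters the DRAM specifies as a minimum."""
    return {name: round(PROVIDED_MIN[name] - NMB_REQUIRED_MIN[name], 2)
            for name in PROVIDED_MIN}

def access_margins():
    """Margin for access times the DRAM specifies as a maximum."""
    return {name: round(ALLOWED_MAX[name] - NMB_ACCESS_MAX[name], 2)
            for name in ALLOWED_MAX}

def tightest(margins):
    """Name of the parameter with the smallest margin."""
    return min(margins, key=margins.get)
```

Among the parameters in this subset, tRAS has no slack at all (0.00 ns), which is consistent with the note that a random cycle may have a RAS# pulse only four CLK2 periods wide.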
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CAPACITIVE LOAD TIMING DERATING FOR 74ACT258 

Load Capacitance (pF) Additional Propagation Delay (ns) 
0.02625q - 1.3125) 
0.022q - 1.3125) 


0.01666q + 0.1666) 




DRAM ADDRESS BUS TIMING DERATING 


Capacitive Load (pF) Additional Propagation Delay (ns) 
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DRAM Address Inputs 
F258 Output 
Microstrip/Strip Lines 


TOTAL 
Timings for No Cache Configuration (Continued)

EQUATIONS FOR DRAM TIMINGS (82385 Active):


Read and Write Cycles (Common Parameters):


tRC: Random Read or Write Cycle Time
CLK2 x 10


tRP: RAS# Precharge Time
CLK2 x 4


tRAS: RAS# Pulse Width
CLK2 x 4


A random DRAM cycle may have a RAS# pulse
which is only four CLK2 periods wide. This is the case
if the cycle is followed by Idle cycles (DRAMs not
selected or Ti's) or a DRAM page miss.


tCAS (Read): CAS# Pulse Width
CLK2 x 5


CAS# pulses can be as narrow as five CLK2
cycles during Page Mode read cycles.


tCAS (Write): CAS# Pulse Width
CLK2 x 2


CAS# pulses can be as narrow as two CLK2
cycles during Page Mode write cycles.


tASC: Column Address Setup Time
min (CLK2 x 2 + AS32.tphl.min - Delay.max -
ACT258.StoZ.tpl.max - ACT258.Cap.Derating,
CLK2 x 3 + AS32.tphl.min - t6.max - 386.Cap.Derating -
AS373.DtoO.tpd.max - ACT258.ItoZ.tpl.max -
ACT258.Cap.Derating)


The Column Address becomes valid as RAS#
switches from High to Low or as the 386 address
becomes valid while RAS# is already Low (i.e., Page
Mode, Pipelined cycles).


tCAH: Column Address Hold Time
CLK2 + AS373.GtoO.tpd.min + ACT258.ItoZ.tpl.min -
AS32.tphl.max


The CAL (Column Address Latch) signal is activated
one CLK2 period after the active-going edge of
CAS#.


tAR: Column Address Hold Time to RAS#
CLK2 x 3 + AS373.GtoO.tpd.min +
ACT258.ItoZ.tpl.min - RAS.Delay.max


tRCD: RAS# to CAS# Delay Time
CLK2 x 2 + AS32.tphl.min - RAS.Delay.max


tRAD: RAS# to Column Address Delay Time
(min) ACT258.StoZ.tphl.min + Delay.min -
RAS.Delay.max
(max) ACT258.StoZ.tphl.max + Delay.max +
ACT258.Cap.Derating - RAS.Delay.min


tRSH: RAS# Hold Time
CLK2 x 2 - AS32.tphl.max + RAS.Delay.min


The worst case occurs when a DRAM Page miss
or Idle is detected at the end of the current DRAM
Page miss cycle.


tCSH: CAS# Hold Time
CLK2 x 6 + AS32.tplh.min - RAS.Delay.max


tCRP: CAS# to RAS# Precharge Time
CLK2 x 2 + RAS.Delay.min - AS32.tplh.max


This is guaranteed by the DRAM control state
machine.


tASR: Row Address Setup Time
CLK2 x 2 - t6.max - 386.Cap.Derating -
ACT258.ItoZ.max - ACT258.Cap.Derating +
H124.tpd.min + H125.tpd.min + PAL.tco.min +
RAS.Delay.min


tRAH: Row Address Hold Time
ACT258.StoZ.tphl.min + Delay.min - RAS.Delay.max


tT: Transition Time (Rise and Fall)


tREF: Refresh Period


tREF2: Refresh Period


Read Cycles:


tRAC: Access Time
CLK2 x 8 - H124.tpd.max - H125.tpd.max -
PAL.tco.max - F245.max - AS646.tpd.max -
F245.max - RAS.Delay.max - SRAM.tDW - CLK2 +
385.t22a.min


tCAC: Access Time from CAS#
CLK2 x 5 - H124.tpd.max - H125.tpd.max -
PAL.tco.max - AS32.tphl.max - F245.max -
385.t22a.min


tAA: Access Time from Address
CLK2 x 8 - t6.max - 386.Cap.Derating -
AS373.DtoO.max - ACT258.ItoZ.tp.max -
ACT258.Cap.Derating - F245.max - AS646.tpd.max -


tRCS: Read Command Setup Time
CLK2 + AS32.tphl.min


tRCH: Read Command Hold Time to CAS#
CLK2 - AS32.tplh.max


tRRH: Read Command Hold Time to RAS#
CLK2 - RAS.Delay.max


tOFF: Output Buffer Turn-off Time
CLK2 x 2 + F245.tzh.min


Write Cycles:


tWCS: Write Command Setup Time


tWCH: Write Command Hold Time
CLK2 x 2 - AS32.tplh.max


tWCR: Write Command Hold Time to RAS#
CLK2 x 6 - RAS.Delay.max


tWP: Write Command Pulse Width
CLK2 x 5


tRWL: Write Command to RAS# Lead Time
CLK2 x 5 + RAS.Delay.min


tCWL: Write Command to CAS# Lead Time
CLK2 x 5


tDS: Data-in Setup Time
AS32.tphl.min - 385.t43c.max -
AS646.GtoO.tp.max - F245.tp.max


tDH: Data-in Hold Time
CLK2 x 2 + F245.tpz.min - AS32.tphl.max


Page Mode Cycles:


tPC: Page Mode Cycle Time
CLK2 x 6


tRAPC: Page Mode RAS# Pulse Width
CLK2 x 4


tRSW: RAS# to Second WE# Delay Time
CLK2 x 7 - RAS.Delay.max


tCP: CAS# Precharge Time
CLK2


tWI: Write Invalid Time
CLK2


tCAP: Access Time from Column Precharge Time
CLK2 x 6 - H124.tp.max - H125.tp.max -
PAL.tco.max - F245.max - AS646.tpd.max -
F245.max - SRAM.tDW - CLK2 + 385.t22a.min
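
With the 82385 active, several equations gain two CLK2 periods compared with the no-cache configuration. The short Python sketch below tabulates only the CLK2 multipliers that appear in the two sets of equations and converts them to nanoseconds, assuming CLK2 = 15.00 ns (80386-33). The remaining propagation-delay terms are deliberately left out, so these are the clock contributions, not the complete timings; the names are hypothetical.

```python
CLK2_NS = 15.00   # assumed CLK2 period for an 80386-33

# Leading CLK2 multiplier of each equation: (no cache, 82385 active).
CLK2_MULTIPLIER = {
    "tCAS (Read)": (3, 5),
    "tRAC":        (6, 8),
    "tCAC":        (3, 5),
    "tAA":         (6, 8),
    "tPC":         (4, 6),
    "tCAP":        (4, 6),
}

def clock_contribution(parameter):
    """Return (no-cache ns, cache-active ns) for the CLK2 term alone."""
    no_cache, cache = CLK2_MULTIPLIER[parameter]
    return (no_cache * CLK2_NS, cache * CLK2_NS)

def added_clocks(parameter):
    """Extra CLK2 periods in the leading term when the 82385 is active."""
    no_cache, cache = CLK2_MULTIPLIER[parameter]
    return cache - no_cache

if __name__ == "__main__":
    for name in CLK2_MULTIPLIER:
        before, after = clock_contribution(name)
        print(f"{name:12s} {before:6.2f} ns -> {after:6.2f} ns")
```

For the access-time equations the cache-active versions also subtract one CLK2 and add further 82385, SRAM and transceiver terms, so the net change is smaller than the two periods shown by the leading multiplier.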
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DRAM TIMING REQUIREMENTS 


For 80386-33 Timing Margin (NMB 2801-06) 
Parameter Minimum Maximum Minimum Maximum 


Read and Write Cycles (Common Parameters):
tRC     Random Read or Write Cycle Time
tRP     RAS# Precharge Time
tRAS    RAS# Pulse Width
tCAS    CAS# Pulse Width (Read)
tCAS    CAS# Pulse Width (Write)
tASC    Column Address Setup Time
tCAH    Column Address Hold Time
tAR     Column Address Hold Time to RAS#
tRCD    RAS# to CAS# Delay Time
tRAD    RAS# to Column Address Delay Time
tRSH    RAS# Hold Time
tCSH    CAS# Hold Time
tCRP    CAS# to RAS# Precharge Time
tASR    Row Address Setup Time
tRAH    Row Address Hold Time
tT      Transition Time (Rise and Fall)
tREF    Refresh Period
tREF2   Refresh Period

Read Cycles:
tRAC    Access Time
tCAC    Access Time from CAS#
tAA     Access Time from Address
tRCS    Read Command Setup Time
tRCH    Read Command Hold Time to CAS#
tRRH    Read Command Hold Time to RAS#
tOFF    Output Buffer Turn-off Time

Write Cycles:
tWCS    Write Command Setup Time
tWCH    Write Command Hold Time
tWCR    Write Command Hold Time to RAS#
tWP     Write Command Pulse Width
tRWL    Write Command to RAS# Lead Time
tCWL    Write Command to CAS# Lead Time
tDS     Data-in Setup Time
tDH     Data-in Hold Time
tDHR    Data-in Hold Time to RAS#

Page Mode Cycles:
tPC     Page Mode Cycle Time
tRAPC   Page Mode RAS# Pulse Width
tRSW    RAS# to Second WE# Delay Time
tCP     CAS# Precharge Time
tWI     Write Invalid Time
tCAP    Access Time from Column Precharge Time
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Timings with Cache Active 
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386™ SL MICROPROCESSOR SuperSet 
Highly-Integrated Static 386™ Microprocessor 
Complete ISA Peripheral Subsystem 
System-Wide Power Management 


■ Static 386™ CPU Core
— Runs MS-DOS*, WINDOWS*, OS/2* and UNIX*
— Object Code Compatible with Intel 8086, 80286 and 386™ Microprocessors
■ Architecture Extension for Power Management Transparent to Operating Systems and Applications
■ ISA System, with Extended
— Full ISA Bus Control, Status and Address and Data Interface Logic, with Full 24 mA Drive
— Compatible ISA Bus Peripherals
— System I/O Decoding, Programmable Chip Selects and Support Interfaces
— High-Speed Peripheral Interface Bus (PI-Bus Support)
— New ideaPort Interface for Hardware Expansion
■ Integrated Cache Controller and Tag RAM
— No-Glue Cache SRAM Interface
— 16k, 32k, or 64 kByte Cache Size
— Direct, 2-Way or 4-Way Set Associative Organization
■ Programmable Memory Control
— No-Glue, Page-Mode DRAM Interface
— SRAM Support for Lowest Power
— 512k to 32 MBytes
— Full Hardware LIM EMS 4.0


The 386™ SL Microprocessor SuperSet combines an ISA bus compatible personal computer's microprocessor, memory controller, cache controller and peripheral subsystems into just two Very Large Scale Integration (VLSI) devices. The product's high-integration and power conservation features reduce the size and power consumption typically associated with fully Industry Standard Architecture (ISA) bus compatible systems. In addition, new expandability and flexibility features offer the capability for continued innovation in battery-operated, space-constrained systems. The SL SuperSet brings 100% ISA-Bus compatibility to system designs ranging from the smallest palm-top and notebook PCs to expandable lap-top systems.

386 is a registered trademark of Intel Corporation.
*MS-DOS and WINDOWS are trademarks of Microsoft Corporation.
UNIX is a trademark of AT&T.
OS/2 is a trademark of International Business Machines Corporation.
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240814-1 
Figure 1-1. Die Photograph of the 386™ SL Microprocessor (left) 
and 82360SL ISA Peripheral I/O (right) 
386™ SL MICROPROCESSOR
386™ Microprocessor Core, with Integrated Bus, Memory, and Cache Controllers; and
System Power Management
Fully-Static CHMOS IV Technology

■ Static 386™ CPU Core
— Optimized and Compatible with Standard Operating System Software such as:
  MS-DOS*, WINDOWS*, OS/2* and UNIX*
— Object Code Compatible with Intel 8086, 80286 and 386™ Microprocessors
— Runs All Desk-Top Applications, 16- or 32-Bit
— D.C. to 20 MHz Operation
— 32 Megabytes Physical Memory/64 Terabytes Virtual Memory
— 4 Gigabyte Maximum Segment Size
— High Integration, Low Power Intel CHMOS IV Process Technology

■ Transparent Power-Management System Architecture
— System Management Mode Architecture Extension for Truly Compatible Systems
— Power Management Transparent to Operating Systems and Application Programs
— Programmable Hardware Supports Custom Power-Control Methods

■ Direct Drive Bus Interfaces
— Full ISA Bus Interface, with 24 mA Drive
— High Speed Peripheral Interface Bus

■ Integrated Cache Controller and Tag RAM
— No-Glue Cache SRAM Interface
— 16k, 32k, or 64 kByte Cache Size
— Direct, 2-Way or 4-Way Set Associative Organization
— Write Posting—Double Posted Writes in the Bus Controller
— 16-Bit Line Size—Reduces Bus Utilization for Cache Line Fills
— Write-Thru, with SmartHit Algorithm for Reduced Main Memory Power Consumption

■ Programmable Memory Control
— No-Glue, Page-Mode DRAM Interface
— SRAM Support for Lowest Power
— 1, 2, or 4 Banks Interleaved, with Programmable Wait States
— 512k to 32 MBytes
— Advanced, Flexible Address-Map Configuration
— Full Hardware LIM EMS 4.0 Address Translation to 32 Megabytes w/o Waitstate Penalty
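
The cache options listed above (16k, 32k or 64 kByte; direct, 2-way or 4-way set associative; 16-bit line size) fix how many sets the cache has and how an address splits into offset, index and tag bits. The following Python sketch works that arithmetic for a textbook set-associative cache over the 32 MByte (25-bit) physical address space. It is an illustration only: the function name `cache_geometry` is hypothetical, and the actual tag RAM organization of the 386 SL is not described in this section and may differ.

```python
PHYS_ADDR_BITS = 25   # 32 MBytes of physical memory = 2**25 bytes
LINE_BYTES = 2        # 16-bit line size, taken from the feature list


def cache_geometry(size_kbytes, ways):
    """Return (sets, offset_bits, index_bits, tag_bits) for a textbook cache.

    Assumption: a plain set-associative model; the real tag RAM layout of
    the 386 SL is not specified here.
    """
    if size_kbytes not in (16, 32, 64):
        raise ValueError("cache size must be 16, 32 or 64 kBytes")
    if ways not in (1, 2, 4):
        raise ValueError("organization must be direct (1), 2-way or 4-way")
    sets = (size_kbytes * 1024) // (ways * LINE_BYTES)
    # All quantities are powers of two, so bit_length gives an exact log2.
    offset_bits = (LINE_BYTES - 1).bit_length()
    index_bits = (sets - 1).bit_length()
    tag_bits = PHYS_ADDR_BITS - index_bits - offset_bits
    return sets, offset_bits, index_bits, tag_bits


if __name__ == "__main__":
    for size in (16, 32, 64):
        for ways in (1, 2, 4):
            print(size, ways, cache_geometry(size, ways))
```

In this model, doubling the associativity at a fixed cache size halves the number of sets and adds one bit to each tag.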
82360SL I/O Subsystem
Complete ISA Peripheral Subsystem
Integrated System Power Management
Fully-Static CHMOS IV Technology

■ Complete ISA System, with Extended Support
— Full ISA Bus Control, Status and Address and Data Interface Logic,
  with Full 24 mA Drive
— Compatible ISA Bus Peripherals:
    Two 8237 Direct Memory Access Controllers
    Two 8254 Programmable Timer Counters (6 Timer/Counter Channels)
    Two 8259A Programmable Interrupt Controllers (15 Channels)
    Enhanced LS612 Page Memory Mapper
    One 146818 Real Time Clock w/256-byte CMOS RAM
    One 16450 Dual Serial Port Controller
    One 8-Bit Parallel I/O Port (Centronics or Bi-Directional)
— Additional System I/O Decoding, Programmable Chip Selects and Support Interfaces:
    Full Integrated Drive Electronics (I.D.E.) Hard Disk Interface
    Floppy Disk Controller
    Keyboard Controller Chip Selects and Support Logic
— External Real Time Clock Support
— PS/2 and EISA Control/Status Ports
— Local Memory and ISA-Bus Memory Refresh Control
— New ideaPort Interface for Hardware Expansion

■ Transparent Power-Management System Architecture
— Architecture Extension for Truly Compatible Systems
— Transparent to Operating Systems and Applications Programs
— Programmable Hardware Supports Custom Power-Control Methods
— Integrated Power Management Unit Manages Power-Events Safely
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386™ SL Microprocessor SuperSet 
-386™ SL CPU and 82360SL I/O 
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1.0 INTRODUCTION 


This document provides the pinouts, signal descrip- 
tions, and D.C./A.C. electrical characteristics of the 
386™ SL CPU and 82360SL ISA I/O Peripheral de- 
vice. Consult Intel for the most recent design-in in- 
formation. For a thorough description of any func- 
tional topic, other than the parametric specifications, 
please consult the latest 386 SL Microprocessor 
SuperSet System Design Guide (Order No. 240816), 
and the 386 SL Microprocessor SuperSet Program- 
mer’s Guide (Order No. 240815). 


Overview 


The 386™ SL Microprocessor SuperSet is an extremely flexible pair of
components marking a new milestone in microcomputer technology. Included
in the pair are a 386 Architecture Central Processing Unit (CPU), several
memory subsystem controllers, address translation and remapping logic, an
optional cache memory controller, and an extensive collection of ISA bus
compatible peripheral functions.


386™ SL MICROPROCESSOR SuperSet 


ADVANCE INFORMATION 


The SL SuperSet allows the personal computer de- 
signer to take advantage of the highest level of sys- 
tem integration, while preserving complete freedom 
in selecting system features, power/performance 
trade-offs, and value-added enhancements. 


Essentially, all of the components needed to build an ISA bus compatible
personal computer have been combined within just two components: the
386 SL microprocessor and memory control system, and the 82360SL ISA
peripheral I/O and power management subsystem. The only other components
needed for a complete personal computer are the main DRAM or optional
static memory subsystem, optional cache SRAM and a graphics controller.
A minimal amount of commodity Small Scale Integration (SSI) logic or
Medium Scale Integration (MSI) logic buffers may be required for
design-specific interface to peripheral devices on the ISA bus.


Systems based on the SL SuperSet typically include the functional blocks
shown in Figure 1-2.
[Block diagram. 386™ SL Microprocessor: Intel 386™ Central Processing Unit, Main Memory
Control and CPU Power Management Logic, with an optional Math Coprocessor and optional
High-Speed Cache. 82360SL ISA Peripheral I/O: Power Management Logic, 2 x 8237 DMA
Controller, Power Supply Control, 146818, and Parallel I/O Ports (Parallel Printer).]

Figure 1-2. 386™ SL Microprocessor-Based System Functional Block Diagram
[Block diagram. Cache Banks 1 through 4, each with TAG RAM, TAG Comparator, Hit/Miss Logic
and Cacheability Map; LRU and Flush Logic; Numerics Interface Logic; Clocks, Reset and
CPU/NPX State Machine Controller; Random Logic; Programming Interface; State Machine;
Address Bus Controller/Decode; Address Mapper; Data Bus Control; Write Poster; PI Bus I/F
Control; Static 386™ Microprocessor Core; DRAM Controller.]

Figure 1-3a. 386™ SL Microprocessor Internal Functional Modules
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386™ SL Microprocessor: Central 
Processing Unit (CPU) and Memory 
Controller Subsystem 


The 386 SL microprocessor is a highly-integrated, 
complete microprocessor and memory controller 
subsystem. At the heart of the 386 SL microproces- 
sor is a CHMOS static 386 CPU core. The 386 CPU 
core has been fully optimized to reduce run-time 
power requirements, and includes a key architectur- 
al extension required by battery-operated systems. 


The 386 SL processor is the first member of the 386 
microprocessor product line to implement a CPU 
with the System Management Mode extension. The 
System Management Mode is a new CPU operating- 
mode which allows system vendors to rid their sys- 
tems of the backwards-compatibility problems that 
plague battery-operated PCs. This 386 architecture 
extension eliminates portable-system conflicts by 
providing a safe, new operating level for the battery 
management firmware developed by system design- 
ers. With the 386™ SL CPU, firmware will execute 
transparently to every application, operating system 
and CPU mode, thus avoiding the compatibility con- 
flicts which were once unavoidable. 


The 386 SL microprocessor retains the paged-memory-management system, and
all other key features which are common to the Intel386™ architecture. In
addition, on-chip hardware implements the Expanded Memory Specification
(E.M.S.) address translation compatible with the current
Lotus/Intel/Microsoft (L.I.M.) E.M.S. 4.0 standard. Additional
address-mapping and control logic integrated in the 386 SL CPU allows BIOS
ROMs to be “shadowed” by faster memory devices, and supports a variety of
common memory roll-over and back-fill schemes. The 386 SL CPU contains all
of the control and interface logic needed to directly drive large main
memory and an optional cache memory subsystem.
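

The expanded-memory address translation mentioned above can be pictured as a small window-to-page lookup. The Python sketch below models the general LIM EMS idea of 16 KByte logical pages mapped into windows of a page frame. It is a hedged illustration, not the 386 SL register interface (that is covered in the Programmer's Guide): the class name `EmsWindowMap`, the frame base `0xD0000`, the four-window frame and the expanded-memory base `0x100000` are all assumptions chosen for the example.

```python
EMS_PAGE_SIZE = 16 * 1024  # LIM EMS logical pages are 16 KBytes


class EmsWindowMap:
    """Toy model of EMS page-frame translation (all defaults hypothetical)."""

    def __init__(self, frame_base=0xD0000, windows=4, expanded_base=0x100000):
        self.frame_base = frame_base        # assumed start of the page frame
        self.windows = windows              # assumed number of 16 KByte windows
        self.expanded_base = expanded_base  # assumed start of expanded memory
        self.mapping = {}                   # window index -> logical page

    def map_page(self, window, logical_page):
        if not 0 <= window < self.windows:
            raise ValueError("window out of range")
        self.mapping[window] = logical_page

    def translate(self, cpu_address):
        """Return the memory address actually accessed for cpu_address."""
        offset = cpu_address - self.frame_base
        if not 0 <= offset < self.windows * EMS_PAGE_SIZE:
            return cpu_address              # outside the frame: untranslated
        window, within_page = divmod(offset, EMS_PAGE_SIZE)
        if window not in self.mapping:
            return cpu_address              # window not mapped: untranslated
        page = self.mapping[window]
        return self.expanded_base + page * EMS_PAGE_SIZE + within_page


if __name__ == "__main__":
    ems = EmsWindowMap()
    ems.map_page(1, 5)
    print(hex(ems.translate(0xD4010)))
```

Performing this lookup in hardware is what lets the translation happen on every access without software intervention.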


The 386 SL CPU contains bus drivers and control circuitry for two
expansion interfaces. A Peripheral Interface Bus (PI-Bus) provides
high-speed communication with devices which may reside on the same
printed circuit board as the processor. The Industry Standard
Architecture (ISA) bus provides a common interface for the wealth of
third party ISA bus compatible I/O peripheral and expansion memory add-in
boards. On-chip data-byte steering logic, address decoding and mapping
logic automatically routes each memory or I/O operation to the
appropriate local memory, cache, PI-Bus or ISA expansion bus.


All system configuration logic in the 386 SL processor subsystem is
initialized under software control. The system designer only has to
program the processor in order to support multiple system hardware
designs where many devices of less flexibility were once required. System
characteristics such as memory type, size, speed, organization, and
mapping; cache size, organization and mapping; and peripheral selection,
configuration and mapping are configured under software control.
Thereafter, all memory and I/O transfer requests are automatically sent to
the appropriate memory space or expansion bus, fully transparent to
existing operating system software and application programs.


Figure 1-3a shows the functional blocks and Figure 1-3b shows the
microarchitecture of the 386 SL processor.


82360SL I/O: Integrated ISA Peripheral 
and Power Management Device 


The 82360SL ISA Peripheral I/O contains dedicated 
logic to perform a number of CPU, memory, and pe- 
ripheral support functions. The 82360SL device also 
contains an extensive set of programmable power 
management facilities which allow minimized system 
energy requirements for battery-powered portable 
computers. 


The 82360SL includes a complete set of on-chip peripheral device
functions including two 16450 compatible serial ports, one 8-bit
Centronics interface or bi-directional parallel port, two 8254 compatible
timer counters, two 8259 compatible interrupt controllers, two 8237
compatible DMA controllers, one 74LS612 compatible DMA page register, one
146818 compatible real-time clock/calendar with 256 bytes of battery
backed CMOS RAM and an integrated drive electronics (IDE) hard-disk-drive
interface. The Intel 82360SL also contains highly programmable chip
selects and complete peripheral interface logic for direct keyboard, FLASH
memory and floppy disk controller support. The peripheral registers and
functions behave exactly as the discrete components commonly found in
industry-standard personal computers. The peripheral logic is enhanced for
static operation by supporting write-only registers as read/write.


The processor and memory support functions con- 
tained in the 82360SL device eliminate most of the 
external random-logic “glue” that might otherwise
be required. The 82360SL device provides internal 
programmable-frequency clock generators for the 
CPU, backplane, and video subsystems. A program- 
mable, low-power DRAM refresh timer is also provid- 
ed to maintain system memory integrity during the 
power saving system stand-by and suspend states. 
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The 82360SL also contains a flexible set of hard- 
ware functions to support the growing sophistication 
in power management schemes required by portable 
systems. Numerous hardware timers, event monitors
and I/O interfaces can programmably monitor
and control system activity. Firmware developed by
the system designer allocates and directs the hard- 
ware to fulfill the unique power management needs 
of a given system configuration. 




All of the standard peripheral registers, clock-generation logic, and
power-management facilities have been designed to ensure complete
compatibility with existing operating systems and applications software.


Figure 1-4 shows the functional blocks and microarchitecture
of the 82360SL I/O subsystem.
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Figure 1-4a. 82360SL ISA Peripheral I/O Internal Functional Modules 
2.0 PIN ASSIGNMENTS AND SIGNAL CHARACTERISTICS

Section 2 provides information for the SL SuperSet pin assignment with
respect to the signal mnemonics. In addition to the package pin out
diagrams, two tables are provided for easy location of signals. The first
table lists the 386 SL CPU package device pinouts in the 227 pin Land Grid
Array (LGA). The second table lists the 82360SL package device pinouts in
the 196 lead JEDEC Plastic Quad Flat Package (PQFP). Both tables include
additional information for the signals and associated pin numbers. A brief
explanation of each column of the table is given in Table 2-1.


Table 2-1. Description of the Columns of Tables 2-2 and 2-3

Column          Description

PQFP            This column lists the pin numbers of the 82360SL in a Plastic Quad Flat Package.

LGA             This column lists the pin numbers of the 386 SL CPU in a Land Grid Array.

Signal Name     This column lists the signal name associated with the package pins.

Type            Indicates whether the pin is an Input (I), an Output (O), or an Input-Output (IO).

Term            Specifies the internal terminator on the pin. This could be an internal pull-up or
                pull-down resistor value or a hold circuit. To find out whether a pull-up or a
                pull-down is provided, use the STPCK (Stop Clock) column.

Drive           Specifies the drive current IOH (Current Output Logic High) and IOL (Current Output
                Logic Low) in milli-Amperes (mA) for output (O) and bi-directional (IO) pins.

Load            This column lists the maximum specified capacitive load which the buffer can
                directly drive in pico-Farads (pF) for each signal. This is specified for output
                and input-output pins only.

Susp.           This column specifies the state of the pin during a suspend operation. Input
                signals have the representation Tri/x where x is either a logic 0 or logic 1. This
                indicates that the input is internally isolated and that the internal termination
                on the pin is tri-stated or disabled. When in Suspend Mode an external logic value
                x is forced to the internal logic. The input can be driven to the same logic HIGH
                or LOW state by external logic with no current source or sink. The additional
                output buffer abbreviations are explained below.
                  Tri.  - Tristated
                  Actv. - Active
                  0     - held low
                  1     - held high
                  Hold  - held at last state

STPCK           This column specifies the state of the pin when the clock signal CPUCLK is
                internally stopped in the 386 SL CPU.
                  Pu    - Pulled up
                  Pd    - Pulled down
                  Drv   - Driven high, low or at the last state
                  Actv  - Active (Signal is driven and continues to operate or change logic states)

ONCE            This column specifies the state of the pin when the ONCE# (On Circuit Emulator)
                pin is asserted, allowing in-circuit testing while the device is still populated
                on the logic board.
                  Tri   - Floats
                  Actv. - Active
                  0     - held low
                  1     - held high
                  Hold  - held at last state

Derating Curve  This column specifies which derating curve(1) is used for each output buffer
                associated with the pin.

NOTE:
1. For more information on derating curves and how to use them, see Section 8 (Capacitive Derating Information).
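

When the pin tables are processed by a script, for example to check a board netlist against suspend behaviour, the suspend-state notation described in Table 2-1 can be decoded mechanically. The Python sketch below does this under stated assumptions: the function name `parse_suspend_state` and the returned dictionary layout are hypothetical, and only the abbreviations listed in Table 2-1 are handled.

```python
# Output-buffer abbreviations from Table 2-1 (trailing periods are optional).
OUTPUT_STATES = {
    "Tri": "tri-stated",
    "Actv": "active",
    "0": "held low",
    "1": "held high",
    "Hold": "held at last state",
}


def parse_suspend_state(entry):
    """Decode one suspend-column entry into a small dictionary.

    'Tri/x' marks an input whose termination is tri-stated or disabled and
    whose internal logic sees the value x while suspended.
    """
    text = entry.strip()
    if text.startswith("Tri/"):
        level = text[len("Tri/"):]
        if level not in ("0", "1"):
            raise ValueError("unknown forced level: " + entry)
        return {"kind": "input", "termination": "tri-stated",
                "forced_level": int(level)}
    key = text.rstrip(".")
    if key not in OUTPUT_STATES:
        raise ValueError("unknown suspend state: " + entry)
    return {"kind": "output", "state": OUTPUT_STATES[key]}


if __name__ == "__main__":
    for sample in ("Tri/1", "Hold", "Actv.", "0"):
        print(sample, parse_suspend_state(sample))
```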


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Figure 2-1. Pin Assignments of the 386™ SL CPU in the 227-Lead LGA Package 
(Top View—Land Pattern Facing Down, Component Marking Facing Up) 




[Pin diagram: 82360SL in the 196-lead PQFP, top view with component marking face up; the
pin 1, pin 49, pin 98, pin 147 and pin 196 positions are marked.]
240814-3


Figure 2-2. Pin Assignments for the 82360SL in a 196-Lead Plastic Quad Flat Package
Table 2-2. 386™ SL CPU Pin Characteristics

[Table body. Legible pin assignments: A02 STPCLK#, A08 CD5, A09 CD7, A11 CA7, A14 NPXCLK,
A15 NPXW/R#, A16 NPXRDY#, A17 CA10.]

NOTES:
2. Tri/O indicates a tri-stateable output with pull-down. 
3. CMUX 8-11 (RASxx#) are ACTIVE when the 386™ SL CPU Memory Controller is programmed in the DRAM controller 
mode with Suspend Refresh enabled. Otherwise, these signals are HOLD. 
Table 2-2. 386™ SL CPU Pin Characteristics (Continued)
Table 2-3. 82360SL Pin Characteristics (Continued)
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NOTE: 
1. Programmable, active only when suspend refresh is enabled. 
3.0 SIGNAL DESCRIPTIONS


386™ SL Microprocessor


The following table provides a brief description of the signals of the 386 SL CPU. Signal names which end with
the character “#” indicate that the corresponding signal is low when active.


Symbol      Name and Function


A20GATE     A20 Gate: This active HIGH input signal controls the 386 SL CPU A20 address line. When
LOW, this signal forces the 386 SL CPU to mask off (force LOW) the internal physical address
signal A20. When this signal is HIGH, the internal physical address signal A20 is available on
the System Address (SA) bus. Holding A20GATE inactive allows emulation of the 8086
1 Mbyte address “wrap-around”.


BALE        Bus Address Latch Enable (ISA bus signal): This active HIGH output signal is used for two
purposes. BALE is used to latch the address lines on the LA bus (LA17-LA23) on the falling
edge of BALE. BALE is also used to qualify ISA bus cycles for signals on the Peripheral
Interface (PI) bus (PM/IO# and PW/R#). On the falling edge of BALE, PM/IO# and PW/R#
can be sampled to determine the type of ISA bus cycle that is going to occur. BALE may be
used to qualify and generate buffered control and status signals to the ISA expansion bus. The
PI bus signal decoding is as follows:

Type of Bus Cycle          PM/IO#    PW/R#
Memory Read
Memory Write
I/O Read
I/O Write
Interrupt Acknowledge
HALT (address = 2)*
Shutdown (address = 0)*

*Note that BALE is not generated for these cycles; however, PM/IO# and PW/R# will
reflect these states during HALT and Shutdown bus cycles where BALE is driven in typical ISA
bus systems. Memory read/write, I/O read/write and interrupt acknowledge cycles
correspond to the standard ISA bus cycle.


BUSY#       BUSY: This active LOW input signal indicates a busy condition from a math co-processor
(MCP).


CA[15:1]    Cache Address Bus: This is the address bus output used to select the memory cell in the
cache memory. The CA2 signal is also connected to the CMD0# input of the MCP indicating
Opcode (when high) or Data (when low) during a write cycle and control/status register (high)
or data register (low) during a read. CA2 is used to address the upper or lower DWORD port of
the MCP.


CCSH#       Cache Chip Select High Byte: This active LOW output is used to enable the upper byte of the
cache SRAMs. This signal should be connected to the upper byte cache SRAM chip-select
input.


CCSL#       Cache Chip Select Low Byte: This active LOW output is used to enable the lower byte of the
cache SRAMs. This signal should be connected to the lower byte cache SRAM chip-select
input.


CD[15:0]    Cache Data Bus: This is the bi-directional data bus used to transfer data between the cache
SRAMs and the 386 SL CPU. The Cache Data bus is also used to transfer data between the
MCP and the 386 SL CPU.
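

The 8086 1 Mbyte “wrap-around” mentioned in the A20GATE description can be shown with a short address calculation. The Python sketch below models real-mode address formation and the effect of masking physical address bit A20. It illustrates the addressing behaviour only, not the pin timing; the function name `physical_address` and the boolean parameter `a20_enabled` are illustrative.

```python
A20_MASK = 1 << 20  # physical address bit 20


def physical_address(segment, offset, a20_enabled):
    """Form a real-mode address and optionally mask off A20.

    With A20 masked (forced LOW), addresses at or above 1 Mbyte wrap to the
    bottom of memory, as on an 8086 with its 20 address lines.
    """
    linear = ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)
    if not a20_enabled:
        linear &= ~A20_MASK
    return linear


if __name__ == "__main__":
    # FFFF:0010 is the first byte above 1 Mbyte.
    print(hex(physical_address(0xFFFF, 0x0010, a20_enabled=True)))
    print(hex(physical_address(0xFFFF, 0x0010, a20_enabled=False)))
```

Software written for the 8086 may rely on the wrapped result, which is why the masking has to remain selectable.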
386™ SL Microprocessor Signal Descriptions (Continued)
Symbol      Name and Function


CMUX0       CPU Multiplexed Pin Zero: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller then this pin becomes
“CASL3#” and should be connected to the lower byte of DRAM bank 3 CAS# input.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
signal becomes the direction control (DIR) and should be connected to the direction
control input of the SRAM data transceiver.


CMUX1       CPU Multiplexed Pin One: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller then this pin becomes
“CASH3#” and should be connected to the upper byte of DRAM bank 3 CAS# input.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
signal becomes “LE” and should be connected to the latch enable input of the SRAM
address latch. This pin is disabled when SUS_STAT# is active (LOW) and the system is
not performing a suspend refresh operation. When the pin is disabled the output is
sustained at the previous state by internal “keepers”.


CMUX2       CPU Multiplexed Pin Two: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller this pin becomes “CASL2#”
and should be connected to the lower byte of DRAM bank 2 CAS# input.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “DEN3#” and should be connected to the data transceiver enable input for
bank 3 of the SRAM memory subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX3       CPU Multiplexed Pin Three: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller this pin becomes “CASH2#”
and should be connected to the upper byte of DRAM bank 2 CAS# input.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “DEN2#” and should be connected to the data transceiver enable input for
bank 2 of the SRAM memory subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX4       CPU Multiplexed Pin Four: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller this pin becomes “CASL1#”
and should be connected to the lower byte of DRAM bank 1 CAS# input.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “DEN1#” and should be connected to the data transceiver enable input for
bank 1 of the SRAM memory subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX5       CPU Multiplexed Pin Five: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller this pin becomes “CASH1#”
and should be connected to the upper byte of DRAM bank 1 CAS# input.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “DEN1#” and should be connected to the data transceiver enable input for
bank 1 of the SRAM memory subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.
386™ SL Microprocessor Signal Descriptions (Continued)


Symbol      Name and Function


CMUX6       CPU Multiplexed Pin Six: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller this pin becomes “CASL0#”
and should be connected to the lower byte of DRAM bank 0 CAS# input.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “DEN0#” and should be connected to the data transceiver enable input for
bank 0 of the SRAM memory subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX7       CPU Multiplexed Pin Seven: This output signal has two functions. When the 386 SL
CPU Memory Controller Unit is configured as a DRAM controller this pin becomes
“CASH0#” and should be connected to the upper byte of DRAM bank 0 CAS# input.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “DEN0#” and should be connected to the data transceiver enable input for
bank 0 of the SRAM memory subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX8       CPU Multiplexed Pin Eight: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller this pin becomes “RAS3#”
and should be connected to the upper and lower byte of DRAM bank 3 RAS# inputs.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller then
this pin becomes “CE3#” and should be connected to the upper and lower byte of the
SRAM chip-select, or to the chip-select decode logic for bank 3 of the SRAM memory
subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX9       CPU Multiplexed Pin Nine: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller this pin becomes “RAS2#”
and should be connected to the upper and lower byte of DRAM bank 2 RAS# inputs.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “CE2#” and should be connected to the upper and lower byte of the
SRAM chip-select, or to the chip-select decode logic for bank 2 of the SRAM memory
subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX10      CPU Multiplexed Pin Ten: This output signal has two functions. When the 386 SL CPU
Memory Controller Unit is configured as a DRAM controller this pin becomes “RAS1#”
and should be connected to the upper and lower byte of DRAM bank 1 RAS# inputs.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “CE1#” and should be connected to the upper and lower byte of the
SRAM chip-select, or to the chip-select decode logic for bank 1 of the SRAM memory
subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.
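

The DRAM-mode and SRAM-mode roles of CMUX0 through CMUX10 described above can be collected into a lookup table, which is convenient when cross-checking a schematic. The Python sketch below transcribes the pin functions exactly as printed in the descriptions above. The names `CMUX_FUNCTIONS` and `pin_function` are illustrative and are not part of any Intel software.

```python
# (DRAM-controller function, SRAM-controller function), as printed above.
CMUX_FUNCTIONS = {
    0: ("CASL3#", "DIR"),
    1: ("CASH3#", "LE"),
    2: ("CASL2#", "DEN3#"),
    3: ("CASH2#", "DEN2#"),
    4: ("CASL1#", "DEN1#"),
    5: ("CASH1#", "DEN1#"),
    6: ("CASL0#", "DEN0#"),
    7: ("CASH0#", "DEN0#"),
    8: ("RAS3#", "CE3#"),
    9: ("RAS2#", "CE2#"),
    10: ("RAS1#", "CE1#"),
}


def pin_function(pin, mode):
    """Return the function of CMUX<pin> for mode 'DRAM' or 'SRAM'."""
    if pin not in CMUX_FUNCTIONS:
        raise ValueError("CMUX%d is not covered by this table" % pin)
    dram_name, sram_name = CMUX_FUNCTIONS[pin]
    if mode == "DRAM":
        return dram_name
    if mode == "SRAM":
        return sram_name
    raise ValueError("mode must be 'DRAM' or 'SRAM'")


if __name__ == "__main__":
    for pin in sorted(CMUX_FUNCTIONS):
        print("CMUX%-2d  DRAM: %-7s SRAM: %s"
              % (pin, pin_function(pin, "DRAM"), pin_function(pin, "SRAM")))
```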
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386™ SL Microprocessor Signal Descriptions (Continued) 


Symbol Name and Function 


CMUX11      CPU Multiplexed Pin Eleven: This output signal has two functions. When the 386 SL
CPU Memory Controller Unit is configured as a DRAM controller this pin becomes
“RAS0#” and should be connected to the upper and lower byte of DRAM bank 0 RAS#
inputs.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “CE0#” and should be connected to the upper and lower byte of the
SRAM chip-select, or to the chip-select decode logic for bank 0 of the SRAM memory
subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX12      CPU Multiplexed Pin Twelve: This output signal has two functions. When the 386 SL
CPU Memory Controller Unit is configured as a DRAM controller this pin becomes
“PARL” and should be connected to the lower byte of DRAM bank 0 data parity bit.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “OLE#” and should be connected to the lower byte of the SRAM output
enable input of the SRAM memory subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX13      CPU Multiplexed Pin Thirteen: This output signal has two functions. When the 386 SL
CPU Memory Controller Unit is configured as a DRAM controller this pin becomes
“PARH” and should be connected to the upper byte of DRAM bank 0 data parity bit.
When the 386 SL CPU Memory Controller Unit is configured as an SRAM controller this
pin becomes “OHE#” and should be connected to the upper byte of the SRAM output
enable input of the SRAM memory subsystem.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not
performing a suspend refresh operation. When the pin is disabled the output is sustained
at the previous state by internal “keepers”.


CMUX14      CPU Multiplexed Pin 14: This output signal has two functions. The 386 SL CPU can be
configured to use this pin as either a BIOS ROM chip-select (ROMCS1#), or a FLASH
disk chip-select signal (FLSHDCS#). In either case, the signal is driven LOW when an
access to the selected interface occurs.


COE#        Cache Output Enable: This active LOW output signal is used to indicate a read access
to the cache SRAMs, and is used to enable the cache SRAMs’ output buffers. This
signal should be connected to the output enable signals of the upper and lower byte
cache SRAMs.


CPURESET    CPU Reset: This active HIGH input forces the 386 SL CPU to execute a reset to the
internal CPU core and state machines. The configuration registers are not reset.


CWE#        Cache Write Enable: This active LOW output is used to indicate a read (HIGH) or write
(LOW) access to the cache SRAMs. This signal should be connected to the write enable
signal of the upper and lower cache SRAMs.


DMA8/16#    DMA 8-bit or 16-bit Cycle: This input, in conjunction with HRQ, indicates to the 386 SL CPU
whether an 8-bit or 16-bit DMA access is occurring. If an 8-bit DMA access is occurring, the
386 SL CPU will swap the upper byte of data to the lower data byte for upper byte
accesses.


External Frequency Input. This is an oscillator input. This clock controls all CPU core 
and memory controller timings and is equal to twice the desired processor frequency 
(CLK2 vs CPUCLK). 


ERROR # Numerics ERROR: This active LOW input to the 386 SL CPU is generated from a math 
co-processor (MCP). It also indicates to the 82360SL that an unmasked exception has 
occurred in the MCP. ERROR # is provided to allow numerics error handling compatible 
with the ISA bus compatible Personal Computer. 


HALT# HALT: This active LOW output indicates to external devices that the 386 SL CPU has
executed a HALT instruction (address = 2) or a shutdown condition (address = 0). This
can be used as an indicator for devices to assert the STPCLK# signal.

HLDA HoLD Acknowledge: This active HIGH output indicates to external devices that the
386 SL CPU has relinquished control of the ISA bus. At this time the 386 SL CPU has
floated the address and control signals of the ISA bus.

HRQ Hold ReQuest: This active HIGH input indicates to the 386 SL CPU that an external
device wishes to take control of the ISA bus.

INTA# INTerrupt Acknowledge: This active LOW output indicates that the 386 SL CPU is
executing an interrupt acknowledge bus cycle. During this process an external interrupt
device will pass an interrupt vector to the 386 SL CPU.


INTR INTerrupt Request: This active HIGH input indicates to the 386 SL CPU that an external
device is requesting the execution of an interrupt service routine.

IOCHRDY I/O CHannel ReaDY: This active HIGH input indicates that the I/O Channel (ISA
expansion bus) is ready to terminate the bus cycle. The ISA expansion bus is a normally
ready bus and IOCHRDY is active HIGH. When an ISA bus peripheral needs to extend
the standard 3 SYSCLK, 16-bit ISA bus cycle, the peripheral device asserts IOCHRDY
LOW.

IOCS16# I/O Chip Select 16: This active LOW input indicates that an ISA bus peripheral wishes to
execute a 16-bit I/O cycle. This signal has an active pull-up; when not driven the default
I/O bus cycle is 8 bits.

IOR# I/O Read: This active LOW signal indicates that the ISA bus is executing an I/O read
cycle.

IOW# I/O Write: This active LOW signal indicates that the ISA bus is executing an I/O write
cycle.

ISACLK2 ISA Clock Two: This is an oscillator input. This clock controls all of the ISA bus timings
and is equal to twice the SYSCLK frequency. Normally the ISA bus SYSCLK is 8 MHz,
and the ISACLK2 oscillator is 16 MHz.

LA[23:17] Latchable local Address bus: This is the unlatched local address of the ISA bus for
access to memory above 1 megabyte. The LA bus is also used by the Peripheral
Interface (PI) Bus.

MA[10:0] Memory controller Multiplexed Address bus: This is the address bus output for the
Memory Controller Unit. The 22-bit address is output in a row/column fashion for both
DRAM and SRAM memory subsystems. The Memory Controller Unit places the ROW
address out first and qualifies it by the RASx# signal going active in DRAM mode or the
LE signal going active in the SRAM mode. The column address is then placed on the
Memory Address bus and is qualified by the CASx# signals going active for the DRAM
mode.
This pin is disabled when SUS_STAT# is active (LOW). When the pin is disabled the
output is sustained at the previous state by internal “keepers”.
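
The row/column multiplexing described for the Memory Address bus can be illustrated with a short sketch. It assumes that the upper 11 bits of the 22-bit address form the row and the lower 11 bits form the column; the table does not state which address bits map to which half, and the function name is hypothetical.

```python
MA_WIDTH = 11  # width of the MA[10:0] bus in bits


def split_row_column(address: int) -> tuple[int, int]:
    """Split a 22-bit memory address into (row, column) values for an 11-bit bus.

    Assumption: the high half is the row and the low half is the column.
    The actual bit assignment is not given in the signal description.
    """
    if not 0 <= address < (1 << (2 * MA_WIDTH)):
        raise ValueError("address must fit in 22 bits")
    mask = (1 << MA_WIDTH) - 1
    row = (address >> MA_WIDTH) & mask   # placed on MA first, qualified by RASx#
    column = address & mask              # placed on MA second, qualified by CASx#
    return row, column
```

Each half is driven on the same 11 pins in turn, which is why a 22-bit address needs only the MA[10:0] lines.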
MASTER# Master: This active LOW input indicates that an ISA bus peripheral is controlling the bus.
The peripheral device asserts this signal in conjunction with a DMA request (DRQ) line or
the HRQ (hold request) to gain control of the bus. When the MASTER# signal is
asserted LOW along with HRQ being asserted HIGH or a DRQ line being asserted HIGH,
the 386 SL CPU will float all address, data and control signals on the ISA bus.

Memory controller local Memory Data bus: This is the bi-directional data bus of the
Memory Controller Unit. All accesses by the Memory Controller Unit that transfer data
between the 386 SL CPU and SRAM or DRAM use the Memory Data Bus.
This pin is disabled when SUS_STAT# is active (LOW) and the system is not performing
a suspend refresh operation. When the pin is disabled the output is sustained at the
previous state by internal “keepers”.

MEMCS16# MEMory Chip Select 16: This active LOW input indicates that an ISA bus peripheral
wishes to execute a 16-bit memory cycle. This signal has an active pull-up; when not
driven the default memory bus cycle is 8 bits.

MEMR# MEMory Read: This bi-directional active LOW signal indicates when a memory read
access is taking place on the ISA bus. When the 386 SL CPU is performing a memory
read to the ISA bus it is an output; when the DMA or Bus Master is accessing memory on
the ISA bus, the DMA device or Master drives MEMR#.

MEMW# MEMory Write: This bi-directional active LOW signal indicates when a memory write
access is taking place on the ISA bus. When the 386 SL CPU is performing a memory
write to the ISA bus it is an output; when the DMA or Bus Master is accessing memory on
the ISA bus, the DMA controller or Bus Master drives MEMW#.

N/C No connection: These pins must not be connected to any voltage, but must be left
floating in order to guarantee proper operation of the 386 SL CPU and to maintain
compatibility with future Intel Processors.

NMI Non-Maskable Interrupt: This rising edge sensitive input will latch a request to the
386 SL CPU for a non-maskable interrupt on a LOW-to-HIGH transition.


NPXADS# Numerics ADdress Strobe: This active LOW output signal indicates the start of a math
co-processor (MCP or NPX, numerics processor extension) data transfer cycle.

NPXCLK Numerics Clock: This output signal is used to drive the MCP clock input.

NPXRDY# Numerics Ready: This active LOW input is used to terminate a MCP (or NPX, numerics
processor extension) bus cycle. This signal is low for I/O and data operand MCP cycles.

NPXRESET Numerics Reset: This active HIGH output signal is used to reset the MCP.

NPXW/R# Numerics Write or Read: This output signal indicates the type of data transfer that is
being performed between the 386 SL CPU and the MCP. When high this signal indicates
a MCP write; when low this signal indicates a MCP read.

ONCE# ON-board Circuit Emulation: This active LOW input signal floats the necessary
outputs from the 386 SL CPU allowing an in-circuit emulation (ICE™-386™ SL) module
to drive the 386 SL CPU signals. This allows an emulator to be used for system testing
and development while the 386 SL CPU and the 82360SL are still physically populated
on the system motherboard. The state of all 386 SL CPU and 82360SL signals when
ONCE# is asserted low is summarized in section 2 (386 SL CPU and 82360SL signal
characteristics).


PCMD# PI-BUS Command: This active LOW output indicates that valid write data is on the
System data bus (SD[15:0]) signals, or that the 386 SL CPU is ready to accept valid read
data from the PI bus for Peripheral Interface bus cycles.


PEREQ Processor Extension Request: This active HIGH output signal indicates that the 386 SL
CPU has data to transfer to or from the MCP data FIFO.


PERR# Parity ERRor: This active LOW output indicates to an external device that the 386 SL
CPU Memory Controller Unit has detected a memory parity error. The PERROR# signal
is used by the 82360SL to generate NMI back to the 386 SL CPU.

PM/IO# PI-BUS Memory or I/O: This output indicates the type of bus cycle the 386 SL CPU is
executing on the Peripheral Interface Bus (PI bus): either a Memory (HIGH) or I/O
(LOW) cycle.

PRDY# PI-BUS Ready: This active LOW input is used to terminate Peripheral Interface bus
cycles. The Peripheral Interface Bus is a normally not-ready bus, and will continue the
bus cycle until PRDY# is activated or a Peripheral Interface Time-out occurs.


PSTART# PI-BUS START: This active LOW output indicates that the address (SA[19:0], LA[23:17]
and SBHE#), command signals (PM/IO# and PW/R#) and chip-selects (VGACS# or
FLSHDCS#) are valid for a Peripheral Interface Bus cycle.

PW/R# PI-BUS Write or Read: This output indicates the type of bus cycle the 386 SL CPU is
executing on the Peripheral Interface Bus: either a Write (HIGH) or Read (LOW) cycle.

PWRGOOD Power Good: This active HIGH input indicates that power to the system is good. This
signal is generated by the power supply circuitry, and a LOW level on this signal causes
the 386 SL to totally reset: the CPU core is reset, internal state machines are reset, and all
configuration registers are reset.
Power Good should be low for a specified minimum number of CPU clocks for valid
recognition in order to perform a global 386 SL CPU reset.

REFREQ REFresh REQuest: This active HIGH input indicates that the 386 SL CPU should
execute an internal DRAM refresh cycle to the on-board local memory.

ROM16/8# ROM 16-bits or 8-bits: This input configuration signal pin selects if the BIOS interface is
a 16-bit (when high) or 8-bit interface (when low). This pin has an internal pull-up resistor,
defaulting to a 16-bit wide BIOS EPROM.

ROMCS0# ROM Chip Select 0: This LOW true output provides the chip select for the System BIOS
EPROM.

SA[19:0] System Address Bus: This is the bi-directional system address of the ISA bus, as well
as the Peripheral Interface Bus. SA[16:0] are inputs during DMA and Master operation.
SA[19:17] are outputs only since an 8237 compatible DMA controller accesses up to
64 kBytes at a time. The 74LS612 module in the 82360SL is used to furnish the DMA
upper addresses for DMA access to 16 Megabytes.

SBHE# System Byte High Enable: When this output signal is LOW, it indicates that data is being
transferred on the upper byte of the 16-bit data bus (SD[15:8]).

SD[15:0] System Data Bus: This 16-bit bi-directional data bus is used to transfer data between the
386 SL CPU and the ISA bus. The system data bus is also used to transfer data between
the 386 SL CPU and the Peripheral Interface (PI-BUS).

SMI# System power Management interrupt: This falling edge sensitive input latches a Power
Management interrupt request with a HIGH-to-LOW edge. The SMI# is the highest priority
interrupt in the 386 SL processor.
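
The System Address Bus entry above notes that an 8237 compatible DMA controller reaches 64 kBytes at a time and that a page register supplies the upper address bits for access to 16 Megabytes. A minimal sketch of that composition follows. It assumes the conventional PC/AT arrangement for 8-bit channels (an 8-bit page value on A23..A16 and a 16-bit controller offset on A15..A0); the table itself does not spell out the bit split, and the function name is hypothetical.

```python
def dma_physical_address(page: int, offset: int) -> int:
    """Combine a page-register value with an 8237 offset into a 24-bit address.

    Assumption: 8-bit DMA channel, page supplies A23..A16, offset supplies A15..A0.
    """
    if not 0 <= page <= 0xFF:
        raise ValueError("page register value must fit in 8 bits")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("8237 offset must fit in 16 bits (64 kBytes)")
    return (page << 16) | offset
```

Because the offset is limited to 16 bits, a single transfer under this arrangement stays inside one 64 kByte page.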


SMRAMCS# System power Management RAM Chip Select: This active LOW output is used to
select an external system power management SM-RAM, and to indicate to the 82360SL
device when accesses to the system power management SM-RAM are occurring.

STPCLK# Stop Clock: This active LOW input stops the clock to the internal 386 CPU core. (This
signal is functionally tested by the execution of HALT or I/O read instructions.)

SYSCLK System Clock: This is a clock output equal to one half of the ISACLK2 input frequency.

SUS_STAT# SUSpend STATus: This active LOW input indicates to the 386 SL CPU that system
power is being turned off. The 386 SL CPU will respond by electrically isolating selected
pins as indicated in Section 2 (386 SL CPU signal characteristics).

TURBO Turbo: This active HIGH input signal indicates to the 386 SL CPU when to enter “Turbo
Mode”. Turbo Mode is defined as the CPU executing at full speed, the default speed for
the system. When this signal is forced inactive LOW, the 386 SL CPU executes from a
divide by two or a divide by four clock as defined by the De-turbo bit in the
CPUPWRMODE register. When this signal is HIGH, the CPU executes from a clock as
defined by the Fast CPU clock field in the CPUPWRMODE register.

Vcc System Power: Provides the +5V nominal D.C. supply inputs.

VGACS# VGA Chip-select: This active LOW output is asserted anytime an access occurs to the
user defined VGA address space.

Vss System Ground: Provides the 0V connection from which all inputs and outputs are
referenced.

WHE# Write High Enable: This active LOW output indicates that a write access to the upper
byte of the 386 SL CPU memory bus is occurring when the Memory Controller Unit is
configured for SRAM mode. When in DRAM mode, the signal is active anytime a write
access occurs. This output should be connected to the write enable of the upper byte for
either DRAM or SRAM memory subsystems. This pin is driven during a suspend
operation.

WLE# Write Low Enable: This active LOW output indicates that a write access to the lower
byte of the 386 SL CPU memory bus is occurring when the Memory Controller Unit is
configured for SRAM mode. When in DRAM mode, the signal is active anytime a write
access occurs. This output should be connected to the write enable of the lower byte for
either DRAM or SRAM memory subsystems. This pin is driven during a suspend
operation.

ZEROWS# ZERO Wait State (ISA bus signal): This active LOW input indicates that an ISA bus
peripheral wishes to execute a zero wait state bus cycle (the normal default 16-bit ISA
bus memory or I/O cycle is 3 SYSCLKs or one PC/AT equivalent wait state). When
ZEROWS# is driven low, a 16-bit bus cycle will occur in two SYSCLKs. When
ZEROWS# is driven low for an 8-bit memory cycle the default 6 SYSCLK bus cycle is
shortened to 3 SYSCLKs.
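
The cycle lengths quoted for ZEROWS# combine with the SYSCLK definition (one half of ISACLK2) to give concrete bus timings. The sketch below is only an illustration of that arithmetic: the function names are hypothetical, the 16 MHz default is the "normal" ISACLK2 value quoted in these tables, and any extension of the cycle through IOCHRDY is ignored.

```python
def sysclk_hz(isaclk2_hz: float) -> float:
    """SYSCLK is one half of the ISACLK2 input frequency."""
    return isaclk2_hz / 2


def isa_cycle_sysclks(width_bits: int, zero_wait_state: bool) -> int:
    """Return the ISA memory bus cycle length in SYSCLKs.

    16-bit: 3 SYSCLKs by default, 2 with ZEROWS# driven low.
    8-bit:  6 SYSCLKs by default, 3 with ZEROWS# driven low.
    """
    if width_bits == 16:
        return 2 if zero_wait_state else 3
    if width_bits == 8:
        return 3 if zero_wait_state else 6
    raise ValueError("ISA bus cycles are 8 or 16 bits wide")


def isa_cycle_ns(width_bits: int, zero_wait_state: bool,
                 isaclk2_hz: float = 16_000_000) -> float:
    """Cycle duration in nanoseconds for a given ISACLK2 frequency."""
    return isa_cycle_sysclks(width_bits, zero_wait_state) * 1e9 / sysclk_hz(isaclk2_hz)
```

With a 16 MHz ISACLK2 (8 MHz SYSCLK), a default 16-bit cycle therefore lasts 375 ns and a zero wait state 16-bit cycle lasts 250 ns.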
3.0 SIGNAL DESCRIPTIONS (Continued)

82360SL ISA Peripheral I/O

The following table provides a brief description of the signals of the 82360SL I/O. Signal names which end
with the character “#” indicate that the corresponding signal is low true when active.

Symbol Name and Function


A20GATE A20 Gate (direct to CPU): This active HIGH output signal forces the 386 SL CPU to
mask off A20 on the system address bus (internal to the 386 SL CPU), to allow emulation
of an 8086.

AEN Address ENabled (ISA bus signal): This active HIGH output indicates a DMA access,
refresh or I/O access to a non-standard ISA peripheral I/O address location. The
82360SL drives this signal high to signify a valid DMA address. It is used by bus slaves to
decode I/O ports. All ports must be decoded for AEN low. There are no DMA cycles to
addressed I/O ports.

BALE Buffered Address Latch Enable (ISA bus signal): This active HIGH input to the
82360SL is driven by the 386 SL CPU during standard ISA bus cycles. During ISA bus
memory and I/O cycles BALE is used to indicate valid addresses at the start of a bus
cycle. SA[19:0] are valid on the falling edge and LA[23:17] are valid while BALE is high.
BALE is also driven high by the 386 SL CPU and remains high during DMA cycles.

BATTDEAD# BATTery DEAD: This active LOW input indicates that the battery does not have enough
power to resume or reset. This signal will prevent a system reset if asserted LOW.

BATTLOW# BATTery LOW: This active LOW input indicates that the battery power is low.
BATTLOW# is typically driven by a D.C. to D.C. power converter associated with the
battery power supply. A thermal power monitor indicates that the main battery power is
dropping below the adequate charge level to sustain operation. If this signal is asserted
LOW with BATTWARN# asserted LOW a SMI request will be generated. The feature is
enabled via S/W control. The signal will also prevent a resume operation if asserted
LOW.

BATTWARN# BATTery WARNing: This active LOW input indicates the battery has minimal charge left
(e.g. one half an hour of full power use remaining).

C8042CS# Keyboard controller Chip Select: This active LOW output is driven when there is an I/O
read or write to the Keyboard Controller Ports 60 or 64 hex.

COM(A,B)CTS# Clear To Send: This active LOW input indicates to the Serial Port Controller for COMA or
COMB that a serial device is clear to accept data. This signal is typically used for a
modem control function. A change in the state of this signal generates a modem status
interrupt. The modem or data set asserts this signal when it is ready to accept data for
transmission.

COM(A,B)DCD# Data Carrier Detect: This active HIGH input indicates that the Serial Port Controller
COMA or COMB has detected a data carrier from the data set of a serial device. Typically
this signal is from a modem.

COM(A,B)DSR# Data Set Ready: This active LOW input signal is used by the modem or data set to
indicate that the modem or data set is ready to establish the communication link and
transfer data with the Serial Port Controller.

COM(A,B)DTR# Data Terminal Ready: This active LOW output signal informs the modem or data set that
the Serial Port Controller is ready to communicate.

COM(A,B)RXD Serial data Receive: This input signal is used to receive serial data. Each character can
consist of from five to eight bits of data with one start bit and one, one and a half or two
stop bits. The least significant bit is received first.
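
The character format described for the serial data signals (one start bit, five to eight data bits sent least significant bit first, then one, one and a half or two stop bits) can be sketched as follows. The function names are hypothetical, and the start bit level of 0 follows the usual asynchronous serial convention, which this table does not state explicitly.

```python
def frame_bits(value: int, data_bits: int = 8) -> list[int]:
    """Return the start bit followed by the data bits, least significant bit first.

    Assumption: the start bit is a logic 0, as in conventional async serial links.
    """
    if data_bits not in (5, 6, 7, 8):
        raise ValueError("a character carries five to eight data bits")
    if not 0 <= value < (1 << data_bits):
        raise ValueError("value does not fit in the selected character size")
    start = [0]
    data = [(value >> i) & 1 for i in range(data_bits)]  # LSB first
    return start + data


def frame_bit_times(data_bits: int = 8, stop_bits: float = 1) -> float:
    """Total frame length in bit times: start + data + stop."""
    if data_bits not in (5, 6, 7, 8):
        raise ValueError("a character carries five to eight data bits")
    if stop_bits not in (1, 1.5, 2):
        raise ValueError("stop bits must be 1, 1.5 or 2")
    return 1 + data_bits + stop_bits
```

For example, an 8-bit character with one stop bit occupies 10 bit times on the line, so the payload rate is 80% of the baud rate.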
82360SL ISA Peripheral I/O Signal Descriptions (Continued)

Symbol Name and Function


COM(A,B)RI# Ring Indicator: This active LOW input signal is used for a modem control function. A
change in the state (either from high to low or from low to high) of this signal generates a
modem status interrupt. The modem or data set asserts this signal to indicate that it has
detected a telephone ring. This will cause the 82360SL to wake the 386 SL CPU from a
suspended state if modem ring is enabled as a wake-up event.

COM(A,B)RTS# Request To Send: This active LOW output signal informs the modem or data set that the
Serial Port Controller is ready to send data.

COM(A,B)TXD Serial data transmission: This output signal is used to transmit data serially between the
Serial Port Controller and serial device. Each character can consist of five to eight bits of
data with one start bit and either one, one and a half, or two stop bits. The least
significant bit is transmitted first. The control of the format of a character is defined under
S/W control via the Line Control Register. Please consult the 386 SL Microprocessor
SuperSet Programmer’s Reference Manual for additional information. Information
regarding the functional timing specifications of transmitted and received serial data may
be found in sections 6 and 7 (A.C. timing specifications and timing diagrams).

COMX1,COMX2 Crystal oscillator input and output pins: The crystal attached to these signals should
be tuned to 1.8432 MHz. The on-chip oscillator uses an external crystal and tank circuit to
generate an internal clock. This clock is used to generate the various baud rates for the
serial ports. Optionally an external oscillator may be connected to the COMX1 input.
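
As a rough illustration of how a 1.8432 MHz clock yields the common baud rates, the sketch below computes a baud divisor. It assumes the serial port divides its input clock by 16 times the programmed divisor, as 16450-class UARTs conventionally do; this table does not state the divider scheme, so both the factor of 16 and the function name are assumptions.

```python
COM_CLOCK_HZ = 1_843_200   # crystal frequency given for COMX1/COMX2
SAMPLES_PER_BIT = 16       # assumption: 16x baud clock, as in 16450-class UARTs


def baud_divisor(baud: int) -> int:
    """Return the integer divisor that produces `baud` from the 1.8432 MHz clock.

    Raises ValueError when the rate cannot be generated exactly.
    """
    if baud <= 0:
        raise ValueError("baud rate must be positive")
    divisor, remainder = divmod(COM_CLOCK_HZ, SAMPLES_PER_BIT * baud)
    if divisor == 0 or remainder != 0:
        raise ValueError(f"{baud} baud is not an exact division of the clock")
    return divisor
```

Under that assumption 1.8432 MHz is a convenient choice because it divides evenly into the standard rates, for instance a divisor of 12 for 9600 baud.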


CPURESET CPU RESET: This active HIGH output is connected directly to the 386 SL CPU to provide
a reset of the 386 CPU core. CPURESET always occurs during a PWRGOOD reset.
CPURESET may also be generated by RC# from a keyboard controller, Fast Reset from
I/O Port 92 or other programmable Reset, or a resume from suspend.

CX1,CX2 Crystal oscillator input and output pins: The crystal should be tuned to 14.31818 MHz.
It is used for the ISA bus signal OSC and is internally divided by 12 to clock the
timer counters. The oscillator input may be directly driven from an external source.

DACK[7:5],[3:0]# DMA ACKnowledge channel n (ISA bus signal): The 82360SL DMA controller drives
the respective DMA acknowledge signal low after a device has requested DMA service.
The corresponding output signal indicates that the DMA channel transfer may begin.

DMA8/16# DMA 8-bit or 16-bit cycle: This output signal is directly connected to the 386 SL CPU.
When the signal is HIGH it indicates that the current DMA cycle is 8-bit. When this signal
is low it indicates that the DMA cycle is using a 16-bit channel.

DRQ[7:5],[3:0] DMA ReQuest channel n (ISA bus signal): These input signals are used to request
DMA service from devices residing on the ISA bus. An ISA bus device drives this signal to
request service from the appropriate DMA channel by asserting this signal high.


ERROR# MCP ERROR: This signal is an active LOW input to the 82360SL. The math coprocessor
error signal generates an IRQ13 through the 82360SL.

EXTSMI# EXTernal System Management Interrupt request: This active LOW input will generate a
SMI request if the function is enabled.

EXTRTCAS EXTernal RTC Address Strobe: This output signal is active HIGH when there is a write
access to the RTC I/O address port and when an external RTC is selected.

EXTRTCDS EXTernal RTC read Data Strobe: This output signal is active LOW when there is a read
access to an external RTC I/O data port and when an external RTC is selected.

EXTRTCRW# EXTernal RTC (Real Time Clock) Read/Write: This low true output signal is active
when there is a write access to an external RTC I/O data port and when an external RTC
is selected.


Floppy Chip Select: This LOW true output signal [...]
controller I/O ports 03F0-03F5 and 3F7 hex.


HALT# HALT: This LOW true input signal is driven by the 386 SL CPU and indicates when the
CPU has executed a HLT instruction (address = 2) or is in a shutdown condition
(address = 0).


HD7 HD-bus Data bit HD7: The bi-directional System Data Bit 7 is controlled separately for
the Integrated Drive Electronics (I.D.E.) hard disk drive and floppy disk drive. This is
provided to accommodate the I/O address 3F7 hex which is split between the floppy disk
drive controller and I.D.E. hard disk. Data transfer between storage peripherals
connected to the I.D.E. Hard Disk and Floppy Disk and the 82360SL are on separate
busses. Data bit 7 has to be separated from data bits [6:0]. The 82360SL controls and
buffers data bit 7 separately.

HDCS[1:0]# Hard Disk Chip Select: These LOW true output signals are the I.D.E. hard disk drive chip
selects decoded from the I/O address ports 01F0-01F7h (HDCS0#) and 03F6-03F7h
(HDCS1#).

HDEN(H,L)# Hard Disk buffer ENable: These LOW true output signals control the I.D.E. hard disk
data buffers, high and low bytes.

HLDA HoLD Acknowledge (direct to CPU): This HIGH true input signal indicates that the 386
SL CPU has released the ISA bus for refresh, DMA or master cycles.
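
The I/O port ranges quoted for the keyboard controller, floppy and hard disk chip selects can be collected into a small decode sketch. This is only an illustration of the address ranges listed in the table, not the 82360SL's actual decode logic. The label `FLOPPY_CS` is a hypothetical stand-in because the floppy select's symbol is not legible here, and pairing HDCS0# with 01F0-01F7h is inferred from HDCS1# being tied to 03F6-03F7h.

```python
def decode_chip_selects(port: int) -> set[str]:
    """Return the names of the chip selects whose listed I/O range contains `port`."""
    selects: set[str] = set()
    if port in (0x60, 0x64):
        selects.add("C8042CS#")    # keyboard controller ports 60 and 64 hex
    if 0x3F0 <= port <= 0x3F5 or port == 0x3F7:
        selects.add("FLOPPY_CS")   # hypothetical name for the floppy chip select
    if 0x1F0 <= port <= 0x1F7:
        selects.add("HDCS0#")      # assumed pairing, see lead-in
    if 0x3F6 <= port <= 0x3F7:
        selects.add("HDCS1#")
    return selects
```

Port 3F7 hex falls in both the floppy and the hard disk ranges, which matches the HD7 entry's remark that this address is split between the two controllers.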


HRQ Hold ReQuest (direct to CPU): This active HIGH output signal indicates a request to the
386 SL CPU to release the ISA bus when a peripheral requests the bus for ISA bus style
refresh, DMA or master mode cycles.


This pin is multiplexed. It can be used as Timer 2 gate 2 input or a speaker input from the
modem.


INTA# INTerrupt Acknowledge (direct to CPU): This active LOW input to the 82360SL
indicates that the 386 SL CPU has recognized an interrupt and will initiate an interrupt
acknowledge bus cycle. The INTA bus cycle is comprised of two eight-bit I/O cycles in
which the interrupt vector is transferred on the second eight-bit I/O write of the INTA cycle.


INTR INTerrupt Request (direct to CPU): This active HIGH output requests a standard
maskable interrupt to the 386 SL CPU.

IOCHCK# IO CHannel ChecK (ISA bus signal): This maskable active LOW input is driven by a
device on the ISA bus; it is typically used to indicate a parity error on the ISA bus. This signal
is one of the possible sources which may generate an NMI. NMI generation via IO
Channel Check may be enabled or disabled using PORT 61 (IOCKEN). NMI may be
masked using the ISA bus compatible NMI control port at I/O 70 hex bit 7.


IOCHRDY I/O CHannel ReaDY (ISA bus signal): This active HIGH input is used by the 82360SL
DMA controller to extend ISA bus cycles. IOCHRDY is also used to extend bus cycles for
I/O device trapping. Additional wait states extend the bus cycle, allowing for start up
during Resume mode. The ISA bus is a normally ready bus; an external device can
extend a DMA cycle or ISA bus cycle by driving this signal inactive (low). This signal is
normally high on the ISA bus.


IOCS16# 16-bit I/O Chip Select (ISA bus signal): This active LOW input signal to the 82360SL is
used to indicate a 16-bit I/O bus cycle. The I.D.E. hard disk high byte buffer enable is
generated when IOCS16# is driven low during an I.D.E. 16-bit I/O access. IOCS16# is
also an input to the 386 SL CPU driven by devices residing on the ISA bus to indicate a
16-bit I/O bus cycle.

IOR# I/O Read (ISA bus signal): This bi-directional active LOW signal is an input during
normal accesses to I/O ports. When low this signal indicates an I/O read. This signal is
an output from the 82360SL during DMA bus cycles for I/O to memory transfers.

IOW# I/O Write (ISA bus signal): This bi-directional active LOW signal is an input during
normal accesses to I/O ports. When low this signal indicates an I/O write. This signal is
an output from the 82360SL during DMA bus cycles for memory to I/O transfers.

IRQ[15, 14, 12-3, 1] Interrupt ReQuest n (ISA bus signal): These active HIGH input signals are used to
request interrupt service. The interrupt request lines are driven by devices on the ISA bus
which have a corresponding interrupt service routine associated with the interrupt vector
and interrupt request.

KBDA20 KeyBoarD A20 gate: This active HIGH input is “ORed” with internal bits to produce
A20GATE which goes to the 386 SL CPU. The bit is connected to port 2, bit 1 of an 8042
in a standard ISA bus compatible system.
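
The effect of masking address bit 20 for 8086 emulation, and the OR that combines KBDA20 with the internal bits, can be sketched as below. The function names are hypothetical, and the sketch deliberately takes a plain `mask_a20` flag instead of modelling the A20GATE pin level, since the polarity should be taken from the signal descriptions themselves.

```python
A20_BIT = 1 << 20  # address line 20, the first line above the 8086's 1 megabyte range


def combine_a20_sources(kbda20: bool, *internal_bits: bool) -> bool:
    """OR the keyboard controller input with any internal A20 control bits."""
    return kbda20 or any(internal_bits)


def effective_address(address: int, mask_a20: bool) -> int:
    """Return the address seen on the bus, with bit 20 forced to 0 when masked."""
    return address & ~A20_BIT if mask_a20 else address
```

With masking in effect, the real mode address FFFF:FFFF (linear 10FFEF hex) wraps to 00FFEF hex, which is the wraparound behavior an 8086 exhibits.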


KBDCLK KeyBoarD CLocK: This output signal is used to drive the clock input to the keyboard
controller. It is derived from the 8 MHz SYSCLK and can be divided by 1, 2, 4 or stopped.

LA[23:17] Local Address bus (ISA bus signal): These are input signals to the 82360SL during
memory transfers (decoding for X-bus buffer controls) and output signals during DMA
accesses and refresh. The latchable address lines allow access to physical memory on
the ISA bus to 16 megabytes.

LPTACK# Line PrinTer ACKnowledge: Active LOW input signal which is part of the parallel port
data handshake. The line printer asserts this signal to show that data transfer was
complete and that it is ready for the next transfer.

LPTAFD# Line Printer Auto line FeeD: This signal is an active LOW output from the 82360SL to a
printer. When asserted, it instructs the printing device to insert a line feed at the end of
every line.

LPTBUSY Line PrinTer BUSY: This signal is an active HIGH input to the 82360SL. The printer asserts
this signal when it is not ready to accept further data from the 82360SL.

LPTD[7:0] Line printer Data bus: These signals are the 8-bit bi-directional data bus for the parallel
port. In PC/AT mode these signals are output only. The 82360SL also supports a
bidirectional mode for the PS/2 style parallel port.

LPTDIR Line PrinTer DiRection: This active HIGH output signal is only valid in bidirectional
mode for data transfer using the parallel port.

LPTERROR# Line PrinTer ERROR: This active LOW input signal is driven by a peripheral device to
flag an error condition.

LPTINIT# Line PrinTer InITialize: This active LOW output from the 82360SL instructs the peripheral to
initialize itself.

LPTPE Line PrinTer Paper End: This active HIGH input to the 82360SL signals that the printer has
run out of paper when asserted.

LPTSLCT Line PrinTer SeLeCTed: This active HIGH input signal is asserted by the printer to
confirm that it has been selected.


LPTSLCTIN# Line DrinTer Sel eCT IN: This 


LPTSTROBE # 


interfaced to the parallel port. 
MASTER # 
MEMR # 


Line PrinTer STROBE: This active LOW output signal is used to strobe data into the 
peripheral device. The parallel port controls are read and written through I/O registers. 


ISA bus MASTER (ISA bus signal): This active LOW input signal is used with a DRQ line 
to gain control of the system bus. Upon receiving DACK# the 82360SL may pull 
MASTER # active (low), which will allow the 82360SL control of the system address, data 
and control busses. The 386 SL CPU will have tri- Stated these lines one clock after 
receiving the MASTER # signal. | 


MEMory cycle Read (ISA bus signal): This bi-directional active LOW signal indicates a 
read cycle anywhere in the 16 Mbyte memory address space. During memory read cycles 
to memory on the ISA bus, this signal is an. ia into the 82360SL. MEMR # is driven by 

the 82360SL during DMA cycles. 


MEMory cycle Write (ISA bus signal): This bi- directional active LOW signal indicates a 
write cycle anywhere in the 16 Mbyte memory address space. During memory write 
cycles to memory on the ISA bus, this signal i is an input. MEMW # is an output from the © 
82360SL during DMA cycles. — | 


No Connection: These signals must not be eonneciad to any Ee The No 
Connection signals must be left floating in order to guarantee proper operation of the 
82360SL and compatibility with future Intel processors. 


Non Maskable Interrupt (direct to CPU): This active HIGH output is directly connected 
to the 386 SL CPU. The 82360SL asserts NMI to request the 386 SL CPU to service a 

high priority non-maskable interrupt. The low to high transition of this signal is recognized 
by the 386 SL CPU. 


ON-board Circuit Emulation: This active LOW input pin floats the appropriate outputs of
the 82360SL as indicated in Section 2 pin assignments. When ONCE # is driven active
the 82360SL allows an In-Circuit emulator (ICE™-386™ SL) module to drive its signals.
This allows the system to be tested while the 82360SL is still physically populated on the
motherboard.


OSCillator (ISA bus signal): This is the 14.31818 MHz output signal with a 50% duty
cycle and is asynchronous to SYSCLK.


Parity ERRor (direct from CPU): This active LOW input signal is connected to the
output of the 386 SL CPU. When the 386 SL CPU detects a parity error from the local
DRAM subsystem it drives this signal to the 82360SL. The system memory parity error
will generate an NMI via the 82360SL when NMI is enabled via I/O port 70 hex bit 7.
SMRAMCS # 


System Management Interrupt (direct to CPU): This active LOW output is directly 
PWRGOOD- | 


connected to the 386 SL CPU. When the falling edge of SMI # is detected by the 386 SL
CPU it generates the highest priority interrupt when enabled. The typical use of SMI # is
for power management.


System Management RAM Chip Select: This active LOW output is driven whenever the 
386 SL CPU is accessing the System Management SM-RAM. It is active even when SM- 
RAM is part of the 386 SL CPU system memory RAM. The 82360SL uses the 
SMRAMCS # to determine when the SMI code is being executed on the ISA bus, and 
enables the X-bus control signals. 


PoWeR GOOD: This active HIGH input is typically supplied by the power supply. When 
Power good is activated high this indicates that the supply voltage is stable. Power Good 
low is also used to generate System Reset, RESETDRV, and CPURESET. 
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82360SL ISA Peripheral I/O Signal Descriptions (Continued) 


Symbol Name and Function 


RC# Reset CPU: This active low input is typically driven by the keyboard controller. RC # is 
“ORed” with internal bits to produce a programmable pulse width CPURESET signal. It is 
connected to port 2, bit 0 of an 8042 in a standard ISA bus compatible system.


REFREQ REFresh REQuest (direct to CPU): This active HIGH output signal is directly connected 
to the 386 SL CPU. When Refresh Request is asserted it indicates that the 386 SL CPU 
should refresh the local DRAM subsystem. 


REFRESH # System REFRESH (ISA bus signal): This active LOW input signal indicates a refresh 
cycle. It is driven for the duration of the cycle. It is an input during master generated
refresh bus cycles. 


RESETDRV RESET DRiVe (ISA bus signal): This active HIGH output is the main system cold reset, 
generated from the power supply “power good” signal and by system resume. 


RTCEN # RTC ENable: This active LOW input signal should be strapped high or low depending on 
whether an internal (LOW) or external (HIGH) RTC is used in the system. The 82360SL 
on-chip real time clock and CMOS RAM are enabled by this signal when LOW. 


RTCRESET # Internal RTC RESET input: This active LOW input signal is used to reset the internal 
RTC status and flag registers, (typically when the RTC battery has been changed). 


RTCVCC This is a separate power supply input for the internal RTC. It should be connected to a 3V 
battery when the system is fully off and 5V during active operation. 


RTCX1,RTCX2 RTC Crystal oscillator input and output pins: The crystal should be tuned to
32.768 kHz. It is used for the RTC and system power management state machines. The
oscillator may be driven directly from the input signal.


SA[16:0] System Address bus (ISA bus signal): The bi-directional system address bus is an 
input for decoding internal I/O registers and an output during DMA and refresh cycles.

SBHE # System Byte High Enable (ISA bus signal): The active LOW output signal indicates
when there is valid data on the upper data byte of the system data bus. 

SD[7:0] System Data bus (ISA bus signal): This is the bidirectional system data bus. The
82360SL directly drives the ISA bus system data bits [7:0] without external transceivers
or buffers. 8-bit data is transferred to and from the 82360SL with these signals.


SMEMR # System MEMory Read (ISA bus signal): This signal is driven by the 82360SL to signify 
a memory read cycle to the bottom 1 Mbyte address range. It is used by ISA bus


in a ne Ne ere 


SMEMW # System MEMory Write (ISA bus signal): This signal is driven by the 82360SL to signify
a memory write cycle to the bottom 1 Mbyte address range. It is used by ISA bus


SMOUT[5:0] System Management OUTput control: These six outputs can be connected to control 
the power circuits for various devices in the system. These output pins are directly 
controlled by the SM__OFF__CNTRL register. 


SPKR SPeaKeR output: This is the output of the 8254 megacell, timer/counter #1, channel 2,
or directly driven through IMUX0, or from the 8254 megacell, timer/counter #2, channel
1. This output signal is typically connected to an external speaker. There is additional
circuitry to ensure that the signal is low when not being used.




SRBTN# Suspend/Resume BuTtoN: This active LOW input generates an SMI requesting a system
suspend or resume.
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82360SL ISA Peripheral I/O Signal Descriptions (Continued) — 


Symbol Name and Function


STPCLK# SToP Clock: This active LOW output signal stops the clock to the 386 CPU core of the
386 SL Microprocessor. Stop clock is connected to the 386 SL CPU from the
82360SL. The 82360SL activates this signal upon detection of a halt bus cycle or when
an I/O read to the stop clock register in the 82360SL occurs.


SUS__STAT # SUSpend STATus: The 82360SL power management controls this active low output
signal to switch the power off to all non-critical devices during a suspend.


SYSCLK SYStem CLocK (ISA bus signal): This signal is an output from the 386 SL CPU and an
input to the 82360SL. The SYSCLK signal is used to clock the ISA bus state machines
and is also used to derive the internal DMA clock signal in the 82360SL. The SYSCLK is
the 8 MHz typical clock which is one half of the frequency of ISACLK2.


Terminal Count (ISA bus signal): This active HIGH output signal is used to indicate the
termination of a DMA transfer.


TIM2CLK2 TiMer 2 CLK: This is the input clock for timer/counter #2 when it is programmed to be
used in the General Purpose (GP) mode.


TIM2OUT2 TIMer 2 OUTput: This signal is the frequency output from timer/counter #2 and can be
used as a general purpose timer/counter output.


Vcc System Power: Provides the +5V nominal D.C. supply inputs for the 82360SL.


System Ground: Provides the 0V connection from which all inputs and outputs are
referenced.


X-bus Data bit XD7: I/O port 3F7h is split between the floppy and hard disk, and the
storage peripherals which transfer data reside on separate busses. Data bit XD7 is
separated from bits XD[6:0]. The 82360SL separately controls and buffers bit XD7 to
isolate data bit 7 from the floppy disk and I.D.E. hard disk.


X-bus Data ENable: This active LOW output signal is used to control the X-bus data 
transceiver. It is only activated by the 82360SL on valid accesses to X-bus peripherals. 


X-bus data DiRection: This active HIGH output signal controls the direction of the X-bus 
and HD-bus data transceivers. XDIR is high for read cycles. 


ZERO Wait State (ISA-bus signal): This active LOW output signal is driven by the 
82360SL when it can accept a zero wait state write cycle. 


ZEROWS#
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4.0 PACKAGE THERMAL 
SPECIFICATIONS 


The SL SuperSet is specified for functional opera- 
tion with a temperature range from 0 to 90 degrees 
Celsius for the 386 SL CPU and the 82360SL. The
case temperature should be measured in the operat- 
ing environment to determine whether the SL Super- 
Set is within the specified operating temperature 
range. The case temperature should be measured at 
the center of the top surface of the package. When 
the SL SuperSet devices have a supply voltage ap- 
plied the operating temperature range is applicable 
rather than the storage temperature. 


The following definitions and assumptions are used 
to determine the recommended maximum case tem- 
perature for the 386 SL CPU and 82360SL: 


Ta = Ambient temperature in degrees Celsius
Tc = Case temperature in degrees Celsius
θjc = Package thermal resistance between junction and case
θja = Package thermal resistance between junction and ambient
Tj = Junction temperature
P = Power consumption in Watts


The ambient temperature can be evaluated by using
the values of thermal resistance between junction
and case, θjc, and the thermal resistance between
junction and ambient, θja, in the following equations:


Tj = Tc + P * θjc
Ta = Tj - P * θja
Tc = Ta + P * [θja - θjc]


Values for θja and θjc are given in Table 4-1 for the
196-lead PQFP 82360SL and the 227-lead LGA
386™ SL CPU.
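
The three thermal relations above can be checked numerically. The following is a minimal Python sketch and not part of the original specification: the function names are hypothetical, the 2 W power figure is an illustrative assumption, and the thermal resistances (5 and 21 °C/W at zero airflow) are taken from the single data row that survives in Table 4-1.

```python
# Sketch of the Section 4.0 thermal relations.
# Assumptions (not from the datasheet): function names, the 2 W example power.

def junction_temp(t_case, power_w, theta_jc):
    """Tj = Tc + P * theta_jc"""
    return t_case + power_w * theta_jc


def ambient_temp(t_junction, power_w, theta_ja):
    """Ta = Tj - P * theta_ja"""
    return t_junction - power_w * theta_ja


def case_temp(t_ambient, power_w, theta_ja, theta_jc):
    """Tc = Ta + P * (theta_ja - theta_jc)"""
    return t_ambient + power_w * (theta_ja - theta_jc)


def max_ambient_for_case_limit(t_case_max, power_w, theta_ja, theta_jc):
    """Third equation rearranged: highest Ta that keeps Tc at or below the limit."""
    return t_case_max - power_w * (theta_ja - theta_jc)


THETA_JC = 5.0     # degrees C/W, from the surviving Table 4-1 row
THETA_JA = 21.0    # degrees C/W at 0 ft/min airflow, same row
POWER_W = 2.0      # hypothetical power dissipation
T_CASE_MAX = 90.0  # upper end of the specified case temperature range

ta_max = max_ambient_for_case_limit(T_CASE_MAX, POWER_W, THETA_JA, THETA_JC)
print("maximum ambient temperature:", ta_max, "degrees C")
```

Under these assumed values the three equations agree with each other: starting from the computed ambient temperature, `case_temp` returns the 90 °C case limit again.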


Table 4-1. Thermal Resistances (°C/W) θjc and θja


                          θja (°C/W) versus Airflow, ft/min (m/sec)
Package      θjc (°C/W)   0      200     400     600     800     1000
                          (0)    (1.01)  (2.03)  (3.04)  (4.06)  (5.07)


196L PQFP


5 21 18 13.5 11.8 | 10.5 9.5 
227L LGA | | 
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ABSOLUTE MAXIMUM RATINGS


Table 4-3 provides environmental stress ratings for
the packaged SL SuperSet devices. Functional operation
at the storage maximum and minimum ratings
is not implied or guaranteed.


Extended exposure to maximum ratings may affect
device reliability. Further, precautions should be taken
to avoid high static voltages and electric fields to
prevent static electric discharge.


Other system components such as the memory subsystem
(DRAM/SRAM), storage peripherals (hard
disk/floppy disk), I/O and display subsystem may
reduce the absolute maximum storage temperature
conditions due to the inherent physical characteristics
of the other components.


Table 4-3. Maximum Ratings


1. Storage Temperature
2. Case Temperature under Bias


NOTE:

1. Case temperature under Bias maximum rating also includes the case where the 386 SL CPU and
82360SL are in suspend or standby mode. In standby mode and in specific cases in suspend
mode, power is applied to the SL SuperSet for operation of the Real-Time Clock and DRAM refresh.
It is assumed in these cases that the SL SuperSet devices are not in normal or full-speed
operation. Typically at these extreme minimum and maximum temperature ranges the external
oscillators are stopped or disabled with the exception of the 32 kHz Real-Time Clock oscillator. The
limiting factor for minimum and maximum case temperature under Bias is the operational temperature
range supported by the RTC crystal and 82360SL on-chip oscillator. It is also assumed that
main system memory is not being accessed (only slow refresh for DRAM) or the SRAM is in standby
mode, and all other components used in the system are also capable of operating at these
maximum and minimum temperature values.


5.0 D.C. SPECIFICATIONS 


386™ SL CPU D.C. Specifications
Functional operating range: Vcc = 5V ± 10%; TCASE = 0°C to 90°C


| Table 5-1. D.C. Voltage Specifications | 


[Notes 
PV Inpt Low Voltage At 8 MHz 
Input High Voltage At 8 MHz 


C 6 3 
Ge V1 At8 MHz, | 
| CMOS Logic Levels | 
: V. At 8 MHz, 
CMOS Logic Levels 


Output Low Voltage 


= 24m At 8 MHz(1) 
dol Pinas callie ee“ YG At 8 MHz(2) 
Output High Voltage &" © 


At 8 MHz(1) 
At 8 MHz(2) 
At 8 MHz(2) 
At 8 MHz(1) 


lon = —2mA 
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Table 5-2. Leakage Current and Sustaining Current Specifications 


Input Leakage Current 
Condition 1: When SUS__STAT # 
and/or ONCE # not active. 


Symbol 
NL 


IBHL nite 
IBHH Input Sustaini 
(Bus Hold Higt*>* 


Bus Hold Low Overdrive | (Notes 3, 6) | 
Bus Hold High Overdrive 


a 
Input Capacitance 


Pins with internal 60k PU Vit = 0.45V 
Pins with internal 20k PD Vin = 2.4V 
Pins with internal 300 PU Vit = 0.45V 


Other Input Pins 


Condition 2: When SUS__STAT # 
and/or ONCE # active. . 

Pins with internal 60k PU 

Pins with internal 20k PD 

Pins with internal 300 PU 

Other Input Pins 


OV < Vin < Voc 


OV < Vin < Voc 
& OV < Vin < Voc 


OV < Vin < Voc 


Output Leakage Current 
Condition 1: When SUS__STAT # 
and/or ONCE # not active 
Pins with internal 60k PU 
Pins with internal 300 PU 
Other Output Pins 


Vout = 0.45V 
Vout = 0.45V 
0.45V < Vout < Vcc 


and/or ONCE #¥ active 
Pins with internal 60k R 


0.45V < Vout < Vcc 
0.45V < Vout < Vcc. 
— 0.45V < Vout < Voc 


Output or 1/O Capacitance 
EFI Capacitance 


NOTES: 

1. List of pins which have 24 mA/4 mA IOL/IOH specification, (reference Section 2).

2. Other output pins which do not belong to list in Note 1, (reference Section 2).

3. Tested with CPU Clock stopped.

4. This is the maximum current the bus hold circuit can sink without raising the node above 0.8V. IBHL should be measured
after lowering VIN to Ground (0V) and then raising to 0.8V.

5. This is the maximum current the bus hold circuit can source without lowering the node voltage below 3.0V. IBHH should
be measured after raising VIN to Vcc and then lowering to 3.0V.

6. An external driver must source at least IBHLO to switch this node from low to high.

7. An external driver must sink at least IBHHO to switch this node from high to low.

8. Not tested. Guaranteed by design characterization. 
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__ Table 5-4, 386™ SL CPU ee i 


Symbol 
loc (Note 1) 


"400 us | (Notes 2, 4) 
(Notes 3, 4) 


ae (Notes 5, 6)_ 


“A ey + 

| | | : Mok hak oar i “ 

| 7 Stell a — | i 
Eincll 9 


Supply ae Mode er 
OFF Running/ Suspqg@ Sfresh OFF 


Supply Current 
Minimum Configuration 
Maximum Configuration 


loc 


Ioce2 
loc3 


NOTES: 

1. Tested at EFI and ISACLK2 at maximum frequency, with 50 pF load and no resistive loads on the outputs.

2. Minimum System Configuration consists of 1 bank of 1 Megabyte x 4 DRAMs (2 Megabyte total memory), cache disabled
with no cache SRAM, 25 pF capacitive loading on the PI-bus control/status signals, 100 pF capacitive loading on the ISA-bus,
100 pF loading on the SYSCLK.

3. Maximum System Configuration consists of 4 banks of 4 Megabyte x 1 DRAMs (32 Megabytes total), cache enabled with
2 x (16k x 16) cache SRAMs, 65 pF capacitive loading on the PI-bus control/status signals, 300 pF capacitive loading (8
slots) on the ISA-bus and 300 pF capacitive loading on the SYSCLK signal.

4. Not tested, very conservative estimates provided from engineering analysis at worst case temperature and at 5.5V with
the described system configuration for comparison only.

5. Characterized with Vcc = 5.5V, EFI = 40 MHz, ISACLK2 = 16 MHz

6. 412.5 mW with 386 SL CPU with Stop Clock, all external oscillators are free running, there are no active bus cycles on the 
Cache, Memory or ISA busses. Internal logic such as the Cache and Memory Controller are unaffected by stopped or slow 
clock and continue to consume the fixed power represented in Icc1.

7. 55 mW with 386 SL CPU in suspend mode, all external oscillators are free running, there are no active bus cycles on the 
Cache, Memory or ISA busses except suspend refresh.

8. 33 mW with 386 SL CPU in suspend mode, all external oscillators are off (fixed Logic State), there are no active bus 
cycles on the Cache, Memory or ISA busses except suspend refresh. 

9. 27.5 mW with 386 SL CPU in suspend mode, all external oscillators are off (fixed Logic State), there are no active bus 
cycles on the Cache, Memory or ISA busses including suspend refresh. 
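
Notes 6 through 9 quote power figures rather than currents. As a rough cross-check, the sketch below converts them to supply current with I = P / V. It assumes that all four figures were characterized at the Vcc = 5.5V of Note 5; the notes do not state this individually, so treat the results as illustrative. The function and dictionary names are hypothetical.

```python
# Convert the power figures of Notes 6-9 to supply current, I = P / V.
# Assumption (not stated per note in the source): Vcc = 5.5V applies to all four.

VCC_VOLTS = 5.5


def supply_current_ma(power_mw, vcc_volts=VCC_VOLTS):
    """Return supply current in mA for a power in mW at the given supply voltage."""
    if vcc_volts <= 0:
        raise ValueError("supply voltage must be positive")
    return power_mw / vcc_volts


# Power figures as quoted in the notes (mW).
NOTE_POWER_MW = {
    "note 6, stop clock, oscillators running": 412.5,
    "note 7, suspend, oscillators running": 55.0,
    "note 8, suspend, oscillators off, suspend refresh": 33.0,
    "note 9, suspend, oscillators off, no refresh": 27.5,
}

for label, power_mw in NOTE_POWER_MW.items():
    print(f"{label}: {supply_current_ma(power_mw):.1f} mA")
```

Under that assumption the four figures correspond to round current values, which is consistent with the table itself being specified in milliamps.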
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386™ SL CPU Icc Specifications: 
Special Topics 


DETERMINING Icc WITH SLOW CLOCK 
CONTROL 


The 386 SL CPU supports CPU clock division which 
reduces power consumption of the CPU core logic. 
The EFI clock input is similar to the CLK2 input 
found on the 386 CPU. However, the internal
CPUCLK signal in the 386 SL CPU is not always one 
half of the frequency of the EFI (CLK2) input. An 
internal clock divider and synchronizer allows the 
CPU core clock to be slowed down and even 
stopped. However, additional internal logic such as 
the memory controller and cache controller continue 
to use half the EFI frequency. Therefore, when cal- 
culating the theoretical power consumption with 
CPU clock division it is important to recognize that a 
fixed constant (K) value of power is required by the 
386 SL CPU. 


The value K is constant only if the ISA bus loading is
constant. Figure 5-1 shows the value of K for different
values of ISA bus capacitance.


Icc(divided clock) = [Icc(normal clock) * clock divider fraction] + K


Clock divider fraction = the fractional value of the clock divider
(e.g., divide by 2 = 0.5)


K = A constant in milliamps, determined by
reading the value in Figure 5-1

To determine the maximum current for the 386 SL 
CPU with CLK2 divider perform the following steps: 


1. Multiply the Icc of the normal minimum system
configuration by the fractional value of the clock
divider.


2. Sum the total capacitive load of all active ISA bus
output signals from the 386 SL CPU to all devices.


3. In Figure 5-1, draw a vertical line from that total
on the horizontal axis (Capacitance) up to the point
where it intersects the diagonal line.


4. From that intersection, draw a perpendicular line to the
vertical axis to determine K.


5. Solve the equation for Icc (divided clock).
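
The five steps above can be sketched in Python. This is a hedged illustration only: it assumes the equation has the form Icc(divided clock) = Icc(normal clock) × fraction + K, which is what steps 1, 4 and 5 suggest, and every numeric value below is a hypothetical placeholder (the real Icc comes from the Icc table and K from Figure 5-1). The function name is also hypothetical.

```python
# Sketch of the divided-clock Icc estimate.
# Assumed form: Icc(divided) = Icc(normal) * (1 / divider) + K.
# All numeric inputs below are hypothetical, not datasheet values.

def icc_divided_clock_ma(icc_normal_ma, clock_divider, k_ma):
    """Estimate supply current (mA) with the CPU core clock divided.

    icc_normal_ma: Icc of the normal minimum system configuration (mA)
    clock_divider: integer divide ratio (1 = no division, 2 = divide by 2, ...)
    k_ma:          constant K read from Figure 5-1 for the total ISA bus load (mA)
    """
    if clock_divider < 1:
        raise ValueError("clock divider must be at least 1")
    fraction = 1.0 / clock_divider          # step 1: e.g. divide by 2 = 0.5
    return icc_normal_ma * fraction + k_ma  # step 5: solve the equation


# Hypothetical inputs: 300 mA normal Icc, divide by 2, K = 50 mA.
print(icc_divided_clock_ma(300.0, 2, 50.0), "mA")
```

K does not scale with the divider because, as the text explains, the memory controller and cache controller keep running at half the EFI frequency.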


Total ISA Bus Capacitance (pF)
240814-5


Figure 5-1. Variation of the constant current (K)
with respect to the total ISA bus capacitance

Icc WITH STOPPED CLOCK


Table 5-3. Icc Static


Iccs Supply Current
(static)


NOTE:

1. Tested while clock stopped in PH2 and inputs at Vcc or
Vss with the outputs unloaded. Clock stopped after I/O
Read at address 25H. EFI and ISACLK2 inputs should be
at Vcc or Vss.
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POWER VARIATIONS WITH CAPACITIVE LOADS AT VARIOUS VOLTAGES 


Figure 5-2. ISA Bus
(Power consumption (mW) versus capacitive load (pF), with the fully loaded ISA-bus slot marked; 240814-6)


Figure 5-3a. Memory Bus without Cache
(Power consumption (mW) versus capacitive load (pF) and number of DRAM memory devices, at 5.5V, 5.0V and 4.5V, 20 MHz; 240814-7)


Figure 5-3b. Memory Bus with Cache
(Power consumption (mW) versus capacitive load (pF), at 5.5V, 5.0V and 4.5V, 20 MHz; 240814-8)
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Calculation of Icc for Various SL SuperSet 
System Configurations 


A set of three curves with Vcc at 4.5V, 5V and 5.5V
is plotted in Figure 5-2. Figure 5-2 illustrates the
power consumption in milliwatts with respect to the
capacitive loading on the ISA bus signals of the
386 SL CPU. The CPUCLK is assumed to be
20 MHz and EFI input is 40 MHz. A similar set of
curves is provided for the memory bus without a
cache subsystem in Figure 5-3a. The power consumption
with respect to load capacitance for the
memory bus with a cache subsystem is illustrated in
Figure 5-3b. To find the power (P in milliwatts) of the
386 SL CPU for the configuration of your system,
use the following method.


1. Prepare a configuration list for your system including
how many ISA-bus connectors, how many
memory chips will be provided and whether a
cache will be connected or not.


2. From the curves in Figure 5-2, use the voltage of
your system and the total capacitive load of all of
the 386 SL CPU ISA signals to find the power
consumed by the ISA-bus interface.


3. If a cache is connected to the 386 SL CPU in your
system, use Figure 5-3b to find memory bus power.
If cache is not connected, use Figure 5-3a.


4. Find the internal power consumption of the
386 SL CPU from Table 5-4 and the cache internal
power and cache bus power from Tables 5-5
and 5-6.


5. For a system with no cache, add the ISA-bus interface
power, the memory bus interface power
without cache and the internal power. This gives
the power consumption of the 386 SL CPU without
cache.


6. For a system with cache, add the ISA bus interface
power, the memory interface power with
cache, the cache internal power, the cache bus
interface power and the internal power. This gives
the power consumption of the 386 SL CPU with
cache.




Table 5-4. Internal Power 


Frequency 
(MHz) 


Table 5-6. Cache Internal Power 


Frequency 
(MHz) 


As an example, the power consumed by the 386 SL 
CPU when it is used in a 20 MHz system with 8 
memory chips and 2 fully loaded ISA bus expansion 
slots will be calculated. The system voltage is as- 
sumed to be 5V. 


From Figure 5-2, the power consumed by the ISA
expansion bus interface is found to be 15 mW (the
total capacitance of all the pins of a fully loaded AT-bus
slot is 1396.25 pF). For a system with no cache,
the power consumed by the memory bus for 8 chips
is about 140 mW from Figure 5-3a. The internal power
at 20 MHz is 1758.0 mW from Table 5-4. The
power consumed by the 386 SL CPU is the sum of the
ISA bus interface power, the memory bus power and
the internal power. The total power consumed by the
386 SL CPU for this system is 15 + 140 + 1758.0 = 1913 mW.


For a system with cache, the ISA bus interface power
is 15 mW as previously determined. The memory
bus interface power, determined from Figure 5-3b,
is found to be 60 mW. The internal power remains
1758.0 mW. The cache bus power is read off from
Table 5-5 to be 30.5 mW and the cache internal
power from Table 5-6 is 650 mW. Hence, in this
system, the 386 SL CPU consumes a total of
2513.5 mW.
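
The arithmetic of the worked example can be reproduced with a short Python sketch. The figures are the ones quoted in the example above; the function and parameter names are hypothetical and only illustrate the summation described in steps 5 and 6 of the method.

```python
# Reproduce the 20 MHz, 5V worked example (all values in mW).
# Function and parameter names are illustrative, not from the datasheet.

def cpu_power_mw(isa_bus, memory_bus, internal, cache_bus=0.0, cache_internal=0.0):
    """Sum the power components of the 386 SL CPU.

    Leave the two cache terms at 0.0 for a system without cache (step 5);
    supply them for a system with cache (step 6).
    """
    return isa_bus + memory_bus + internal + cache_bus + cache_internal


# No cache: ISA bus (Figure 5-2), memory bus for 8 chips (Figure 5-3a),
# internal power at 20 MHz (Table 5-4).
no_cache_mw = cpu_power_mw(isa_bus=15.0, memory_bus=140.0, internal=1758.0)

# With cache: memory bus from Figure 5-3b, cache bus power from Table 5-5,
# cache internal power from Table 5-6.
with_cache_mw = cpu_power_mw(isa_bus=15.0, memory_bus=60.0, internal=1758.0,
                             cache_bus=30.5, cache_internal=650.0)

print("without cache:", no_cache_mw, "mW")
print("with cache:   ", with_cache_mw, "mW")
```

Both sums match the totals stated in the example, 1913 mW without cache and 2513.5 mW with cache.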
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82360SL D.C. Specifications - 
Functional operating range: Vcc = 5.0V ±10%, Tcase = 0°C to 90°C.


Table 5-7. 82360SL D.C. Specifications


Output Low Voltage 


3 | lo. = 24 ma(4) 
Output High Voltage 


log = —3.3 mA(4) 


| ov | Io. = 8 mA) 
| ov | lon = -2ma®) 


V 
Vv 


lo. = 24 mA) 
lo. = 12 mA(6) 
lo. = 16 mA(11) 


lo = 12 ma”) 
V 


D.C. Specifications for Power-Down Mode 


IBATT Battery Supply Current 100 pA VBaTT = 5V 
a | . pA VBaTT = 3.0V(8) 


NOTES: 

1. No pullup or pulldown.

2. For inputs: COMX1, CX1, RTCX1

3. For outputs: LPTD7:0

4. For outputs: OSC, AEN, SA16:0, LA23:17, MEMR#, MEMW#, IOR#, IOW#, SMEMW#, SMEMR#, SBHE#, TC,
SD7:0, XD7, HD7, RESETDRV.

5. OWS #, IOCHRDY, REFRESH#.

6. LPTSTROBE#, LPTAFD, LPTINIT#, LPTSLCTIN#, LPTDIR.

7. For all other outputs of the module.

8. Measured at Vcc = 0V, VBATT = 3.0V, 32 kHz RTC clock with input rise time and fall time, tr = tf < 50 ns.

9. RTC clock at 32 kHz; Timer Clock, Serial clock and SYSCLK stopped; Vcc = 5.5V and RTCVCC = 5.5V, CL = 50 pF
with outputs unloaded.

10. Icc tests at maximum frequency with no resistive loads on the outputs.

11. REFRESH #
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6.0 SL SuperSet TIMING SPECIFICATIONS 


386 SL CPU A.C. Specifications 
Symbol Alt Symbol 
General: 20 MHz 


Ct 102a 
Ct 103a 


Parameter 


EFI Period 

EFI High Time at 2V 
EFI High Time at 3.7V 
EFI Low Time at 2V 


Tre] 


Ct 103b Qt3b | EFI Low Time at 0.8V ns 
Ct 104 Qt4 EFI Fall Time from 
(Vcc — 0.8V) to 0.8V ns 


EFI Rise Time 
0.8V to (Vcc — 0.8V) 


PWRGOOD Minimum Puls 
PWRGOOD Setup to [ ef . 


Pct | 

Ctitta 
eerie | 
| ct112b | 
) ct113a_ 
| ctit4a 
| Ct tt4b 


QNT3 


aaah 
q 


1 EFI 


QNT3 


EFI 
QNT3 


NO 
i) 




S"STAT # Setup to EFI 


| Ct114> | Qte4b g5S__STAT# Hold Time | 
Ct115          ONCE# Minimum Pulse Width      35
Ct115b  Qt25b  ONCE# Hold Time


Ct116a  Nt2a   SMI# Setup to EFI


Ct116b  Nt2b   SMI# Hold Time


Ct117a  Xt1a   INTR Setup to EFI
Ct117b  Xt1b   INTR Hold Time


Ct118a  Xt2a   NMI Setup to EFI


Ct118b  Xt2b   NMI Hold Time


NOTES: 

QNT1. EFI maximum period is specified only for the case where a MCP (Math co-processor) is present in the system. 
NPXCLK period, high and low time are tested at 2V. All other parameters are guaranteed by design characterization. 

QNT3. A20GATE, CPURESET, INTR, NMI, ONCE#, PWRGOOD, SMI#, STPCLK# and SUS__STAT# are asynchronous
inputs to the 386 SL CPU. Setup and hold times with respect to the EFI input are provided for test purposes only. The 
minimum setup and hold times are specified for valid recognition at a specific clock edge. The minimum valid pulse width 
can be extrapolated from the setup and hold times with respect to EFI. 


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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued)

Symbol  Alt Symbol  Parameter  Min
ISA-Bus Clock Timings 
Ct 201 
Ct 202 
Ct 203 


Ct 204 


Qt32 ISACLK2 High Time at 2V
Qt33 ISACLK2 Low Time at 2V   28   32.5


Qt34 ISACLK2 Fall Time from
(Vcc - 0.8V) to 0.8V


Qt35 ISACLK2 Rise Time from
0.8V to (Vcc - 0.8V)


Qt36 ISACLK2 to SYSCLK Delay,
Falling to Rising Edge


Qt41 SYSCLK Period
Qt42 SYSCLK High Time at 2V
Qt43 SYSCLK Low Time at a6 
Qt44 SYSCLK Fall Time fre 


Ct 205 
Ct 206 


Ct 211 
Ct 212 
Ct 213 
Ct 214 


Ct215 | Qt45 


Ct272a | Nttla 
Ct272b | Nt1b 
ISA-Bus Timings 
Ct 221 
Ct222 | GB BAWE Inactive Delay from Tc phi 1 Low 


Ct 223 G9 LA17-—23 Valid Delay from Tc or Tc phi 2 Low 
10  Invali rc phi 


G7 1 ive Delay from Tg phi 2 Low — 


4 
% x 


Ct 224 G LA17-—23 Invalid Delay from Tc phi 2 Low 


NOTES: . 

QNT4. ISACLK2 minimum period, high and low times are specified with ISACLK2 input = 16 MHz and SYSCLK output =
8 MHz. The ISACLK2 input specifications are provided to ensure that the SYSCLK output, period, minimum high and low
time, rise and fall time and ISACLK2 to SYSCLK skew are met.

QNT5. SYSCLK capacitive loading is 20 pF minimum and 120 pF maximum. SYSCLK period, low and high time are tested at
1.5V thresholds. All other parameters are guaranteed by design characterization.
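
The clock relationship stated in QNT4 can be turned into periods with a few lines of Python. This is an illustrative sketch using only the nominal frequencies quoted in the note (ISACLK2 = 16 MHz, SYSCLK = 8 MHz); the function name is hypothetical, and the minimum and maximum limits from the timing table are not modelled.

```python
# Nominal clock periods implied by QNT4.
# Only the two frequencies come from the note; the helper name is illustrative.

def period_ns(frequency_mhz):
    """Return the clock period in nanoseconds for a frequency in MHz."""
    if frequency_mhz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / frequency_mhz


ISACLK2_MHZ = 16.0  # ISACLK2 input frequency from QNT4
SYSCLK_MHZ = 8.0    # SYSCLK output frequency from QNT4

isaclk2_period = period_ns(ISACLK2_MHZ)
sysclk_period = period_ns(SYSCLK_MHZ)

print("ISACLK2 period:", isaclk2_period, "ns")
print("SYSCLK period: ", sysclk_period, "ns")
print("ISACLK2 periods per SYSCLK period:", sysclk_period / isaclk2_period)
```

Each nominal SYSCLK period therefore spans two ISACLK2 periods, which is the basis for referencing the ISA-bus timings to ISACLK2 edges.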
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued) 


Symbol  Alt Symbol  Parameter  Min  Max  Unit  Notes


ISA-Bus Timings (Continued) 


Ct225  G13   SA1-19 Valid Delay from Ts phi 2 Low
Ct226  G13a  SA0-19, SBHE#, LA17-23 Valid Setup
             to phi 1 Low (External Master)


Ct227        SA1-19 Invalid Delay from Ts phi 2 Low



Ct228  G15  SA0, SBHE# Valid Delay from Ts phi 2 Low


SA0, SBHE# Float Delay from Ts phi 1
MEMR#, MEMW# Active from Tc phi 1 Low


(16-bit Memory Cycles)



(External Master) 
HALT # Valid Delay from phi 


Command Inactive to Float (sec 
TI phi 1 Low (External MaSte 



Command Active Qetay'tr 45 


| dWpactives {Rm Teoc phi 1 Low 
Vl W # and HALT #) 


Ges 
G25 Setup to Tc phi 2 Low © 
G27  ZEROWS# Setup to Tc phi 2 Low
     ZEROWS# Hold from Tc phi 2 Low



NT6, NT12 
NT6, NT12 : 


Hold from Tc phi 1 Low 



NT7 


NT7, NT9 
NT7, NT9 


(External Master Cycles) 


SDO-15 Invalid Delay from Teoc phi 1 Low 


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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued)

ISA-Bus Timings (Continued)
Ct252  G36   IOCHRDY Hold from Tc phi 2 Low   ns   NT11
Ct255  G39   NMI/SMI# Setup to Tx phi 2 Low


Ct256  G40   NMI/SMI# Hold from Tx phi 2 Low
Ct257  G41   INTR Setup to Tx phi 2 Low


Ct261  G45   HRQ Setup to Tc or Ti phi 2 Low
Ct262        HRQ Hold from Th phi 2 Low

Ct263  G48a  HLDA Active Delay from Th phi 1 Low
Ct264  G48b  HLDA Inactive Delay from Th phi 1 Low
Ct265  G49   DMA8/16# Setup to Th phi 2 Low
Ct266  G50   MASTER# Setup to Th phi
ct267 | G51 | ‘ 
ct268 | G53c | VGACS# Active D 
ct269 | Gs53d VGACS # Inactive! 


|Ct270 | G54a | ROMCSO 
from Ts phi 4 


Ct271 | G54b 


. 


Ct272 |G54c —S*|| RON 


ct273 | G54d ROMCSO #/CMUX14# Inactive Delay 


from LA[23:17] 


G55a  SMRAMCS# Active Delay from Ts phi 2 Low


G55b  SMRAMCS# Inactive Delay
      from Ts or Ti phi 2 Low


Ct271' | G56 TURBO Setup 


Ct 274 
Ct 275 
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued) 


Symbol  Alt Symbol  Parameter  Min  Max  Unit  Notes


| ISA-Bus Timings (Continued) 


Ct276  SD15-0 Valid Delay from IOCHRDY Asserted   48   ns
       (External Master)
Ct277  SD15-0 Data Invalid Delay from MEMR# Inactive   7   ns
       (External Master)
Ct278  SD15-0 Data Invalid Delay from IOR# Inactive   7
       (External Master)
Ct279  SD15-0 Data Setup to MEMW# Active
       (External Master)
Ct280  SD15-0 Data Hold from MEMW# Inactive   0
       (External Master)


SD15-0 Setup to IOW# Active
(External Master)


BALE Active Delay from Th phi 1 Low
(External Master)


(External Master)


SA15-0 Hold after IOR# or IOW# Inactive   15
(External Master)
IOCS16# Active Delay from Valid Address
(External Master)

Ct291  SD15-0 Delay from IOR# Active
       (External Master Read from CPU I/O Ports)

SD15-0 Valid Delay from phi 2 Low
(External Master Read from On Board Memory)
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


NOTES: | 

NT1. The ISA bus timings are specified in a synchronous manner with respect to the ISACLK2 input. ISACLK2 input is
16 MHz, which is twice the frequency of the SYSCLK output. Each SYSCLK period represents one T-state and each T-state
corresponds to either the beginning of a bus cycle (Ts: Send Status), middle of a bus cycle (Tc: execute
command), end of cycle (Teoc), hold (Th) or idle (Ti). T-states (Ts, Tc, Teoc, Th, Ti) are comprised of two ISACLK2
periods (Phi 1 and Phi 2). The ISACLK2 periods or phases (Phi 1 and Phi 2) falling or rising edge are used to
reference the synchronous ISA parameters. ISACLK2 Phi 1 falling edge leads SYSCLK rising edge, ISACLK2 Phi 2
falling edge leads SYSCLK falling edge.

NT2. Teoc represents the End of Cycle. The falling edge of ISACLK2 Phiteu : | 

NT3. After HLDA (Hold Acknowledge) is de-asserted, the 386 SL CPU drives the address bus with the previous address that
was latched prior to the beginning of the HLDA cycle. The parameter refers to this latched address. The latched
address may or may not be valid for the next CPU bus cycle. At the next CPU bus cycle on an external bus
a valid address will be placed on the address bus.

NT4. INTR, NMI, SMI#, and TURBO are asynchronous inputs with respect to ISACLK2 and SYSCLK. These are input
signals to the 386 SL CPU. Setup and hold times with respect to the ISACLK2 input are provided for reference. The
minimum setup and hold times are specified for valid recognition at a specific clock edge in other timing diagrams with
the EFI clock input.

NTS5. The setup time is required to ensure that by¥¢ 
device on an odd byte address boundaryg” ¢* 

NT6. MEMCS16# is sampled on the falling. age" 

NT7. 1|OCS16# and ZEROWS# are sar 

NT8. HALT timing is identical to a 16-Rit fe 
ed. Se. BS 

NT9. ZEROWS# and IOCHRDY ¢§ * "LOW during the same bus cycle. 

NT10. SDO-15 read data is sag t | SACLK2 Phi 2 at Teoc (End of Cycle). 

NT11. IOCHRDY de-asserted (LOV ampled on the falling edge of ISACLK2 Phi 2 when Command is active (LOW). 

De-asserting IOCHRDY # addgtipSserr 


« 


bus cycle except that no BALE or Status Signal is assert- 


NT12. ROM read bus cycles are Shake 
strapping pin ROM16/8# is sampled to determine if the ROM read is an 8-bit or 16-bit memory read. Additionally 
ROMCS0# and/or ROMCS1# are asserted during a ROM read.

NT13. DMA bus cycles are not supported to On-board I/O ports. AEN is HIGH during MASTER, DMA and access to the
configuration registers.

NT14. Byte swap timing for 8-bit DMA bus cycles is identical to that of an external master.

NT15. During DMA cycles the 386 SL CPU drives SA17-19 with the value of LA17-19 while HLDA is active. During other
Slave cycles (i.e., Refresh and External Master) the 386 SL CPU does not drive SA17-19.

NT16. During the INTA# cycle, SD8-15 should not change state. During the first INTA# pulse SD0-15 are ignored. The
second INTA# pulse in an INTA# bus cycle indicates a bus cycle that is similar to an 8-bit I/O read in which the
interrupt vector is read from SD0-7.

NT17. The 8259 INTA# minimum pulse width is 160 ns.

NT18. ROMCS0#, ROMCS1# and SMRAMCS# are specified with respect to ISACLK2 when the CPU is the bus master.

NT19. ROMCS0#, ROMCS1# and SMRAMCS# are specified with respect to valid address when an external master controls
the bus.
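
The clock relationships in note NT1 can be turned into simple arithmetic. The following is a minimal sketch, not part of the specification: the names `PHASE_NS`, `T_STATE_NS` and `cycle_time_ns` are hypothetical, and the cycle lengths used in the example (6, 3 and 2 SYSCLKs) are the ones quoted in the ISA bus figure captions of section 7.1.

```python
# Illustrative only: T-state arithmetic implied by note NT1.
ISACLK2_HZ = 16_000_000        # ISACLK2 input frequency (NT1)
SYSCLK_HZ = ISACLK2_HZ // 2    # SYSCLK runs at half the ISACLK2 frequency (NT1)

PHASE_NS = 1e9 / ISACLK2_HZ    # one ISACLK2 period (Phi 1 or Phi 2)
T_STATE_NS = 1e9 / SYSCLK_HZ   # one SYSCLK period, i.e. one T-state


def cycle_time_ns(sysclks: int) -> float:
    """Duration in ns of a bus cycle that lasts `sysclks` SYSCLK periods."""
    if sysclks <= 0:
        raise ValueError("a bus cycle lasts at least one SYSCLK")
    return sysclks * T_STATE_NS


if __name__ == "__main__":
    print(f"ISACLK2 phase: {PHASE_NS} ns, T-state: {T_STATE_NS} ns")
    # Cycle lengths taken from the ISA bus figure captions.
    for label, n in [("8-bit standard", 6), ("16-bit standard", 3), ("16-bit ZEROWS#", 2)]:
        print(f"{label}: {n} SYSCLKs = {cycle_time_ns(n)} ns")
```

Each T-state therefore spans two ISACLK2 phases, and every additional wait state adds one more T-state to the cycle time.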
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued) 


Symbol | Alt Symbol | Parameter | Min | Max | Unit | Notes


PI-Bus Timings: 20 MHz
Ct 301 Min. Chip Select and Command Setup to 30 ns 
PSTART # Active | 
Ct 302 | Min. Chip Select and Command Hold 50 ns 
from PSTART # Active | 
Max. PRDY # Hold Time after PCMD# Inactive | | 72. 
Min. Read Data Setup Time to PCMD # Inactive 


| Ct 303 
Ct 304 
Ct 305 
Ct 306 
Ct 307 


Min. Read Data Hold Time from PCMD # Inactive 200 | 
Min. PRDY # Active Delay from PSTART # Active 


Maximum Write Data Valid Delay from 
PSTART # Active 


Min. Write Data Invalid Delay from 
PCMD # Inactive 


Min Address Setup Time to PSTART #% 
Min Address Hold Time from PST ART / 
PSTART # Pulse Width . 


Min Delay from PSTART 
PCMD# Active 


Ct 308 


Ct 309 
Ct 310 
Ct 311 
Ct 312 


~~ 
SHy 
< 
S 


Ct 313 


Ct 314 


> = | 


Ct 321 


ATCLK2 Sync. 
ATCLK2 Sync. 
ATCLK2 Sync. 
ATCLK2 Sync. 
ATCLK2 Sync. 
ATCLK2 Sync. 
ATCLK2 Sync. 


his 


tis 

tis 

t8s 
Ct327a 
ores [ts | SAlta@]VaidDely 
forsee [ws [SAlt79]vaidDeey | 

fies | tali72alVaidDewy | 
1328 [ts |SBHE#,SAOVald Delay | 
orsse [ts [S10t5|VaidDeay | 


Drs 


oO 
N 
B 


La Tri-Stated 
Lad 


Tri-Stated 
Tri-Stated 
Tri-Stated 
Tri-Stated 
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- 
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Nn 
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~- 
” 




6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued) 


Symbol | Alt Symbol | Parameter | Min | Max | Unit | Notes


CPU Master (Continued) 


Ct 341 PW/R#, PM/IO#, VGACS# Valid Delay
from EFI T1 phi 1 High
Ct 342 SA[19:0], LA[23:17], SBHE# Valid Delay
from EFI T1 phi 1 High
Ct 343a PSTART# Active (LOW) Delay from EFI
T2 phi High
Ct 343b 
Ct344a 


Ct344b | 
Ct 345a 


eee Active (LOW) Delg 
T2 phi 1 High 


NOTES: 

1. VGACS#, FLSHDCS#, PW/R#, PM/IO# and Addresses change for each subsequent read or write.

2. PSTART# indicates a new cycle in which address, status and chip selects are valid before PSTART# is asserted Low.
PRDY# terminates each bus cycle and a new PSTART# is driven if a new address and status signals are available.

3. EFI = 50 MHz, Internal CPU Phase CLK = 25 MHz.

4. ISACLK2 = 16 MHz.

5. Maximum parameters are based on worst case condition of Vcc = 4.2V, 120°C.

6. Minimum parameters are based on best case condition of Vcc = 5.6V, 10°C.

7. PRDY# setup worst case condition is -4 ns; 0 ns is specified for test purposes.


SD[15:0] Valid Delay from EFI T2 phi 2 High
(CPU write to PI-Bus Slave)
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued)


raat [| 
Foo | 


ite) 


Ct 421 Ht 1 
Ct 422 Ht12 


Ct 423 


Ct 424 


Potaaéa | Htiea | BUSY, PEREG, ERROR® Soup 


NOTES: 

QNT1. EFI maximum period is specified only for the case where a MCP (Math Co-processor) is present in the system.
NPXCLK period, high and low time at 2V are tested. All other parameters are guaranteed by design characterization.

QNT2. NPXCLK, NPXRESET Loading: 30 pF. (Timing specified here is for in-system loading, Timing Spec with Tester Load- 
ing is TBD.) 

HNT1. CA Loading: min 10 pF, max 50 pF. 

HNT4. CD Loading: min 10 pF, max 35 pF. (Timing specified here is for in-system loading, Timing Spec with Tester Load- 
ing is TBD.) 

HNT5. NPXADS#, NPXW/R# Loading: 25 pF. (Timing specified here is for in-system loading, Timing Spec with Tester 
Loading is TBD.) 
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


a eee 


386 SL CPU A.C. Specifications (Continued) 


r symbot | Aft Symbol 


Math Coprocessor Timings: 20 MHz (Continued) | 


ct441 | ati NPXCLK Period 
Ct442a NPXCLK High Time 2V | 6 | 


Parameter 


Qt12b NPXCLK High Time 3.7V. 
NPXCLK Low Time 2V | 


Ct443b | at13b NPXCLK Low Time 0.8V Tae 
ct444 | ati4 NPXCLK Fall Time 
(Voc — 0.8V) to 0.8V 
ct445 | atts NPXCLK Rise Time 
0.8V to (Vcc — 0.8V) | 
ct446 | at26 NPXCLK - Delay from RESET - NPXCLK 
SRAM Mode: 20 MHz Timings “ee 


7 tCSW 

| ctsos | tcsw 

[esos [usw 

WE@ABive Pulse Width 

Active Pulse Width 

0 
DS 
DS 
DH 
DH 


ZS . or . = 


Access Time from OE # 


Access Time from OE # 
CE # Setup to OE# Acti 
CE# Setup to OE# A 


3 Wait State 


await state | 


FESS 


ae ~ on} pa” 
ole o) O10 


t 
t 
t | ery Ti | 
t | i | 


Ct 517 htDH | Write Data Hold from WE # Inactive 
Ct 518 Write Data Hold from WE # Inactive 


io) 
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued) 


Symbol | Alt Symbol | Parameter | Min | Max | Unit | Notes


SRAM Mode: 20 MHz Timings (Continued) 


Ct 520 DIR Setup to OE# Active 3 Wait State


Ct 521 DEN# Hold from OE# Inactive 2 Wait State


DEN# Hold from OE# Inactive 0


ct523 | tOSD 2 Wait State 
Ct 526 > 
ct 529 
Ct 535 | 
Ct537 ddr Hold from LE Inactive 
Ct 538 fpber Addr Hold from LE inactive 
Ct 541 Addr Valid Delay from LE Inactive 


3 Wait State 
2 Wait State 
3 Wait State 
2 Wait State 
3 Wait State 
2 Wait State 
3 Wait State 
2 Wait State 
3 Wait State 
2 Wait State 
3 Wait State 
2 Wait State 
3 Wait State 
ns .| 2 Wait State 
ns 3 Wait State 
2 Wait State 
ns 3 Wait State 
2 Wait State 
3 Wait State | 


Zs 
Z a 
iy ea 
ze Le 
wa Ys 


50 
70. 


i: 
Zs 
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued) 


Symbol | Alt Symbol Parameter | Min | Max | Unit | Notes | 
cteoz | tASR” F2 Mode 


Ct603 | tASR Row Addr Setup to RAS # Active ns P1 Mode 


Ct605 | tRAH Row Addr Hold from RAS# Active | F1 Mode 
cteos | tRAH Row Addr Hold from RAS # Active 


P1 Mode 

F1 Mode 
2Mode | 

P1 Mode 


F2 Mode 


Ct 611 | tASC | Col Addr Setup to CAS# Active
Ct 613 | tCAH | Col Addr Hold from CAS# Active
Ct 614 | tCAH | Col Addr Hold from CAS# Active
Ct 615 | tCAH | Col Addr Hold from CAS# Active
Ct 617 | tRCD | RAS# to CAS# Delay
Ct 618 | tRCD | RAS# to CAS# Delay
Ct 619 | tRCD | RAS# to CAS# Delay
Ct 621 | tCSH | CAS# Hold Time
Ct622 ‘ine 


tCSH CAS# Hold, 
Ct 623 


tCSH 
cte25 | tRSH 
ct6ée26 | tRSH 
Ct627 | tRSH 
ctez9 | twos * up to CAS # Active (Write) 
Ct630 . 


tWCS | ‘¥ Setup to CAS # Active (Write) 
Ct631 


wos 
creae | WH 
Ct 635 , ‘ | 


—s —h —h 


nl a 
eam, 
= 
° 
os 
@ 


P1 Mode 
F1 Mode 
F2 Mode 
P1 Mode 
F1 Mode — 
F2 Mode 
7 Pt Mode 
F1 Mode | 
F2 Mode 
-P1 Mode 
F1 Mode 
F2 Mode 
P1 Mode 
F1 Mode 
F2 Mode 
P1 Mode 
F1 Mode 
F2 Mode 
P1 Mode 
F1 Mode 
F2 Mode 
P1 Mode 
F1 Mode 


an 
De) 
<= 
° 
jor 
o) 


wolwolala Si | 


20 
tWCH | WE# Hold from CAS# Active (Write)
Ct 637 | tRCS | WE# Setup to CAS# Active (Read)
Ct 638 | tRCS | WE# Setup to CAS# Active (Read)
Ct 639 | tRCS | WE# Setup to CAS# Active (Read)
Ct 641 | tRCH | WE# Hold from CAS# Inactive (Read)
Ct 642 | tRCH | WE# Hold from CAS# Inactive (Read)
Ct 643 | tRCH | WE# Hold from CAS# Inactive (Read)
Ct 645 | tWDS | Write Data Setup to CAS# Active
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386 SL CPU A.C. Specifications (Continued) 


Symbol | Alt Symbol | Parameter | Min | Max | Unit | Notes


oO 
Tl 


Ct 657 tCAC F1 Mode 
Ct 658 tCAC F2 Mode 


cté63 | tRDH P1 Mode | 
Ct 666 F2 Mode 
ctee7 | tRAS —s|§- RAS#A P1 Mode 

| Cte6g ; F1 Mode 
Ct 670 F2 Mode 


=) 


=) 


Ct 655 


Access Time from CAS # Active 
Access Time from CAS # Active 


Access Time from CAS # Active 
Read Data Hold from CAS # 


NiwWl/MmMlIyNlolsS 


| ct67t Ye PutgeWidth ns | P1 Mode 
| Ct673 | E@recharge Pulse Width F1 Mode 
Ct 674 | tRP Precharge Pulse Width F2 Mode 


Ct 675 tRP RAS # Precharge Pulse Width 


| Ct677 tCP CAS # Precharge Pulse Width 
Ct678 | tCP | CAS# Precharge Pulse Width 
cP 


P1 Mode 
F1 Mode 
F2 Mode 
| P1 Mode 
F1 Mode 
| F2 Mode 
P1 Mode 
F1 Mode 
2 Mode 
1 Mode 
1 Mode 
F2 Mode 


— | 
o1 | O1 


he 
oy 
2% 
fs 
1 


Ct679 | t CAS# Precharge Pulse Width 

= : 
= 
= 


Ct 681 
Ct 682 
Ct 683 
Ct 685 
Ct 686 
Ct 687 
— Ct689 
Ct 690 


— 
on 


N 


(PAW | PAR&# Hold rom CAS Activ (Writ) | 
Pa 2 
Pa 


tPVR PARx# Valid from CAS # Active (Read) | 27 


n 


ae) 


NO | NM 
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.6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


386 SL CPU A.C. Specifications (Continued) 


Parameter 


DRAM Mode: 20 MHz Timings (Continued) 


Ct6ot PARx# Valid from CAS # Active (Read) 


Ct693. | tPHR PARx# Hold from CAS # Inactive (Reg 
Ct694 | tPHR PARx# Hold from CAS # Inactive 


Ct 695 tPHR 


Parity Error 


ovr | ED 


ct705 | twRP | Ree Holathern & 

ct7os | tRDS. | RAgeo LK 
2AM DMA/Master) -_ 

- Ct 707 tADS - _ Address Valid Delay from SYSCLK 
(DRAM DMA/Master) 
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


fos] 


Symbol 


— 
-- 
ok 


+ | oe 
G | Ph 


It4 
tS 
It5a 


it6a - 


— 
N 


it8a 
| It8b 


lt9a 
lt10a 
It10b 
t11 
It14 
It15 


Iti5a 
Iti6 


Iti6a 
It17 
It18 
Iti9 
It20 
It20a 
1t21 
It22 


lt22a 


It23 


Alt Symbol 


82360SL I/O Timing Specifications Summary


(During Resume after Suspend) 

A20GATE Active (HIGH) Delay from SYSCLK & 
SYSCLK to KBDCLK Delay 
RC#/PERR#/IOCHCK# Pulse Width 


RC#/PERR #/IOCHCK# Setup to 
SYSCLK Falling Edge 


CPURESET Active (HIGH) from S¥SQ.8 
NMI Active (HIGH) from SYSGLK 

NMI Inactive from IOW # Agay 
‘RTCRESET # Pulse Width” 


DMA8/16# Inactive Delay from SYSCLK 

(4 MHz DMACLKk) 
DMA8/16# Inactive Delay from SYSCLK Low 

(8 MHz DMACLKk) 


AEN Active from HLDA Active 
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


0 


¢ 


82360SL I/O Timing Specifications Summary (Continued)


Symbol | Alt Symbol | Parameter | Min | Max | Unit | Notes
It24 ae AEN Inactive Delay from HLDA Inactive | | 35 | ns | | 
2a AEN Inactive from SYSCLK ee eee a 
25 | ~~ _| SA15:0, SBHE # Valid Delay from SYSCLK }10 | 100} ns | 
It26 SA16 (Only if DMA8/16# = 0) SA15:0, | 

SHBE # Valid Output Hold from SYSCLK — | 
It26a SA16 (Only if DMA 8/16# = 1), 10 . 
LA17:23 Valid Output Hold from 
lIOR #/IOW #/MEMR #/MEMW # Output 
It26f SA16:0, LA17:23, SBHE # Float Delay | 
from SYSCLK | x 
It27 DACKx# Active Delay from SYSCLK a 75 
(4 MHz DMACLK) - 
It27a DACKx# Active Delay from SYSCLK Low aw 75 
(8 MHz DMACLK) . | a 
It28 DACKx# Inactive Delay from SYS 75 
| (4MHzDMACLK) ~~ 
DACKx# Inactive Delay fro 75 
(8 MHz DMACLK) 
IOR #/IOW#/MEMW Je ‘| 75 
Inactive from SYSCIst , * | 
so | | | 75 {ns | 
30a | | | 75 | ns | 
tet | ! Re ae 
It32a Rput Inactive Delay from SYSCLK | [75] ns} | 
Ke . {| [eins | 
134 T/C f§ttive Delay from SYSCLK | | 85 | ns || 
| It35 TIM2CLK2 Period 125] |ns| | 


It36 
It37 
It38 
It39 
It40 


TTimacikatowTime——SSSSSSCSCS~*dS 
TTiMecik2High Timed 


or}; on 

oy, on 
PO 1 DO 
an]; ao 


a 


a 
Ol 
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82360SL I/O Timing Specifications Summary (Continued) 


ce) 


It110. 


— 
| | TIM2GAT2 Low Pulse Width 
TIM2GAT2 Setup to TIM2CLK2 | 


TIM2GAT2 Hoid from TIM2CLK2 
TIM2OUT2 from TIM2CLK2 High to Low 


aoe 


Be i iG ¢ 
WOE 4 2 
io 
te ; 
ie op 
DD 
on 
hy 
i 
Z 14 
Le EE y 
he B f 
e Be 
Yi 
4 i, a; 
7 Ee y , 
? i 4 iy. 
4 fs ita 
mi. 029 oe 
bf i, Z 
° 
fi . 
yy 
< Ye, 
% 
& 
Me " oe 
z Se og 
Los * 
wy 


SPKR Active Delay from TIM2GAT2 
(When EXTAUD is Set) | 


KBDCLK Period (2 MHz) 7 | 
KBDCLK High Time (4 MHz) | 


ARR, 
O17 O1/ MO 
— 
— 
oO 


TIM20UT2 from TIM2GAT2 High to Low 


— 
— 
© 


| : 
ie) 
2) 


REFRESH # Active to MEMR# Output Active ; 150 
Address Valid to MEMR # Active 40 


ooh, 
i 
Le ty 
" 2 
Ly 
ye 
oe Za 


ae 
— | 
O;oO 
O|oO 


—- 
oO 
O 


IOCHRDY Output from SYSCLK 


Delay from |OW # to Modem Output | 
(RTS#, DTR#) ; 
KBDCLK Period (8 MHz) | 
KBDCLK Period (4 MHz) 2 


on 
oO; 0 
oO;}o 


KBDCLK High Time (8 MHz) 


oO; + 
ao} © 


KBDCLK High Time (2 MHz) 


N 
=) 
© 
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82360SL 1/0 Timing Specifications Summary (Continued) | 


KBDCLK Low Time (4 MHz) _ : 95 


It117 


INTR Output Delay from IRQ1, 6, 10: 11, 12, 14 
ERROR # 


a ADS ses 


= 
DD 
© 
> 
pe] 
Q 
=, 
< 
Oo 
~- 
Oo 
ae 
r~ 
O 
> 
=| 
m 
Q 
rant 
: < 
4) 
oOo; 
wo ;O 


a ro) re) ND ; = 


a iN 
n ro) 


st 

— 

NO 

i) 
ee 


- G — 
ol © oi 


It200 
1201 
1t202 
t203 
1t204 
It204f 
1t205 
t206 


BAS 


11207 AEN Active from SYSCLK during _ | 
| | Indexed I/O Writes 


oO Oo}; — 
io) hp | o1 
ie) 


— 
wo 


8 
© 


XD7 Output Valid from IOW # Active 


t209 
[t210 


lOW # to EXTRTCAS 


palo 
© 
BE 
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


82360SL I/O Timing Specifications Summary (Continued) 


XD7 Output Hold from IOW # Inactive 
XD7 Output Float from lOW # Inactive 


EXTRTCRW #/EXTRTCDS Active from 
Command Active | 

11213 EXTRTCRW #/ EXTRTCDS Hold from 
Command Inactive 


Symbol 
It211 
It211f 
1t212 


1?) 


1t214 XDEN # Output Delay from IOR#/IOW#, 
MEMR #¥ Inputs 


XDEN# Output Delay from DACK2 Outp) 
XDEN # Output Active from XDIR Oujgut 


XDIR Output Inactive from XDEN# 
Output Inactive 


SD7 Read Data Output Dela 
XD7/HD7 Input 


ESS 
x 
Kc 
RS 


lt214a 
It215a 


It215b 10 


=) 
1¢7) 


oO1 ol Go ao; +s 
oe) o) oO on} O1 


oe 
Te 
Yi 
Le, 
Pe. 
GS 4 
oa 
BS 
ge 
2 He 
F: ,s 
tits 


It216 


=) a) 
” 


It217 
It217f 
It218 
It219 


iz 
on 
=) 
” 


it219a 


It22 & 
It221 
It221f 
It222 
It223 
It224 
It225 
It230 
It231 
It231f 


: put Hold from IOW # Inactive 
HD7 Output Float from IOW # Inactive 
HALT # Input Setup to SYSCLK Low 


[x07/H07 input Hold fom IOR#/MEMR® 


On or oO or 


© 
G ioe) 
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6.0 SL SuperSet TIMING SPECIFICATIONS (Continued) 


82360SL I/O Timing Specifications Summary (Continued) 


[aitsymbol | —___—_—Parameter—— | Min | 
Ra Output Active rom SYSCLK | 


HRQ Inactive from SYSCLK | 5 | 


—-1t805 
It311 
| It312 
It314 
It317 
It319 
It322 


‘ 


& 


It324 
It325 
It326 
It327 
It328 


It329 kPulse Width during Master# Cycle 
It330 | 


[DAGKKA IOMASTER# Delay | 0 


=) —s wa ‘ . 
EREG Active 


F 
SS 


Se 


oo oo ALS Pl 7 
5 | 
a 
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7.0 SL SuperSet TIMING DIAGRAMS 


7.1 386T™ SL CPU Timing Diagrams 


[Clock waveform diagrams: EFI, NPXCLK, ISACLK2 and SYSCLK, with reference levels of 3.7V, 2V, (Vcc - 0.8)V and 1.5V.]


Figure 7.1.2. Clocks 




7.1 386™ SL CPU Timing Diagrams (Continued) 


[Timing diagrams: NPXCLK, CA2, NPXW/R#, NPXADS#, CD15-CD0 and NPXRDY#.]
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Figure 7.1.4. 386 SL CPU Write to MCP 
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7.1 386™ SL CPU Timing Diagrams (Continued) 


[Timing diagram: NPXADS#, NPXRDY#, BUSY# and PEREQ during successive WRITE cycles, annotated with Notes 1 to 3.]

NOTES:

1. Instruction dependent.

2. PEREQ is an asynchronous input to the 386 SL CPU. Instruction dependent as to when it is asserted. 
3. Additional operand transfers. 

4. Memory read (operand) cycles not shown. 


Figure 7.1.5. MCP BUSY # and PEREQ Timings 


[Timing diagram: INTERNAL CLK2 and CPUCLK, with successive READ, WRITE and READ cycles.]


Figure 7.1.6. Cache Read/Write Hit Cycles 
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7.1 386T SL CPU Timing Diagrams (Continued) 


[Timing diagram: VGACS#, PW/R#, PM/IO#, SA(16:1), SA0, SBHE#, SA(19:17), PSTART#, PCMD#, PRDY# and SD(15:0).]


Figure 7.1.7. PI-Bus Timings 
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7.1 386™ SL CPU Timing Diagrams (Continued) 


[Timing diagram: EFI, PW/R#, PM/IO#, VGACS#, SA(16:1), LA(23:17), SA0, SBHE#, SA(19:17), PSTART#, PCMD#, PRDY# and SD(15:0).]


Figure 7.1.8a. PI-Bus Synchronous CPU Generated Cycles (Read)
7.1 386™ SL CPU Timing Diagrams (Continued)
Figure 7.1.8b. PI-Bus Synchronous CPU Generated Cycles (Write)
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7.1 386™ SL CPU Timing Diagrams (Continued) 


[Timing diagram: valid address, ROMCS1# and ROMCS0#.]


Figure 7.1.10. ISA Bus Slave Controller Generated Timings (ROMCS0#/CS1# with respect to
Figure 7.1.11. ISA Bus Master Controller Generated Timings (VGACS# with respect to Address)
Figure 7.1.12. ROMCS0, ROMCS1, SMRAMCS# Propagation Delays
Figure 7.1.13. ISA Bus 8-Bit Memory Read/Write Standard ISA BUS Cycle (6 SYSCLKs)
Figure 7.1.14. ISA Bus 8-Bit Memory Read/Write with ZEROWS# Asserted (3 SYSCLKs)
Figure 7.1.15. ISA Bus 8-Bit Memory Read/Write with IOCHRDY De-Asserted (Added Wait States)
Figure 7.1.16. ISA Bus 8-Bit I/O Read/Write Standard ISA BUS Cycle (6 SYSCLKs)
Figure 7.1.17. ISA Bus 8-Bit I/O Read/Write with ZEROWS# Asserted (3 SYSCLKs)
Figure 7.1.18. ISA Bus 8-Bit I/O Read/Write with IOCHRDY De-Asserted (Added Wait States)
Figure 7.1.19. ISA Bus 16-Bit Memory Read/Write Standard ISA BUS Cycle (3 SYSCLKs)
Figure 7.1.20. ISA Bus 16-Bit Memory Read/Write with ZEROWS# Asserted (2 SYSCLKs)
Figure 7.1.21. ISA Bus 16-Bit Memory Read/Write with IOCHRDY De-Asserted (Added Wait States)
Figure 7.1.22. ISA Bus 16-Bit I/O Read/Write Standard ISA BUS Cycle (3 SYSCLKs)
Figure 7.1.23. ISA Bus 16-Bit I/O Read/Write with IOCHRDY De-Asserted (Added Wait States)
Figure 7.1.24. ISA Bus Interrupt Acknowledge Bus Cycle
Figure 7.1.26. ISA Bus Controller Refresh Cycle
Figure 7.1.28. ISA Bus External Bus Master to Off-Board I/O Ports (No Byte-Swapping)
Figure 7.1.30. 386™ SL CPU Memory Controller Timings
(DRAM F1 Mode Timing Parameters)




7.1 386™ SL CPU Timing Diagrams (Continued) 


[Timing diagram: EFI, CPU ADDR + CTL, MA(10:0), DEN#, MD(15:0) write data, CE#, OE# and DIR over successive cycles.]


Figure 7.1.31. 386™ SL CPU Memory Controller Timings 
(SRAM Mode Timing Parameters; 2 Wait States) 
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Figure 7.1.32. 386™ SL CPU Memory Controller Timings 
(CAS# before RAS# Refresh Timings) 
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Figure 7.1.33. REFRESH, DMA/MASTER Timing Diagrams 
(Address Active Delay from SYSCLK)
Figure 7.1.34. REFRESH, DMA/MASTER Timing Diagrams
(RAS# Active Delay from SYSCLK)
Figure 7.1.35. PERROR Timing Diagram
(PERR# Active Delay from SYSCLK)
Figure 7.2.1. CPURESET, NMI, A20GATE and RC# Timings
Figure 7.2.2. Clock Timings
Figure 7.2.3. ISA Bus 8-Bit I/O Read/Write Default Bus Cycle (6 SYSCLKs)
Figure 7.2.4. ISA Bus 8-Bit I/O Read/Write Compressed Bus Cycle
Figure 7.2.5. DMA Controller Timings
Figure 7.2.6. Refresh Arbitration Timings
Figure 7.2.7. DMA Memory Write Timings
Figure 7.2.8. DMA Memory Read Timings
Figure 7.2.9. Bus Master Refresh Cycle Timings
Figure 7.2.10. ISA Bus Master Refresh Cycle with IOCHRDY Timings
Figure 7.2.11. X-Bus Control Signals—Memory Read Timings
Figure 7.2.12. X-Bus Control Signals—Memory Write Timings
Figure 7.2.13. X-Bus Control Signals—I/O Read Timings


[Timing diagram: SYSCLK, SA(15:0), SD7, IOW#, XD7, FLPCS#, C8042CS#, XDEN# and XDIR.]


Figure 7.2.14. X-Bus Control Signals—I/O Write Timings


7.2 82360SL Timing Diagrams (Continued) 


[Timing diagram: SYSCLK, SA(15:0) valid address, SD(7:0), IOW#, EXTRTCAS, EXTRTCDS, EXTRTCRW#, XDEN# and XDIR.]


Figure 7.2.15. I/O Port 70 Hex Write—External RTC Timings
Figure 7.2.16. I/O Port 71 Hex Read—External RTC Timings
Figure 7.2.17. I/O Port 71 Hex Write—External RTC Timings
Figure 7.2.19. I.D.E. Hard Disk Control Signals—I/O Write Timings
Figure 7.2.21. Interrupt Controller Timings
Figure 7.2.22. Programmable Interval Timer/Counter Timings


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Figure 7.2.23. RESETDRV, DMA8/16 #, Command Signals and IOCHRDY with Respect to SYSCLK 
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Figure 7.2.24. AEN and HALT with Respect to SYSCLK 
XDEN# and IOR#/IOW# with respect to LA17-23
Figure 7.2.25. System Power Management Control Signal Timings
8.0 CAPACITIVE DERATING
INFORMATION


In the timing diagrams shown in the previous section, all maximum timings specified are at a maximum value of
capacitive load tested on the signal pin. This maximum value is different for different pins and can be obtained
for each pin from the pin assignment table in section 2. The delay introduced to signal transitions at the maximum
specified load will be called the nominal delay. If, however, either a lighter or heavier capacitive load is
connected to a pin, signal delay will change. To allow the system designer to account for such loading differences,
capacitive derating curves have been provided in this section.


The derating curves for different pins depend on the internal buffers used. Nine derating curves are provided to
account for the various classes of internal buffers used with different delay characteristics. To use these
derating curves, follow the procedure outlined here.


1. From the pin assignment chart, find the letter in the column “Derating Curve” corresponding to the signal under
consideration.

2. In this section, find the derating curve of the correct type.

3. Calculate the capacitive loading on the signal under consideration.

4. Find this load point on the capacitive load axis of the derating curve.

5. Project a vertical line to the derating curve from the load point and draw a horizontal line from the point the
vertical line intersects the curve.

6. Estimate the amount of time from the nominal point to the point where the horizontal line meets the delay axis.
This is the derating value.

7. If the point where the horizontal meets the delay axis is above the nominal value, then this derating value
should be added to signal timings shown in the timing diagrams. If the horizontal meets the delay axis below the
nominal value, the derating value should be subtracted from the signal timings.

8. The derating curves shown can be used in identical manner for both rising and falling edges of the signal.
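
The graphical procedure above can also be sketched numerically. The following is a minimal illustration under stated assumptions, not data from this data sheet: the curve points in `EXAMPLE_CURVE` are invented, the function names `delay_at` and `derated_timing` are hypothetical, and straight-line interpolation between points read off a curve is only an approximation of the printed curve.

```python
from bisect import bisect_right

# Hypothetical derating curve as (load in pF, delay in ns) points.
# These numbers are made up; real values must be read from the derating
# curve whose letter matches the pin in the pin assignment chart.
EXAMPLE_CURVE = [(10.0, 2.0), (50.0, 5.0), (100.0, 8.0), (150.0, 11.0)]


def delay_at(curve, load_pf):
    """Steps 4-5: read the curve delay (ns) at load_pf by linear interpolation."""
    loads = [point[0] for point in curve]
    if not loads[0] <= load_pf <= loads[-1]:
        raise ValueError("load is outside the range covered by the curve")
    i = min(bisect_right(loads, load_pf), len(curve) - 1)
    (x0, y0), (x1, y1) = curve[i - 1], curve[i]
    return y0 + (y1 - y0) * (load_pf - x0) / (x1 - x0)


def derated_timing(curve, nominal_load_pf, actual_load_pf, spec_ns):
    """Steps 6-7: add (heavier load) or subtract (lighter load) the derating value."""
    derating = delay_at(curve, actual_load_pf) - delay_at(curve, nominal_load_pf)
    return spec_ns + derating


if __name__ == "__main__":
    # A 30 ns maximum specified at a 50 pF nominal load, re-evaluated at two loads.
    print(derated_timing(EXAMPLE_CURVE, 50.0, 75.0, 30.0))
    print(derated_timing(EXAMPLE_CURVE, 50.0, 10.0, 30.0))
```

The sign of the correction falls out of the subtraction: a load heavier than nominal gives a positive derating value, a lighter load a negative one, which matches step 7.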
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Capacitive Derating Curves (Continued) 


9.0 DAMPING RESISTOR 
REQUIREMENTS 


The SL SuperSet has powerful output buffers capable of directly driving large loads. These buffers are designed for
fast signal transition times and hence have low output impedance. Due to a mismatch between the output impedance of
the buffers and the characteristic impedance of the load (trace capacitance and the total number of devices),
voltage overshoot and ringing can occur at signal transitions. By matching the output impedance with the
characteristic input impedance and avoiding long trace lengths, the system designer can minimize the transmission
line reflections and ringing.
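
As a rough illustration of the impedance matching described above, a common rule of thumb (not taken from this data sheet) chooses a series resistor so that the buffer output impedance plus the resistor approximately equals the characteristic impedance of the trace. The sketch below assumes that rule; the function name and the example impedances are hypothetical, and actual buffer impedances and resistor values should come from the System Design Guide and board measurements.

```python
def series_damping_resistor(z0_ohms: float, z_out_ohms: float) -> float:
    """Rule-of-thumb series resistor (ohms) so that z_out + R is close to z0.

    Returns 0.0 when the driver impedance already meets or exceeds z0,
    since a series resistor cannot lower the source impedance.
    """
    if z0_ohms <= 0 or z_out_ohms < 0:
        raise ValueError("z0 must be positive and z_out must not be negative")
    return max(z0_ohms - z_out_ohms, 0.0)


if __name__ == "__main__":
    # Hypothetical values: a 60 ohm trace driven by a 27 ohm output buffer.
    print(series_damping_resistor(60.0, 27.0))
```

The computed value is only a starting point; the nearest standard resistor value is normally fitted and then adjusted while observing overshoot and ringing on the actual board.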


Ringing at signal transitions on address and data lines causes long unstable periods. Ringing on control signals
can cause false latching. To minimize the ringing effect, series damping resistors may have to be connected. For
additional hardware system design information, see the 386™ SL Microprocessor SuperSet System Design Guide (Intel
Order #240816).
10.0 MECHANICAL DETAILS OF LGA
AND PQFP PACKAGES


This section contains mechanical
types of packages used in the SL
design the parts in. For more detailed information on packages and package types, please refer to the
“Surface Mount Technology Guide” (Order #240585).
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Figure 10-1a. Principal Dimensions of the 386™ SL CPU in a 227-Lead LGA Package 
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Figure 10-b. Recommended LGA Socket Footprint 


[Land pattern drawing, top view of PCB land pattern. Legend: Vcc connections, Vss connections, no connects, clock pin.]
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240814-90 
• All power pins can be routed towards the middle.
• Outer two rows route outward.
• Clock pins should have shortest possible traces, then via to shielded inner layer.


Figure 10-1c. Recommended Signal Routing for LGA Package 
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240814-92 
Figure 10-2a. Principal Dimensions of the 82360SL I/O in the 196-Lead PQFP Package
Family: 196-Lead Plastic Quad Flat Package (PQFP) 0.025 Inch (0.635mm) Pitch


Millimeters


A = Package Height: Distance 
from seating plane to highest 
point of the body 


A1 = Standoff: Distance from 
Seating Plane to Base Plane 


D/E = Overall Package Dimension: 
Lead Tip to Lead Tip 


D1/E1 = Plastic Body Dimension 


D3/E3 = Lead Dimension 
D4/E4 = Foot Radius Location 


L1 = Foot Length 


NOTES: 

1. All PQFP case outlines are being presented as standards to the JEDEC.

2. Typical board footprint area for the 196-lead PQFP is 1.500 inches x 1.500 inches.

3. All dimensions and tolerances conform to ANSI Y14.5M-1982.

4. Datum Plane -H- located at the molding parting line and coincident with the bottom of the lead where the lead exits the 
plastic body. 

5. Datums A-B and -D- to be determined where the center lead exits the plastic body at datum plane -H-. 

6. Controlling dimension in inches. 

7. Dimensions D1, D2, E1, and E2 are measured at the molding parting line and do not include mold protrusions. 
8. Pin 1 identifier is located within one of the two zones indicated. 

9. Measured at datum plane -H-. 


10. Measured at seating plane datum -C-. 
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Figure 10-2c. Detailed Dimensions of the 82360SL I/O in the 196-Lead—Terminal Details 
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Figure 10-2d. 196-Lead PQFP Mechanical Package Detail—Typical Lead 
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Figure 10-2e. 196-Lead PQFP Mechanical Package Detail—Protective Bumper 
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Figure 10-2f. Recommended PQFP Footprint 


11.0 REVISION HISTORY 

The First Release of the Advanced Information Data
Sheet reflects information believed to be accurate
as of September 1990.


Please Consult your Local Intel Field Sales Office for 
the most current design-in information. 
386™ SX MICROPROCESSOR


■ Full 32-Bit Internal Architecture
— 8-, 16-, 32-Bit Data Types
— 8 General Purpose 32-Bit Registers

■ Runs Intel386™ Software in a Cost
Effective 16-Bit Hardware Environment
— Runs Same Applications and O.S.’s
as the 386™ DX Processor
— Object Code Compatible with 8086,
80186, 80286, and 386 Processors
— Runs MS-DOS*, OS/2* and UNIX**

■ Very High Performance 16-Bit Data Bus
— 16 MHz and 20 MHz Clock
— Two-Clock Bus Cycles
— 20 Megabytes/Sec Bus Bandwidth
— Address Pipelining Allows Use of
Slower/Cheaper Memories

■ Integrated Memory Management Unit
— Virtual Memory Support
— Optional On-Chip Paging
— 4 Levels of Hardware Enforced
Protection
— MMU Fully Compatible with Those of
the 80286 and 386 DX CPUs

■ Virtual 8086 Mode Allows Execution of
8086 Software in a Protected and
Paged System

■ Large Uniform Address Space
— 16 Megabyte Physical
— 64 Terabyte Virtual
— 4 Gigabyte Maximum Segment Size

■ High Speed Numerics Support with the
387™ SX Coprocessor

■ On-Chip Debugging Support Including
Breakpoint Registers

■ Complete System Development
Support
— Software: C, PL/M, Assembler
— Debuggers: PMON-386 DX,
ICE™-386 SX
— Extensive Third-Party Support: C,
Pascal, FORTRAN, BASIC, Ada*** on
VAX®†, UNIX**, MS-DOS*, and Other
Hosts

■ High Speed CHMOS III and CHMOS IV
Technology

■ Operating Frequency:
— Standard (386™ SX -20, -16)
Min/Max Frequency (4/20, 4/16) MHz
— Low Power (386™ SX -20, -16, -12)
Min/Max Frequency (2/20, 2/16,
2/12) MHz

■ 100-Pin Plastic Quad Flatpack Package
(See Packaging Outlines and Dimensions #231369)


The 386™ SX Microprocessor is a 32-bit CPU with a 16-bit external data bus and a 24-bit external address
bus. The 386 SX CPU brings the high-performance software of the Intel386™ Architecture to midrange
systems. It provides the performance benefits of a 32-bit programming architecture with the cost savings
associated with 16-bit hardware systems.
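
The 20 Megabytes/Sec bus bandwidth quoted in the feature list can be checked from the 16-bit data bus and the two-clock bus cycle. The sketch below is only a back-of-the-envelope calculation; it assumes the quoted 20 MHz figure is the processor clock and that every bus cycle completes with no wait states. The function name is illustrative and not part of any Intel tool.

```python
def peak_bus_bandwidth(clock_hz, bus_width_bits=16, clocks_per_cycle=2):
    """Peak bytes per second: one full-width transfer per bus cycle."""
    bytes_per_cycle = bus_width_bits // 8
    return clock_hz * bytes_per_cycle // clocks_per_cycle


# 20 MHz part: 2 bytes every 2 clocks
print(peak_bus_bandwidth(20_000_000))
# 16 MHz part
print(peak_bus_bandwidth(16_000_000))
```

Under these assumptions the 20 MHz part moves 20,000,000 bytes per second, which matches the feature list; wait states or non-pipelined cycles would lower the figure.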
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1.0 PIN DESCRIPTION 
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Figure 1.1. 386™ SX Microprocessor Pin out Top View 


Table 1.1. Alphabetical Pin Assignments 
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1.0 PIN DESCRIPTION (Continued) 


The following are the 386™ SX Microprocessor pin descriptions. The following definitions are used in the pin 
descriptions: 


# The named signal is active LOW.

I Input signal.

O Output signal.

I/O Input and Output signal.

- No electrical connection.


Name and Function


CLK2 (pin 15): CLK2 provides the fundamental timing for the 386™ SX
Microprocessor. For additional information see Clock.


RESET suspends any operation in progress and places the
386™ SX Microprocessor in a known reset state. See Interrupt
Signals for additional information.


Data Bus (pins 81-83, 86-90, 92-96, 99-100, 1) inputs data during
memory, I/O and interrupt acknowledge read cycles and outputs data
during memory and I/O write cycles. See Data Bus for additional
information.


Address Bus (pins 80-79, 76-72, 70, 66-64, 62-58, 56-51, 18) outputs
physical memory or port I/O addresses. See Address Bus for additional
information.


Write/Read is a bus cycle definition pin that distinguishes write
cycles from read cycles. See Bus Cycle Definition Signals for
additional information.


Data/Control is a bus cycle definition pin that distinguishes data
cycles, either memory or I/O, from control cycles which are:
interrupt acknowledge, halt, and code fetch. See Bus Cycle
Definition Signals for additional information.


Memory/IO is a bus cycle definition pin that distinguishes
memory cycles from input/output cycles. See Bus Cycle
Definition Signals for additional information.


Bus Lock is a bus cycle definition pin that indicates that other
system bus masters are not to gain control of the system bus
while it is active. See Bus Cycle Definition Signals for
additional information.


Address Status indicates that a valid bus cycle definition and
address (W/R#, D/C#, M/IO#, BHE#, BLE# and A23-A1) are
being driven at the 386™ SX Microprocessor pins. See Bus
Control Signals for additional information.


NA#: Next Address is used to request address pipelining. See Bus
Control Signals for additional information.


READY#: Bus Ready terminates the bus cycle. See Bus Control Signals
for additional information.


BHE#, BLE# (pins 19, 17): Byte Enables indicate which data bytes of
the data bus take part in a bus cycle. See Address Bus for additional
information.
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1. 0 PIN DESCRIPTION (Continued) 


HOLD: Bus Hold Request input allows another bus master to request
control of the local bus. See Bus Arbitration Signals for
additional information.


Bus Hold Acknowledge output indicates that the 386™ SX
Microprocessor has surrendered control of its local bus to
another bus master. See Bus Arbitration Signals for additional
information.


Interrupt Request is a maskable input that signals the 386™ SX
Microprocessor to suspend execution of the current program and
execute an interrupt acknowledge function. See Interrupt
Signals for additional information.


Non-Maskable Interrupt Request is a non-maskable input that
signals the 386™ SX Microprocessor to suspend execution of
the current program and execute an interrupt acknowledge
function. See Interrupt Signals for additional information.


Busy signals a busy condition from a processor extension. See
Coprocessor Interface Signals for additional information.


Error signals an error condition from a processor extension. See
Coprocessor Interface Signals for additional information.


Processor Extension Request indicates that the processor has
data to be transferred by the 386™ SX Microprocessor. See
Coprocessor Interface Signals for additional information.


Float is an input which forces all bidirectional and output signals,
including HLDA, to the tri-state condition. This allows the
electrically isolated 386SX PQFP to use ONCE (On-Circuit
Emulation) method without removing it from the PCB. See Float
for additional information.


N/C (pins 20, 27, 29-31, 43-47): No Connects should always be left
unconnected. Connection of a N/C pin may cause the processor to
malfunction or be incompatible with future steppings of the 386™ SX
Microprocessor.


System Power (pins 8-10, 21, 32, 39, 42, 48, 57, 69, 71, 84, 91, 97)
provides the +5V nominal DC supply input.


System Ground (pins 2, 5, 11-14, 22, 35, 41, 49-50, 63, 67-68, 77-78,
85, 98) provides the 0V connection from which all inputs and outputs
are measured.
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Figure 2.1. 386™ SX Microprocessor Registers 
INTRODUCTION

The 386 SX Microprocessor is 100% object code
compatible with the 386 DX, 286 and 8086
microprocessors. System manufacturers can provide 386
DX CPU based systems optimized for performance
and 386 SX CPU based systems optimized for cost,
both sharing the same operating systems and
application software. Systems based on the 386 SX CPU
can access the world’s largest existing
microcomputer software base, including the growing 32-bit
software base. Only the Intel386 architecture can
run UNIX, OS/2 and MS-DOS.


Instruction pipelining, high bus bandwidth, and a
very high performance ALU ensure short average
instruction execution times and high system
throughput. The 386 SX CPU is capable of execution
at sustained rates of 2.5-3.0 million instructions per
second.


The integrated memory management unit (MMU)
includes an address translation cache, advanced
multi-tasking hardware, and a four-level
hardware-enforced protection mechanism to support operating
systems. The virtual machine capability of the
386 SX CPU allows simultaneous execution of
applications from multiple operating systems such as
MS-DOS and UNIX.


The 386 SX CPU offers on-chip testability and
debugging features. Four breakpoint registers allow
conditional or unconditional breakpoint traps on
code execution or data accesses for powerful
debugging of even ROM-based systems. Other
testability features include self-test, tri-state of output
buffers, and direct access to the page translation
cache.


The new Low Power 386 SX CPU brings the benefits
of Intel’s 386 Microprocessor 32-bit architecture to
the mainstream Laptop and Notebook personal
computer applications. With its power saving 2 MHz
sleep-mode and extended functional temperature
range of 0°C to 100°C TCASE, the Low Power 386
SX CPU specifically satisfies the power consumption
and heat dissipation requirements of today’s small
form factor computers.


2.0 BASE ARCHITECTURE 


The 386 SX Microprocessor consists of a central
processing unit, a memory management unit and a
bus interface.


The central processing unit consists of the
execution unit and the instruction unit. The execution unit
contains the eight 32-bit general purpose registers
which are used for both address calculation and
data operations and a 64-bit barrel shifter used to
speed shift, rotate, multiply, and divide operations.
The instruction unit decodes the instruction opcodes
and stores them in the decoded instruction queue
for immediate use by the execution unit.


The memory management unit (MMU) consists of a
segmentation unit and a paging unit. Segmentation
allows the managing of the logical address space by
providing an extra addressing component, one that
allows easy code and data relocatability, and
efficient sharing. The paging mechanism operates
beneath and is transparent to the segmentation
process, to allow management of the physical address
space.


The segmentation unit provides four levels of
protection for isolating and protecting applications and
the operating system from each other. The hardware
enforced protection allows the design of systems
with a high degree of integrity.


The 386 SX Microprocessor has two modes of oper- 
ation: Real Address Mode (Real Mode), and Protect- 
ed Virtual Address Mode (Protected Mode). In Real 
Mode the 386 SX Microprocessor operates as a very 
fast 8086, but with 32-bit extensions if desired. Real 
Mode is required primarily to set up the processor 
for Protected Mode operation. 


Within Protected Mode, software can perform a task 
switch to enter into tasks designated as Virtual 8086 
Mode tasks. Each such task behaves with 8086
semantics, thus allowing 8086 software (an application
program or an entire operating system) to execute.


The Virtual 8086 tasks can be isolated and protect- 
ed from one another and the host 386 SX Micro- 
processor operating system by use of paging. 


Finally, to facilitate high performance system hard- 
ware designs, the 386 SX Microprocessor bus inter- 
face offers address pipelining and direct Byte En- 
able signals for each byte of the data bus. 


2.1 Register Set 


The 386 SX Microprocessor has thirty-four registers 
as shown in Figure 2-1. These registers are grouped 
into the following seven categories: 


General Purpose Registers: The eight 32-bit
general purpose registers are used to contain arithmetic
and logical operands. Four of these (EAX, EBX,
ECX, and EDX) can be used either in their entirety as
32-bit registers, as 16-bit registers, or split into pairs
of separate 8-bit registers.
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Segment Registers: Six 16-bit special purpose reg- 
isters select, at any given time, the segments of 
memory that are immediately addressable for code, 
stack, and data. 


Flags and Instruction Pointer Registers: The two 
32-bit special purpose registers in figure 2.1 record 
or control certain aspects of the 386 SX Microproc- 
essor state. The EFLAGS register includes status 
and control bits that are used to reflect the outcome 
of many instructions and modify the semantics of 
some instructions. The Instruction Pointer, called 
EIP, is 32 bits wide. The Instruction Pointer controls 
instruction fetching and the processor automatically 
increments it after executing an instruction. 


Control Registers: The four 32-bit control registers
are used to control the global nature of the 386 SX
Microprocessor. The CR0 register contains bits that
set the different processor modes (Protected, Real,
Paging and Coprocessor Emulation). CR2 and CR3
registers are used in the paging operation.


System Address Registers: These four special
registers reference the tables or segments
supported by the 80286/386 SX/386 DX CPU’s protection
model. These tables or segments are:

GDTR (Global Descriptor Table Register),
IDTR (Interrupt Descriptor Table Register),
LDTR (Local Descriptor Table Register),
TR (Task State Segment Register).


Debug Registers: The six programmer accessible 
debug registers provide on-chip support for debug- 
ging. The use of the debug registers is described in 
Section 2.10 Debugging Support. 


Test Registers: Two registers are used to control
the testing of the RAM/CAM (Content Addressable
Memories) in the Translation Lookaside Buffer
portion of the 386 SX Microprocessor. Their use is
discussed in Testability.


EFLAGS REGISTER 


The flag register is a 32-bit register named EFLAGS.
The defined bits and bit fields within EFLAGS,
shown in Figure 2.2, control certain operations and
indicate the status of the 386 SX Microprocessor.
The lower 16 bits (bits 0-15) of EFLAGS contain the
16-bit flag register named FLAGS. This is the default
flag register used when executing 8086, 80286, or
real mode code. The functions of the flag bits are
given in Table 2.1.


[Figure 2.2 labels: status flags (overflow, sign, zero, aux carry, parity, carry); CR0 fields (emulate coprocessor, task switched).]
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Figure 2.2. Status and Control Register Bit Functions 
Table 2.1. Flag Definitions

Bit
Position  Name  Function

0         CF    Carry Flag—Set on high-order bit carry or borrow; cleared
                otherwise.

2         PF    Parity Flag—Set if low-order 8 bits of result contain an even
                number of 1-bits; cleared otherwise.

4         AF    Auxiliary Carry Flag—Set on carry from or borrow to the low
                order four bits of AL; cleared otherwise.

6         ZF    Zero Flag—Set if result is zero; cleared otherwise.

7         SF    Sign Flag—Set equal to high-order bit of result (0 if positive, 1 if
                negative).

8         TF    Single Step Flag—Once set, a single step interrupt occurs after
                the next instruction executes. TF is cleared by the single step
                interrupt.

9         IF    Interrupt-Enable Flag—When set, maskable interrupts will cause
                the CPU to transfer control to an interrupt vector specified
                location.

10        DF    Direction Flag—Causes string instructions to auto-increment
                (default) the appropriate index registers when cleared. Setting
                DF causes auto-decrement.

11        OF    Overflow Flag—Set if the operation resulted in a carry/borrow
                into the sign bit (high-order bit) of the result but did not result in a
                carry/borrow out of the high-order bit or vice-versa.

12, 13    IOPL  I/O Privilege Level—Indicates the maximum Current Privilege
                Level (CPL) permitted to execute I/O instructions without
                generating an exception 13 fault or consulting the I/O permission
                bit map while executing in protected mode. For virtual 86 mode it
                indicates the maximum CPL allowing alteration of the IF bit. See
                Section 4.2 for a further discussion and definitions on various
                privilege levels.

14        NT    Nested Task—Set if the execution of the current task is nested
                within another task. Cleared otherwise.

16        RF    Resume Flag—Used in conjunction with debug register
                breakpoints. It is checked at instruction boundaries before
                breakpoint processing. If set, any debug fault is ignored on the
                next instruction.

17        VM    Virtual 8086 Mode—If set while in protected mode, the 386™ SX
                Microprocessor will switch to virtual 8086 operation, handling
                segment loads as the 8086 does, but generating exception 13
                faults on privileged opcodes.
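
As an illustration of the flag definitions above, the following sketch decodes a 32-bit EFLAGS value into named flags. The bit positions used are the standard ones for the 386 family (they are stated here from general knowledge of the architecture, since the positions column is damaged in this copy); the function and dictionary names are hypothetical.

```python
# Bit positions of the single-bit EFLAGS fields on the 386 family.
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8,
    "IF": 9, "DF": 10, "OF": 11, "NT": 14, "RF": 16, "VM": 17,
}
IOPL_SHIFT = 12   # IOPL is a two-bit field in bits 12-13
IOPL_MASK = 0x3


def decode_eflags(value):
    """Return a dict of flag name -> bool, plus the numeric IOPL."""
    flags = {name: bool((value >> bit) & 1) for name, bit in EFLAGS_BITS.items()}
    flags["IOPL"] = (value >> IOPL_SHIFT) & IOPL_MASK
    return flags


def flags_register(value):
    """The 16-bit FLAGS register is the low word of EFLAGS."""
    return value & 0xFFFF


decoded = decode_eflags(0x00003246)
print([name for name, is_set in decoded.items() if is_set is True])
print(decoded["IOPL"])
```

For the sample value 0x3246, the parity, zero and interrupt-enable flags are set and IOPL is 3.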
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CONTROL REGISTERS 


The 386 SX Microprocessor has three control registers of 32 bits, CR0, CR2 and CR3, to hold the machine
state of a global nature. These registers are shown in Figures 2.1 and 2.2. The defined CR0 bits are described
in Table 2.2.


Table 2.2. CR0 Definitions

Bit
Position  Name  Function

0         PE    Protection mode enable—places the 386™ SX Microprocessor
                into protected mode. If PE is reset, the processor operates again
                in Real Mode. PE may be set by loading MSW or CR0. PE can be
                reset only by loading CR0; it cannot be reset by the LMSW
                instruction.

1         MP    Monitor coprocessor extension—allows WAIT instructions to
                cause a processor extension not present exception (number 7).

2         EM    Emulate processor extension—causes a processor extension
                not present exception (number 7) on ESC instructions to allow
                emulating a processor extension.

3         TS    Task switched—indicates the next instruction using a processor
                extension will cause exception 7, allowing software to test
                whether the current processor extension context belongs to the
                current task.

31        PG    Paging enable bit—is set to enable the on-chip paging unit. It is
                reset to disable the on-chip paging unit.
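
The rule that PE can be set through the machine status word but cleared only by loading CR0 can be modeled in a few lines. This is a minimal sketch, not a description of the hardware implementation: it assumes the standard 386 bit positions (PE = 0, MP = 1, EM = 2, TS = 3, PG = 31) and assumes that LMSW affects only the low four bits of CR0. All names are hypothetical.

```python
CR0_PE = 1 << 0    # protection enable
CR0_MP = 1 << 1    # monitor coprocessor
CR0_EM = 1 << 2    # emulate coprocessor
CR0_TS = 1 << 3    # task switched
CR0_PG = 1 << 31   # paging enable
MSW_MASK = 0xF     # assumption: LMSW touches only PE, MP, EM and TS


def describe_cr0(cr0):
    """Summarize the processor mode selected by a CR0 value."""
    return {
        "mode": "protected" if cr0 & CR0_PE else "real",
        "paging": bool(cr0 & CR0_PG),
        "emulate_coprocessor": bool(cr0 & CR0_EM),
        "task_switched": bool(cr0 & CR0_TS),
    }


def lmsw(cr0, msw):
    """Model LMSW: load the low CR0 bits, but never clear PE."""
    low = msw & MSW_MASK
    if cr0 & CR0_PE:
        low |= CR0_PE          # PE is sticky under LMSW
    return (cr0 & ~MSW_MASK) | low


print(describe_cr0(lmsw(0x00000000, 0x0001)))   # enters protected mode
print(describe_cr0(lmsw(0x00000001, 0x0000)))   # stays in protected mode
```

In this model, returning to Real Mode requires writing CR0 directly with PE cleared, as the table states.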


2.2 Instruction Set

The instruction set is divided into nine categories of
operations:

Data Transfer
Arithmetic
Shift/Rotate
String Manipulation
Bit Manipulation
Control Transfer
High Level Language Support
Operating System Support
Processor Control

These instructions are listed in Table 9.1
Instruction Set Clock Count Summary.

All 386 SX Microprocessor instructions operate on
either 0, 1, 2 or 3 operands; an operand resides in a
register, in the instruction itself, or in memory. Most
zero operand instructions (e.g. CLI, STI) take only
one byte. One operand instructions generally are
two bytes long. The average instruction is 3.2 bytes
long. Since the 386 SX Microprocessor has a 16
byte prefetch instruction queue, an average of 5
instructions will be prefetched. The use of two
operands permits the following types of common
instructions:

Register to Register
Memory to Register
Immediate to Register
Memory to Memory
Register to Memory
Immediate to Memory.

The operands can be either 8, 16, or 32 bits long. As
a general rule, when executing code written for the
386 SX Microprocessor (32 bit code), operands are
8 or 32 bits; when executing existing 8086 or 80286
code (16-bit code), operands are 8 or 16 bits.
Prefixes can be added to all instructions which override
the default length of the operands (i.e. use 32-bit
operands for 16-bit code, or 16-bit operands for
32-bit code).


2.3 Memory Organization 


Memory on the 386 SX Microprocessor is divided 
into 8-bit quantities (bytes), 16-bit quantities (words), 
and 32-bit quantities (dwords). Words are stored in 
two consecutive bytes in memory with the low-order 
byte at the lowest address. Dwords are stored in 
four consecutive bytes in memory with the low-order 
byte at the lowest address. The address of a word or 
dword is the byte address of the low-order byte. 
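
The byte ordering described above (low-order byte at the lowest address) can be demonstrated with a small memory model. The sketch below is illustrative only; the helper names are hypothetical, and a Python bytearray stands in for memory.

```python
def store_word(memory, address, value):
    """Store a 16-bit word with the low-order byte at the lowest address."""
    memory[address:address + 2] = (value & 0xFFFF).to_bytes(2, "little")


def store_dword(memory, address, value):
    """Store a 32-bit dword with the low-order byte at the lowest address."""
    memory[address:address + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")


def load_word(memory, address):
    """Read the word whose low-order byte is at `address`."""
    return int.from_bytes(memory[address:address + 2], "little")


memory = bytearray(8)
store_dword(memory, 2, 0x12345678)
print(memory.hex())                  # 0000785634120000
print(hex(load_word(memory, 4)))     # 0x1234, the high-order word of the dword
```

The address of the dword is 2, the address of its low-order byte (0x78), and its high-order word can be read as a word at address 4.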


In addition to these basic data types, the 386 SX 
Microprocessor supports two larger units of memory: 
pages and segments. Memory can be divided up 
into one or more variable length segments, which 
can be swapped to disk or shared between pro- 
grams. Memory can also be organized into one or 
more 4K byte pages. Finally, both segmentation and 
paging can be combined, gaining the advantages of 
both systems. The 386 SX Microprocessor supports 
both pages and segmentation in order to provide 
maximum flexibility to the system designer. Segmen- 
tation and paging are complementary. Segmentation 
is useful for organizing memory in logical modules, 
and as such is a tool for the application programmer, 
while pages are useful to the system programmer for 
managing the physical memory of a system. 
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ADDRESS SPACES 


The 386 SX Microprocessor has three types of
address spaces: logical, linear, and physical. A
logical address (also known as a virtual address)
consists of a selector and an offset. A selector is the
contents of a segment register. An offset is formed
by summing all of the addressing components
(BASE, INDEX, DISPLACEMENT), discussed in
section 2.4 Addressing Modes, into an effective
address. This effective address along with the selector
is known as the logical address. Since each task on
the 386 SX Microprocessor has a maximum of 16K
(2^14 - 1) selectors, and offsets can be 4 gigabytes
(with paging enabled), this gives a total of 2^46 bytes, or
64 terabytes, of logical address space per task. The
programmer sees the logical address space.
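
The 64 terabyte figure follows directly from the selector count and the maximum offset range. The short calculation below is a sketch of that arithmetic; it rounds the selector count to the full 16K (2^14) and uses binary terabytes (2^40 bytes). The constant names are illustrative.

```python
SELECTORS_PER_TASK = 2 ** 14     # 16K selectors
MAX_SEGMENT_BYTES = 2 ** 32      # 4 gigabyte maximum offset range
TERABYTE = 2 ** 40


def logical_address_space_bytes():
    """Size of the per-task logical (virtual) address space in bytes."""
    return SELECTORS_PER_TASK * MAX_SEGMENT_BYTES


total = logical_address_space_bytes()
print(total == 2 ** 46)          # True
print(total // TERABYTE)         # 64
```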


The segmentation unit translates the logical
address space into a 32-bit linear address space. If the
paging unit is not enabled then the 32-bit linear
address is truncated into a 24-bit physical address.
The physical address is what appears on the
address pins.


The primary differences between Real Mode and 
Protected Mode are how the segmentation unit per- 
forms the translation of the logical address into the 
linear address, size of the address space, and pag- 
ing capability. In Real Mode, the segmentation unit 
shifts the selector left four bits and adds the result to 
the effective address to form the linear address. 
This linear address is limited to 1 megabyte. In addi- 
tion, real mode has no paging capability. 
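
The Real Mode translation described above amounts to a shift and an add. The following sketch assumes 16-bit selector and offset values; the function name is hypothetical.

```python
def real_mode_linear_address(selector, offset):
    """Real Mode: shift the selector left four bits and add the offset."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("selector must fit in 16 bits")
    if not 0 <= offset <= 0xFFFF:
        raise ValueError("offset must fit in 16 bits in Real Mode")
    return (selector << 4) + offset


print(hex(real_mode_linear_address(0x1234, 0x0010)))   # 0x12350
print(hex(real_mode_linear_address(0xF000, 0xFFF0)))   # 0xffff0
```

Because the selector contributes at most 0xFFFF0 and the offset at most 0xFFFF, the result stays close to the 1 megabyte limit mentioned in the text.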


Protected Mode will see one of two different
address spaces, depending on whether or not paging
is enabled. Every selector has a logical base
address associated with it that can be up to 32 bits in
length. This 32-bit logical base address is added to
the effective address to form a final 32-bit linear
address. If paging is disabled this final linear
address reflects physical memory and is truncated so
that only the lower 24 bits of this address are used
to address the 16 megabyte memory address space.
If paging is enabled this final linear address reflects
a 32-bit address that is translated through the
paging unit to form a 24-bit (16-megabyte) physical address.
The logical base address is stored in one of two
operating system tables (i.e. the Local Descriptor
Table or Global Descriptor Table).


Figure 2.3. Address Translation
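
For the paging-disabled case, the Protected Mode translation can be sketched as a 32-bit add followed by truncation to 24 bits. This is a simplified model under stated assumptions: it ignores limit and privilege checks, takes the segment base directly as a parameter rather than reading a descriptor table, and does not model the paging unit. The function names are hypothetical.

```python
LINEAR_MASK = 0xFFFFFFFF      # 32-bit linear address
PHYSICAL_MASK = 0x00FFFFFF    # 24 address bits reach the pins


def protected_mode_linear_address(segment_base, effective_address):
    """Add the 32-bit segment base to the effective address."""
    return (segment_base + effective_address) & LINEAR_MASK


def physical_address_without_paging(linear_address):
    """With paging disabled, only the lower 24 bits are used."""
    return linear_address & PHYSICAL_MASK


linear = protected_mode_linear_address(0x12000000, 0x00345678)
print(hex(linear))                                    # 0x12345678
print(hex(physical_address_without_paging(linear)))   # 0x345678
```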


Figure 2.3 shows the relationship between the vari- 
ous address spaces. 


SEGMENT REGISTER USAGE 


The main data structure used to organize memory is 
the segment. On the 386 SX Microprocessor, seg- 
ments are variable sized blocks of linear addresses 
which have certain attributes associated with them. 
There are two main types of segments, code and 
data. The segments are of variable size and can be 
as small as 1 byte or as large as 4 gigabytes (2^32
bytes).


In order to provide compact instruction encoding
and increase processor performance, instructions
do not need to explicitly specify which segment
register is used. The segment register is automatically
chosen according to the rules of Table 2.3 (Segment
Register Selection Rules). In general, data
references use the selector contained in the DS register,
stack references use the SS register and instruction
fetches use the CS register. The contents of the
Instruction Pointer provide the offset. Special segment
override prefixes allow the explicit use of a given
segment register, and override the implicit rules
listed in Table 2.3. The override prefixes also allow the
use of the ES, FS and GS segment registers.


There are no restrictions regarding the overlapping
of the base addresses of any segments. Thus, all 6
segments could have the base address set to zero
and create a system with a four gigabyte linear
address space. This creates a system where the virtual
address space is the same as the linear address
space. Further details of segmentation are
discussed in chapter 4 PROTECTED MODE
ARCHITECTURE.


2.4 Addressing Modes 


The 386 SX Microprocessor provides a total of 8 
addressing modes for instructions to specify oper- 
ands. The addressing modes are optimized to allow 
the efficient execution of high level languages such 
as C and FORTRAN, and they cover the vast
majority of data references needed by high-level
languages.


Table 2.3. Segment Register Selection Rules

Type of                              Implied (Default)   Segment Override
Memory Reference                     Segment Use         Prefixes Possible

Code Fetch                           CS                  None
Destination of PUSH, PUSHF, INT,     SS                  None
CALL, PUSHA Instructions
Source of POP, POPA, POPF, IRET,     SS                  None
RET Instructions
Destination of STOS,                 ES                  None
MOVS, REP STOS, and
REP MOVS instructions
Other data references,
with effective address
using base register of:
[EAX]                                DS                  CS,SS,ES,FS,GS
[EBX]                                DS                  CS,SS,ES,FS,GS
[ECX]                                DS                  CS,SS,ES,FS,GS
[EDX]                                DS                  CS,SS,ES,FS,GS
[ESI]                                DS                  CS,SS,ES,FS,GS
[EDI]                                DS                  CS,SS,ES,FS,GS
[EBP]                                SS                  CS,DS,ES,FS,GS
[ESP]                                SS                  CS,DS,ES,FS,GS


REGISTER AND IMMEDIATE MODES

Two of the addressing modes provide for
instructions that operate on register or immediate
operands:


Register Operand Mode: The operand is located in
one of the 8, 16 or 32-bit general registers.


Immediate Operand Mode: The operand is
included in the instruction as part of the opcode.


32-BIT MEMORY ADDRESSING MODES

The remaining 6 modes provide a mechanism for
specifying the effective address of an operand. The
linear address consists of two components: the
segment base address and an effective address. The
effective address is calculated by summing any
combination of the following three address elements
(see Figure 2.3):


DISPLACEMENT: an 8, 16 or 32-bit immediate
value, following the instruction.


BASE: The contents of any general purpose
register. The base registers are generally used by
compilers to point to the start of the local variable area.


INDEX: The contents of any general purpose
register except for ESP. The index registers are used to
access the elements of an array, or a string of
characters. The index register’s value can be multiplied
by a scale factor, either 1, 2, 4 or 8. The scaled index
is especially useful for accessing arrays or
structures.


Combinations of these 3 components make up the 6
additional addressing modes. There is no
performance penalty for using any of these addressing
combinations, since the effective address calculation is
pipelined with the execution of other instructions.
The one exception is the simultaneous use of Base
and Index components which requires one
additional clock.


As shown in Figure 2.4, the effective address (EA) of
an operand is calculated according to the following
formula:


EA = BaseRegister + (IndexRegister * Scaling) + Displacement


1. Direct Mode: The operand’s offset is contained
as part of the instruction as an 8, 16 or 32-bit
displacement.


2. Register Indirect Mode: A BASE register
contains the address of the operand.


3. Based Mode: A BASE register’s contents are
added to a DISPLACEMENT to form the
operand’s offset.


4. Scaled Index Mode: An INDEX register’s
contents are multiplied by a SCALING factor, and the
result is added to a DISPLACEMENT to form the
operand’s offset.
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Figure 2.4. Addressing Mode Calculations 


5. Based Scaled Index Mode: The contents of an 
INDEX register are multiplied by a SCALING fac- 
tor, and the result is added to the contents of a 
BASE register to obtain the operand’s offset. 


6. Based Scaled Index Mode with Displacement: 
The contents of an INDEX register are multiplied 
by a SCALING factor, and the result is added to 
the contents of a BASE register and a DISPLACE- 
MENT to form the operand’s offset. 
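
All six memory addressing modes listed above are special cases of one calculation: a base, a scaled index and a displacement, any of which may be absent. The sketch below expresses that for 32-bit addressing; the function and parameter names are illustrative, and register contents are passed in as plain integers.

```python
VALID_SCALES = (1, 2, 4, 8)


def effective_address(base=0, index=0, scale=1, displacement=0):
    """EA = base + (index * scale) + displacement, wrapped to 32 bits."""
    if scale not in VALID_SCALES:
        raise ValueError("scale factor must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Direct Mode: displacement only
print(hex(effective_address(displacement=0x2000)))
# Based Mode: base + displacement
print(hex(effective_address(base=0x1000, displacement=8)))
# Based Scaled Index Mode with Displacement: element 3 of a dword array
print(hex(effective_address(base=0x1000, index=3, scale=4, displacement=8)))
```

The last call yields 0x1014: the array starts 8 bytes past the base, and element 3 of a dword array lies 12 bytes further on.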


DIFFERENCES BETWEEN 16 AND 32 BIT
ADDRESSES


In order to provide software compatibility with the
8086 and the 80286, the 386 SX Microprocessor
can execute 16-bit instructions in Real and
Protected Modes. The processor determines the size of the
instructions it is executing by examining the D bit in a
Segment Descriptor. If the D bit is 0 then all operand
lengths and effective addresses are assumed to be
16 bits long. If the D bit is 1 then the default length
for operands and addresses is 32 bits. In Real Mode
the default size for operands and addresses is 16
bits.


Regardless of the default precision of the operands 
or addresses, the 386 SX Microprocessor is able to 
execute either 16 or 32-bit instructions. This is speci- 
fied through the use of override prefixes. Two prefix- 
es, the Operand Length Prefix and the Address 
Length Prefix, override the value of the D bit on an 
individual instruction basis. These prefixes are auto- 
matically added by assemblers. 


The Operand Length and Address Length Prefixes 
can be applied separately or in combination to any 
instruction. The Address Length Prefix does not al- 
low addresses over 64K bytes to be accessed in 
Real Mode. A memory address which exceeds 
0FFFFH will result in a General Protection Fault. An
Address Length Prefix only allows the use of the ad- 
ditional 386 SX Microprocessor addressing modes. 
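
The interaction between the D bit and the two override prefixes can be summarized as a toggle: each prefix flips the corresponding default for one instruction. The sketch below assumes the usual 386 prefix encodings, 66H for the Operand Length Prefix and 67H for the Address Length Prefix, which are not given in this section; the function name is hypothetical.

```python
OPERAND_LENGTH_PREFIX = 0x66   # assumed encoding of the Operand Length Prefix
ADDRESS_LENGTH_PREFIX = 0x67   # assumed encoding of the Address Length Prefix


def instruction_sizes(d_bit, prefixes=b""):
    """Return (operand_bits, address_bits) for one instruction.

    d_bit is the D bit of the code segment descriptor (0 in Real Mode);
    prefixes holds the prefix bytes that precede the opcode.
    """
    default_bits = 32 if d_bit else 16
    other_bits = 16 if d_bit else 32
    operand_bits = other_bits if OPERAND_LENGTH_PREFIX in prefixes else default_bits
    address_bits = other_bits if ADDRESS_LENGTH_PREFIX in prefixes else default_bits
    return operand_bits, address_bits


print(instruction_sizes(0))                  # (16, 16)
print(instruction_sizes(0, b"\x66"))         # (32, 16)
print(instruction_sizes(1, b"\x66\x67"))     # (16, 16)
```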


When executing 32-bit code, the 386 SX Microproc- 
essor uses either 8 or 32-bit displacements, and any 
register can be used as base or index registers. 
When executing 16-bit code, the displacements are 
either 8 or 16-bits, and the base and index register 
conform to the 80286 model. Table 2.4 illustrates 
the differences. 
Table 2.4. BASE and INDEX Registers for 16- and 32-Bit Addresses

                  16-Bit Addressing   32-Bit Addressing

BASE REGISTER     BX, BP              Any 32-bit GP Register
INDEX REGISTER    SI, DI              Any 32-bit GP Register
                                      Except ESP
SCALE FACTOR      None                1, 2, 4, 8
DISPLACEMENT      0, 8, 16-bits       0, 8, 32-bits


2.5 Data Types


The 386 SX Microprocessor supports all of the data
types commonly used in high level languages:


Bit: A single bit quantity.


Bit Field: A group of up to 32 contiguous bits, which
spans a maximum of four bytes.


Bit String: A set of contiguous bits; on the 386 SX
Microprocessor, bit strings can be up to 4 gigabits
long.

Byte: A signed 8-bit quantity.

Unsigned Byte: An unsigned 8-bit quantity.

Integer (Word): A signed 16-bit quantity.

Long Integer (Double Word): A signed 32-bit
quantity. All operations assume a 2’s complement
representation.


Unsigned Integer (Word): An unsigned 16-bit
quantity.


Unsigned Long Integer (Double Word): An
unsigned 32-bit quantity.


Signed Quad Word: A signed 64-bit quantity.

Unsigned Quad Word: An unsigned 64-bit quantity.


Pointer: A 16 or 32-bit offset-only quantity which
indirectly references another memory location.


Long Pointer: A full pointer which consists of a
16-bit segment selector and either a 16 or 32-bit offset.


Char: A byte representation of an ASCII
Alphanumeric or control character.


String: A contiguous sequence of bytes, words or
dwords. A string may contain between 1 byte and 4
gigabytes.


BCD: A byte (unpacked) representation of decimal 
digits 0-9. 


Packed BCD: A byte (packed) representation of two 
decimal digits 0-9 storing one digit in each nibble. 


When the 386 SX Microprocessor is coupled with its
numerics coprocessor, the 387™ SX, then the fol-
lowing common floating point types are supported:


Floating Point: A signed 32, 64, or 80-bit real num-
ber representation. Floating point numbers are sup-
ported by the 387™ SX numerics coprocessor.


Figure 2.5 illustrates the data types supported by the 
386 SX Microprocessor and the 387 SX. 
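
As an illustration of the BCD and Packed BCD types listed above, the following Python sketch packs two decimal digits into one byte and unpacks them again. The function names are hypothetical, and placing the more significant digit in the upper nibble is an assumption made for the example; the text states only that one digit is stored in each nibble.

```python
def pack_bcd(high_digit, low_digit):
    """Pack two decimal digits (0-9) into one byte, one digit per nibble.

    Assumption: the more significant digit goes in the upper nibble.
    """
    if not (0 <= high_digit <= 9 and 0 <= low_digit <= 9):
        raise ValueError("BCD digits must be in the range 0-9")
    return (high_digit << 4) | low_digit


def unpack_bcd(byte):
    """Split a packed BCD byte into its (upper nibble, lower nibble) digits."""
    return (byte >> 4) & 0x0F, byte & 0x0F
```

An unpacked BCD value, by contrast, is simply one digit 0-9 held in a byte of its own.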


2.6 I/O Space 


The 386 SX Microprocessor has two distinct physi- 
cal address spaces: physical memory and I/O. Gen- 
erally, peripherals are placed in I/O space although 
the 386 SX Microprocessor also supports memory-
mapped peripherals. The I/O space consists of 64K
bytes which can be divided into 64K 8-bit ports or
32K 16-bit ports, or any combination of ports which
add up to no more than 64K bytes. The 64K I/O
address space refers to physical addresses rather
than linear addresses since I/O instructions do not
go through the segmentation or paging hardware.
The M/IO# pin acts as an additional address line,
thus allowing the system designer to easily deter-
mine which address space the processor is access-
ing.


The I/O ports are accessed by the IN and OUT in- 
structions, with the port address supplied as an im- 
mediate 8-bit constant in the instruction or in the DX 
register. All 8-bit and 16-bit port addresses are zero
extended on the upper address lines. The I/O in-
structions cause the M/IO# pin to be driven LOW.
I/O port addresses 00F8H through 00FFH are re-
served for use by Intel.

Figure 2.5. 386™ SX Microprocessor Supported Data Types


Table 2.5. Interrupt Vector Assignments

Columns: Function, Interrupt Number, Instruction Which Can Cause Exception, Return Address Points to Faulting Instruction.

Function                       Interrupt   Instruction Which
                               Number      Can Cause Exception

Array Bounds Check             5
Invalid OP-Code                6
Device Not Available           7
Double Fault                   8           Any instruction that can
                                           generate an exception
Coprocessor Segment Overrun    9           ESC

*Some debug exceptions may report both traps on the previous instruction and faults on the next instruction.


2.7 Interrupts and Exceptions 


Interrupts and exceptions alter the normal program 
flow in order to handle external events, report errors 
or exceptional conditions. The difference between 
interrupts and exceptions is that interrupts are used 
to handle asynchronous external events while ex-
ceptions handle instruction faults. Although a pro-
gram can generate a software interrupt via an INT n
instruction, the processor treats software interrupts 
as exceptions. 


Hardware interrupts occur as the result of an exter- 
nal event and are classified into two types: maskable 
or non-maskable. Interrupts are serviced after the 
execution of the current instruction. After the inter- 
rupt handler is finished servicing the interrupt, exe- 
cution proceeds with the instruction immediately 
after the interrupted instruction.


Exceptions are classified as faults, traps, or aborts, 
depending on the way they are reported and wheth- 
er or not restart of the instruction causing the excep- 
tion is supported. Faults are exceptions that are de- 
tected and serviced before the execution of the 
faulting instruction. Traps are exceptions that are 
reported immediately after the execution of the in- 
struction which caused the problem. Aborts are ex- 
ceptions which do not permit the precise location of 
the instruction causing the exception to be deter- 
mined. 


Thus, when an interrupt service routine has been 
completed, execution proceeds from the instruction 
immediately following the interrupted instruction. On 
the other hand, the return address from an excep- 
tion fault routine will always point to the instruction 
causing the exception and will include any leading 
instruction prefixes. Table 2.5 summarizes the possi- 
ble interrupts for the 386 SX Microprocessor and 
shows where the return address points to.


The 386 SX Microprocessor has the ability to handle
up to 256 different interrupts/exceptions. In order to 
service the interrupts, a table with up to 256 interrupt 
vectors must be defined. The interrupt vectors are 
simply pointers to the appropriate interrupt service 
routine. In Real Mode, the vectors are 4-byte quanti- 
ties, a Code Segment plus a 16-bit offset; in Protect- 
ed Mode, the interrupt vectors are 8 byte quantities, 
which are put in an Interrupt Descriptor Table. Of the 
256 possible interrupts, 32 are reserved for use by 
Intel and the remaining 224 are free to be used by 
the system designer. 
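
The vector sizes given above determine where the entry for a given interrupt number is found. The following Python sketch shows the arithmetic; the function names are hypothetical, and the Real Mode table is taken to start at address 00000H, as described in the Real Mode section on reserved locations.

```python
REAL_MODE_VECTOR_SIZE = 4   # Code Segment plus 16-bit offset
PROTECTED_MODE_GATE_SIZE = 8  # one Interrupt Descriptor Table entry


def real_mode_vector_address(n):
    """Address of the 4-byte vector for interrupt n (table based at 00000H)."""
    if not 0 <= n <= 255:
        raise ValueError("interrupt number must be 0-255")
    return n * REAL_MODE_VECTOR_SIZE


def idt_entry_address(idt_base, n):
    """Linear address of the 8-byte IDT entry for interrupt n."""
    if not 0 <= n <= 255:
        raise ValueError("interrupt number must be 0-255")
    return idt_base + n * PROTECTED_MODE_GATE_SIZE
```

With 256 vectors of 4 bytes each, the Real Mode table occupies 1K bytes; a full Interrupt Descriptor Table occupies 2K bytes.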


INTERRUPT PROCESSING 


When an interrupt occurs, the following actions hap- 
pen. First, the current program address and Flags 
are saved on the stack to allow resumption of the 
interrupted program. Next, an 8-bit vector is supplied 
to the 386 SX Microprocessor which identifies the
appropriate entry in the interrupt table. The table
contains the starting address of the interrupt service
routine. Then, the user supplied interrupt service 
routine is executed. Finally, when an IRET instruc- 
tion is executed the old processor state is restored 
and program execution resumes at the appropriate 
instruction. 


The 8-bit interrupt vector is supplied to the 386 SX 
Microprocessor in several different ways: exceptions 
supply the interrupt vector internally; software INT 
instructions contain or imply the vector; maskable 
hardware interrupts supply the 8-bit vector via the 
interrupt acknowledge bus sequence. Non-Maskable
hardware interrupts are assigned to interrupt
vector 2.


Maskable Interrupt 


Maskable interrupts are the most common way to 
respond to asynchronous external hardware events. 
A hardware interrupt occurs when the INTR input is pulled
HIGH and the Interrupt Flag bit (IF) is enabled. The 
processor only responds to interrupts between in- 
structions (string instructions have an ‘interrupt win- 
dow’ between memory moves which allows inter- 
rupts during long string moves). When an interrupt 
occurs the processor reads an 8-bit vector supplied 
by the hardware which identifies the source of the 
interrupt (one of 224 user defined interrupts). 




Interrupts through interrupt gates automatically reset 
IF, disabling INTR requests. Interrupts through Trap 
Gates leave the state of the IF bit unchanged. Inter-
rupts through a Task Gate change the IF bit accord-
ing to the image of the EFLAGS register in the task’s
Task State Segment (TSS). When an IRET instruc- 
tion is executed, the original state of the IF bit is 
restored. 


Non-Maskable Interrupt 


Non-maskable interrupts provide a method of servic- 
ing very high priority interrupts. When the NMI input 
is pulled HIGH it causes an interrupt with an internal-
ly supplied vector value of 2. Unlike a normal hard-
ware interrupt, no interrupt acknowledgment se-
quence is performed for an NMI.


While executing the NMI servicing procedure, the 
386 SX Microprocessor will not service any further 
NMI or INTR requests until an interrupt return
(IRET) instruction is executed or the processor is 
reset. If NMI occurs while currently servicing an NMI, 
its presence will be saved for servicing after execut- 
ing the first IRET instruction. The IF bit is cleared at 
the beginning of an NMI interrupt to inhibit further 
INTR interrupts. 


Software Interrupts 


A third type of interrupt/exception for the 386 SX 
Microprocessor is the software interrupt. An INT n 
instruction causes the processor to execute the in- 
terrupt service routine pointed to by the nth vector in 
the interrupt table. 


A special case of the two byte software interrupt INT
n is the one byte INT 3, or breakpoint interrupt. By
inserting this one byte instruction in a program, the 
user can set breakpoints in his program as a debug- 
ging tool. 


A final type of software interrupt is the single step 
interrupt. It is discussed in Single Step Trap. 




INTERRUPT AND EXCEPTION PRIORITIES


Interrupts are externally generated events. Maska-
ble interrupts (on the INTR input) and Non-Maskable
Interrupts (on the NMI input) are recognized at in-
struction boundaries. When NMI and maskable 
INTR are both recognized at the same instruction 
boundary, the 386 SX Microprocessor invokes the 
NMI service routine first. If maskable interrupts are 
still enabled after the NMI service routine has been 
invoked, then the 386 SX Microprocessor will invoke 
the appropriate interrupt service routine. 


As the 386 SX Microprocessor executes instruc- 
tions, it follows a consistent cycle in checking for 
exceptions, as shown in Table 2.6. This cycle is re-
peated as each instruction is executed, and occurs
in parallel with instruction decoding and execution. 


INSTRUCTION RESTART 


The 386 SX Microprocessor fully supports restarting 
all instructions after Faults. If an exception is detect- 
ed in the instruction to be executed (exception cate- 
gories 4 through 10 in Table 2.6), the 386 SX Micro- 
processor invokes the appropriate exception service 
routine. The 386 SX Microprocessor is in a state that 
permits restart of the instruction, for all cases but 
those given in Table 2.7. Note that all such cases 
will be avoided by a properly designed operating 
system. 


Table 2.6. Sequence of Exception Checking

Consider the case of the 386™ SX Microprocessor having just completed an instruction. It then performs
the following checks before reaching the point where the next instruction is completed:


1. Check for Exception 1 Traps from the instruction just completed (single-step via Trap Flag, or Data
Breakpoints set in the Debug Registers).


2. Check for external NMI and INTR.


3. Check for Exception 1 Faults in the next instruction (Instruction Execution Breakpoint set in the Debug
Registers for the next instruction).


4. Check for Segmentation Faults that prevented fetching the entire next instruction (exceptions 11 or 13).


5. Check for Page Faults that prevented fetching the entire next instruction (exception 14).


6. Check for Faults decoding the next instruction (exception 6 if illegal opcode; exception 6 if in Real Mode
or in Virtual 8086 Mode and attempting to execute an instruction for Protected Mode only; or exception
13 if instruction is longer than 15 bytes, or privilege violation in Protected Mode (i.e. not at IOPL or at
CPL=0)).


7. If WAIT opcode, check if TS=1 and MP=1 (exception 7 if both are 1).


8. If ESCape opcode for numeric coprocessor, check if EM=1 or TS=1 (exception 7 if either are 1).


9. If WAIT opcode or ESCape opcode for numeric coprocessor, check ERROR# input signal (exception 16
if ERROR# input is asserted).


10. Check in the following order for each memory reference required by the instruction:


a. Check for Segmentation Faults that prevent transferring the entire memory quantity (exceptions 11,
12, 13).


b. Check for Page Faults that prevent transferring the entire memory quantity (exception 14).


NOTE:
Segmentation exceptions are generated before paging exceptions.


Table 2.7. Conditions Preventing Instruction Restart 


1. An instruction causes a task switch to a task whose Task State Segment is partially ‘not present’ (An 
entirely ‘not present‘ TSS is restartable). Partially present TSS’s can be avoided either by keeping the 
TSS’s of such tasks present in memory, or by aligning TSS segments to reside entirely within a single 4K 
page (for TSS segments of 4K bytes or less). 


2. A coprocessor operand wraps around the top of a 64K-byte segment or a 4G-byte segment, and spans 
three pages, and the page holding the middle portion of the operand is ‘not present‘. This condition can 
be avoided by starting at a page boundary any segments containing coprocessor operands if the 
segments are approximately 64K-200 bytes or larger (i.e. large enough for wraparound of the coproces- 
sor operand to possibly occur). 


Note that these conditions are avoided by using the operating system designs mentioned in this table.


Table 2.8. Register Values after Reset

Flag Word (EFLAGS)            uuuu0002H
Machine Status Word (CR0)     uuuuuu10H
Instruction Pointer (EIP)     0000FFF0H
Code Segment (CS)             F000H
Data Segment (DS)             0000H
Stack Segment (SS)            0000H
Extra Segment (ES)            0000H
Extra Segment (FS)            0000H
Extra Segment (GS)            0000H
EAX register                  0000H
EDX register                  component and stepping ID
All other registers           undefined


NOTES:
1. EFLAGS Register. The upper 14 bits of the EFLAGS register are undefined; all defined flag bits are zero.
2. The Code Segment Register (CS) will have its Base Address set to 0FFFF0000H and Limit set to 0FFFFH.
3. The Data and Extra Segment Registers (DS, ES) will have their Base Address set to 000000000H and Limit set to
0FFFFH.
4. If self-test is selected, the EAX register should contain a 0 value. If a value of 0 is not found then the self-test has
detected a flaw in the part.
5. EDX register always holds component and stepping identifier.
6. All undefined bits are Intel Reserved and should not be used.


DOUBLE FAULT 


A Double Fault (exception 8) results when the proc-
essor attempts to invoke an exception service rou-
tine for the segment exceptions (10, 11, 12 or 13),
but in the process of doing so detects an exception 
other than a Page Fault (exception 14). 


One other cause of generating a Double Fault is the 
386 SX Microprocessor detecting any other excep- 
tion when it is attempting to invoke the Page Fault 
(exception 14) service routine (for example, if a Page 
Fault is detected when the 386 SX Microprocessor 
attempts to invoke the Page Fault service routine). 
Of course, in any functional system, not only in 386 
SX Microprocessor-based systems, the entire page 
fault service routine must remain ‘present’ in memo-
ry.
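
The two Double Fault conditions described above can be expressed as a small predicate. This Python sketch models only the cases named in the text; the function name is hypothetical, and combinations the text does not mention are simply reported as not causing a Double Fault, which is a simplifying assumption rather than a statement about the processor.

```python
SEGMENT_EXCEPTIONS = {10, 11, 12, 13}
PAGE_FAULT = 14


def is_double_fault(first, second):
    """Decide whether a second exception escalates to a Double Fault.

    first  -- exception whose service routine the processor is trying to invoke
    second -- exception detected during that attempt
    """
    if first in SEGMENT_EXCEPTIONS:
        # Any exception other than a Page Fault escalates.
        return second != PAGE_FAULT
    if first == PAGE_FAULT:
        # Any exception, including another Page Fault, escalates.
        return True
    # Cases not covered by the text (assumption: no Double Fault).
    return False
```

The asymmetry is visible in the code: a Page Fault while invoking a segment exception handler is serviced normally, whereas a Page Fault while invoking the Page Fault handler is not.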


2.8 Reset and Initialization 


When the processor is initialized or Reset the regis- 
ters have the values shown in Table 2.8. The 386 SX 
Microprocessor will then start executing instructions 
near the top of physical memory, at location 
0FFFFF0H. When the first Intersegment Jump or
Call is executed, address lines A20-A23 will drop
LOW for CS-relative memory cycles, and the 386 SX
Microprocessor will only execute instructions in the
lower one megabyte of physical memory. This al-
lows the system designer to use a shadow ROM at
the top of physical memory to initialize the system 
and take care of Resets. 
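
The reset address follows from the values in Table 2.8. The short Python sketch below reproduces the calculation under two assumptions that are labeled in the code: the CS base of 0FFFF0000H from note 2 of the table, and a 24-bit physical address bus, which truncates the 32-bit linear address. The names are illustrative only.

```python
CS_BASE_AFTER_RESET = 0xFFFF0000   # Table 2.8, note 2
EIP_AFTER_RESET = 0x0000FFF0       # Table 2.8
PHYSICAL_ADDRESS_MASK = 0xFFFFFF   # assumption: 24 physical address bits


def first_fetch_address():
    """Physical address of the first instruction fetched after Reset."""
    linear = (CS_BASE_AFTER_RESET + EIP_AFTER_RESET) & 0xFFFFFFFF
    return linear & PHYSICAL_ADDRESS_MASK
```

The result is 16 bytes below the top of the 16 megabyte physical address space, which is why the initialization ROM is placed there.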


RESET forces the 386 SX Microprocessor to termi- 
nate all execution and local bus activity. No instruc- 
tion execution or bus activity will occur as long as 
Reset is active. Between 350 and 450 CLK2 periods 
after Reset becomes inactive, the 386 SX Micro-
processor will start executing instructions at the top
of physical memory.


2.9 Testability 


The 386 SX Microprocessor, like the 386 Microproc- 
essor, offers testability features which include a self- 
test and direct access to the page translation cache. 


SELF-TEST 


The 386 SX Microprocessor has the capability to 
perform a self-test. The self-test checks the function 
of all of the Control ROM and most of the non-ran- 
dom logic of the part. Approximately one-half of the 
386 SX Microprocessor can be tested during self-
test.


Self-Test is initiated on the 386 SX Microprocessor 
when the RESET pin transitions from HIGH to LOW, 
and the BUSY# pin is LOW. The self-test takes
about 2^20 clocks, or approximately 33 milliseconds
with a 16 MHz 386 SX CPU. At the completion of
self-test the processor performs reset and begins
normal operation. The part has successfully passed
self-test if the contents of the EAX register are zero. If the
contents of EAX are not zero then the self-test has
detected a flaw in the part.


Figure 2.6. Test Registers


TLB TESTING 


The 386 SX Microprocessor also provides a mecha-
nism for testing the Translation Lookaside Buffer
(TLB) if desired. This particular mechanism may not 
be continued in the same way in future processors. 


There are two TLB testing operations: 1) writing en- 
tries into the TLB, and, 2) performing TLB lookups. 
Two Test Registers, shown in Figure 2.6, are provid- 
ed for the purpose of testing. TR6 is the ‘test com- 
mand register’, and TR7 is the “test data register”. 
For a more detailed explanation of testing the TLB, 
see the 386™ SX Microprocessor Programmer’s 
Reference Manual. 


2.10 Debugging Support 


The 386 SX Microprocessor provides several fea- 
tures which simplify the debugging process. The 
three categories of on-chip debugging aids are: 


1. The code execution breakpoint opcode (0CCH).


2. The single-step capability provided by the TF bit
in the flag register.


3. The code and data breakpoint capability provided
by the Debug Registers DR0-3, DR6, and DR7.


BREAKPOINT INSTRUCTION


A single-byte software interrupt (INT 3) breakpoint in-
struction is available for use by software debuggers.
The breakpoint opcode is 0CCH, and generates an
exception 3 trap when executed.


SINGLE-STEP TRAP 


If the single-step flag (TF, bit 8) in the EFLAGS regis-
ter is found to be set at the end of an instruction, a
single-step exception occurs. The single-step ex- 
ception is auto vectored to exception number 1. 


DEBUG REGISTERS 


The Debug Registers are an advanced debugging
feature of the 386 SX Microprocessor. They allow 
data access breakpoints as well as code execution 
breakpoints. Since the breakpoints are indicated by 
on-chip registers, an instruction execution break- 
point can be placed in ROM code or in code shared 
by several tasks, neither of which can be supported 
by the INT 3 breakpoint opcode. 


The 386 SX Microprocessor contains six Debug 
Registers, consisting of four breakpoint address reg- 
isters and two breakpoint control registers. Initially 
after reset, breakpoints are in the disabled state; 
therefore, no breakpoints will occur unless the de- 
bug registers are programmed. Breakpoints set up in 
the Debug Registers are auto-vectored to exception 
1. Figure 2.7 shows the breakpoint status and con-
trol registers.


Figure 2.7. Debug Registers


3.0 REAL MODE ARCHITECTURE 


When the processor is reset or powered up it is ini- 
tialized in Real Mode. Real Mode has the same base 
architecture as the 8086, but allows access to the 
32-bit register set of the 386 SX Microprocessor. 
The addressing mechanism, memory size, and inter- 
rupt handling are all identical to the Real Mode on 
the 80286.


The default operand size in Real Mode is 16 bits, as 
in the 8086. In order to use the 32-bit registers and 
addressing modes, override prefixes must be used. 
In addition, the segment size on the 386 SX Micro-
processor in Real Mode is 64K bytes so 32-bit ad-
dresses must have a value no greater than 0000FFFFH.
The primary purpose of Real Mode is to set up the 
processor for Protected Mode operation. 


3.1 Memory Addressing 


In Real Mode the linear addresses are the same as 
physical addresses (paging is not allowed). Physical 
addresses are formed in Real Mode by adding the 
contents of the appropriate segment register which 
is shifted left by four bits to an effective address. 
This addition results in a 20-bit physical address or a 
1 megabyte address space. Since segment registers 
are shifted left by 4 bits, Real Mode segments al-
ways start on 16-byte boundaries.


All segments in Real Mode are exactly 64K bytes 
long, and may be read, written, or executed. The 
386 SX Microprocessor will generate an exception 
13 if a data operand or instruction fetch occurs past 
the end of a segment. 
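
The Real Mode address calculation and the 64K segment limit can be sketched in a few lines of Python. The function name and the use of a Python exception to stand in for the processor exception are illustrative assumptions; the sketch reports exception 13, although Table 3.1 assigns exception 12 to stack segment overruns.

```python
SEGMENT_LIMIT = 0xFFFF  # every Real Mode segment is exactly 64K bytes long


def real_mode_physical_address(segment, offset, size=1):
    """Form a Real Mode physical address for an access of `size` bytes.

    Raises ValueError where the processor would report a segment overrun
    (exception 13 for CS, DS, ES, FS, GS; stack references use exception 12).
    """
    if offset < 0 or offset + size - 1 > SEGMENT_LIMIT:
        raise ValueError("exception 13: access past the end of the segment")
    # Segment register shifted left by four bits, plus the effective address.
    return (segment << 4) + offset
```

For example, segment 1234H with offset 0010H yields physical address 12350H, and a word access at offset 0FFFFH is rejected because its second byte would lie beyond the segment.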

Table 3.1. Exceptions in Real Mode

Function                  Interrupt   Related                        Return Address
                          Number      Instructions                   Location

Interrupt table limit     8           INT vector is not              Before
too small                             within table limit             Instruction

CS, DS, ES, FS, GS        13          Word memory reference          Before
Segment overrun                       with offset = 0FFFFH;          Instruction
exception                             an attempt to execute
                                      past the end of CS segment

SS Segment overrun        12          Stack Reference                Before
exception                             beyond offset = 0FFFFH         Instruction


3.2 Reserved Locations 


There are two fixed areas in memory which are re- 
served in Real address mode: the system initializa- 
tion area and the interrupt table area. Locations 
00000H through 003FFH are reserved for interrupt
vectors. Each one of the 256 possible interrupts has
a 4-byte jump vector reserved for it. Locations
0FFFFF0H through 0FFFFFFH are reserved for sys-
tem initialization.


3.3 Interrupts


Many of the exceptions discussed in section 2.7 are
not applicable to Real Mode operation; in particular,
exceptions 10, 11 and 14 do not occur in Real
Mode. Other exceptions have slightly different
meanings in Real Mode; Table 3.1 identifies these
exceptions.


3.4 Shutdown and Halt 


The HLT instruction stops program execution and 
prevents the processor from using the local bus until 
restarted. Either NMI, FLT#, INTR with interrupts 
enabled (IF = 1), or RESET will force the 386 SX Mi- 
croprocessor out of halt. If interrupted, the saved 
CS:IP will point to the next instruction after the HLT. 


Shutdown will occur when a severe error is detected 
that prevents further processing. In Real Mode, 
shutdown can occur under two conditions: 


1. An interrupt or an exception occurs (Exceptions 8
or 13) and the interrupt vector is larger than the
Interrupt Descriptor Table.


2. A CALL, INT or PUSH instruction attempts to
wrap around the stack segment when SP is not
even.


An NMI input can bring the processor out of shut-
down if the Interrupt Descriptor Table limit is large
enough to contain the NMI interrupt vector (at least
000FH) and the stack has enough room to contain
the vector and flag information (i.e. SP is greater than
0005H). Otherwise, shutdown can only be exited by
a processor reset.


3.5 LOCK operation 


The LOCK prefix on the 386 SX Microprocessor,
even in Real Mode, is more restrictive than on the 
80286. This is due to the addition of paging on the 
386 SX Microprocessor in Protected Mode and Vir- 
tual 8086 Mode. The LOCK prefix is not supported 
during repeat string instructions. 


The only instruction forms where the LOCK prefix is 
legal on the 386 SX Microprocessor are shown in 
Table 3.2.


Table 3.2. Legal Instructions for the LOCK Prefix

Opcode                                 Operands

BIT Test and SET/RESET/COMPLEMENT      Mem, Reg/Immediate
AND, SUB, XOR                          Mem, Reg/Immediate
NOT, NEG, INC, DEC                     Mem


An exception 6 will be generated if a LOCK prefix is
placed before any instruction form or opcode not
listed above. The LOCK prefix allows indivisible
read/modify/write operations on memory operands
using the instructions above.


The LOCK prefix is not IOPL-sensitive on the 386 SX 
Microprocessor. The LOCK prefix can be used at 
any privilege level, but only on the instruction forms 
listed in Table 3.2.


4.0 PROTECTED MODE ARCHITECTURE


The complete capabilities of the 386 SX Microproc- 
essor are unlocked when the processor operates in 
Protected Virtual Address Mode (Protected Mode). 
Protected Mode vastly increases the linear address 
space to four gigabytes (2^32 bytes) and allows the
running of virtual memory programs of almost unlimited
size (64 terabytes (2^46 bytes)). In addition, Protected
Mode allows the 386 SX Microprocessor to
run all existing 386 DX CPU (using only 16
megabytes of physical memory), 80286 and 8086
CPU software, while providing sophisticated
memory management and a hardware-assisted protection
mechanism. Protected Mode allows the use
of additional instructions specially optimized for sup- 
porting multitasking operating systems. The base ar- 
chitecture of the 386 SX Microprocessor remains 
the same; the registers, instructions, and addressing 
modes described in the previous sections are re- 
tained. The main difference between Protected 
Mode and Real Mode from a programmer’s view- 
point is the increased address space and a different 
addressing mechanism. 


4.1 Addressing Mechanism 


Like Real Mode, Protected Mode uses two compo-
nents to form the logical address: a 16-bit selector is
used to determine the linear base address of a seg-
ment, and the base address is added to a 32-bit effective
address to form a 32-bit linear address. The linear
address is then either used as a 24-bit physical ad-
dress, or if paging is enabled the paging mechanism
maps the 32-bit linear address into a 24-bit physical
address.


The difference between the two modes lies in calcu- 
lating the base address. In Protected Mode, the se- 
lector is used to specify an index into an operating 
system defined table (see Figure 4.1). The table 
contains the 32-bit base address of a given seg-
ment. The linear address is formed by adding the
base address obtained from the table to the offset. 


Paging provides an additional memory management 
mechanism which operates only in Protected Mode. 
Paging provides a means of managing the very large 
segments of the 386 SX Microprocessor, as paging 
operates beneath segmentation. The page mecha- 
nism translates the protected linear address which 
comes from the segmentation unit into a physical 
address. Figure 4.2 shows the complete 386 SX Mi- 
croprocessor addressing mechanism with paging 
enabled.


4.2 Segmentation


Segmentation is one method of memory manage- 
ment. Segmentation provides the basis for protec- 
tion. Segments are used to encapsulate regions of 
memory which have common attributes. For exam- 
ple, all of the code of a given program could be con- 
tained in a segment, or an operating system table 
may reside in a segment. All information about each 
segment is stored in an 8 byte data structure called 
a descriptor. All of the descriptors in a system are 
contained in descriptor tables which are recognized 
by hardware. 


TERMINOLOGY 


The following terms are used throughout the discus- 
sion of descriptors, privilege levels and protection: 


PL: Privilege Level—One of the four hierarchical 
privilege levels. Level 0 is the most privileged 
level and level 3 is the least privileged. 


RPL: Requestor Privilege Level—The privilege level 
of the original supplier of the selector. RPL is 
determined by the two least significant bits of
a selector. 


DPL: Descriptor Privilege Level—This is the least 
privileged level at which a task may access 
that descriptor (and the segment associated 
with that descriptor). Descriptor Privilege Lev- 
el is determined by bits 6:5 in the Access 
Right Byte of a descriptor. 


CPL: Current Privilege Level—The privilege level at 
which a task is currently executing, which 
equals the privilege level of the code segment 
being executed. CPL can also be determined 
by examining the lowest 2 bits of the CS regis- 
ter, except for conforming code segments. 


EPL: Effective Privilege Level—The effective privi- 
lege level is the least privileged of the RPL 
and the DPL. EPL is the numerical maximum 
of RPL and DPL. 


Task: One instance of the execution of a program. 
Tasks are also referred to as processes. 
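
The RPL and EPL definitions above reduce to simple bit and comparison operations. The following Python sketch shows them; the function names are hypothetical and are not part of any Intel software.

```python
def selector_rpl(selector):
    """Requestor Privilege Level: the two least significant bits of a selector."""
    return selector & 0x3


def effective_privilege_level(rpl, dpl):
    """Effective Privilege Level: the least privileged of RPL and DPL.

    Because level 0 is the most privileged and level 3 the least,
    the least privileged level is the numerical maximum.
    """
    return max(rpl, dpl)
```

For example, a selector supplied with RPL 3 for a segment whose descriptor has DPL 0 is treated at level 3, the weaker of the two.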


DESCRIPTOR TABLES 


The descriptor tables define all of the segments 
which are used in a 386 SX Microprocessor system. 
There are three types of tables which hold descrip- 
tors: the Global Descriptor Table, Local Descriptor 
Table, and the Interrupt Descriptor Table. All of the 
tables are variable length memory arrays and can 
vary in size from 8 bytes to 64K bytes. Each table
can hold up to 8192 8-byte descriptors. The upper
13 bits of a selector are used as an index into the
descriptor table. The tables have registers associat- 
ed with them which hold the 32-bit linear base ad- 
dress and the 16-bit limit of each table. 
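
How a selector locates a descriptor can be sketched as follows. This Python example is illustrative: the function name is hypothetical, and it assumes the 16-bit limit holds the offset of the last valid byte in the table (table size minus one), which is the usual 386 convention but is not spelled out in this section.

```python
DESCRIPTOR_SIZE = 8  # each descriptor is 8 bytes


def descriptor_address(table_base, table_limit, selector):
    """Linear address of the descriptor named by a selector.

    table_base  -- 32-bit linear base address from the table register
    table_limit -- 16-bit limit (assumed: offset of the last valid byte)
    selector    -- 16-bit selector; its upper 13 bits are the table index
    """
    index = selector >> 3
    offset = index * DESCRIPTOR_SIZE
    if offset + DESCRIPTOR_SIZE - 1 > table_limit:
        raise ValueError("selector index is outside the descriptor table limit")
    return table_base + offset
```

A 13-bit index gives 8192 possible entries, and 8192 entries of 8 bytes each account for the 64K byte maximum table size.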

Figure 4.3. Descriptor Table Registers


Each of the tables has a register associated with it:
GDTR, LDTR, and IDTR; see Figure 2.1. The LGDT, 
LLDT, and LIDT instructions load the base and limit 
of the Global, Local, and Interrupt Descriptor Tables 
into the appropriate register. The SGDT, SLDT, and 
SIDT store the base and limit values. These are priv- 
ileged instructions. 


Global Descriptor Table 


The Global Descriptor Table (GDT) contains de- 
scriptors which are available to all of the tasks in a 
system. The GDT can contain any type of segment 
descriptor except for interrupt and trap descriptors. 
Every 386 SX CPU system contains a GDT. 


The first slot of the Global Descriptor Table corre- 
sponds to the null selector and is not used. The null 
selector defines a null pointer value. 


Local Descriptor Table 


LDTs contain descriptors which are associated with 
a given task. Generally, operating systems are de- 
signed so that each task has a separate LDT. The 
LDT may contain only code, data, stack, task gate, 
and call gate descriptors. LDTs provide a mecha- 
nism for isolating a given task’s code and data seg- 
ments from the rest of the operating system, while 
the GDT contains descriptors for segments which 
are common to all tasks. A segment cannot be ac- 
cessed by a task if its segment descriptor does not 
exist in either the current LDT or the GDT. This pro- 
vides both isolation and protection for a task’s seg- 
ments while still allowing global data to be shared 
among tasks. 


SEGMENT BASE 15. 


Base Address of the segment 

The length of the segment 

Present Bit 1=Present O=Not Present 
Descriptor Privilege Level 0-3 © 
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Unlike the 6-byte GDT or IDT registers which contain 
a base address and limit, the visible portion of the 
LDT register contains only a 16-bit selector. This se- 
lector refers to a Local Descriptor Table descriptor in 
the GDT (see figure 2.1). 


Interrupt Descriptor Table 


The third table needed for 386 SX Microprocessor
systems is the Interrupt Descriptor Table. The IDT
contains the descriptors which point to the locations
of up to 256 interrupt service routines. The IDT
may contain only task gates, interrupt gates, and
trap gates. The IDT should be at least 256 bytes in
size in order to hold the descriptors for the 32 Intel
Reserved Interrupts (32 descriptors of 8 bytes each).
Every interrupt used by a system must have an entry
in the IDT. The IDT entries are referenced by INT
instructions, external interrupt vectors, and exceptions.
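The IDT sizing above follows from the 8-byte descriptor size. A minimal sketch of the arithmetic, in Python for illustration only; the function names are hypothetical and not part of any Intel tool:

```python
DESCRIPTOR_SIZE = 8  # bytes per descriptor (descriptors are eight-byte quantities)

def idt_bytes(vector_count):
    """Bytes needed for an IDT that covers vectors 0..vector_count-1."""
    return vector_count * DESCRIPTOR_SIZE

def idt_entry_offset(vector):
    """Byte offset of the gate descriptor for a given interrupt vector."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be in the range 0..255")
    return vector * DESCRIPTOR_SIZE

print(idt_bytes(32))   # minimum table: the 32 Intel Reserved Interrupts
print(idt_bytes(256))  # full table: all 256 vectors
```

A system that uses vectors beyond the first 32 needs a correspondingly larger table, since every interrupt in use must have an entry.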


DESCRIPTORS 


The object to which the segment selector points
is called a descriptor. Descriptors are eight-byte
quantities which contain attributes about a given
region of linear address space. These attributes include
the 32-bit base linear address of the segment,
the 20-bit length and granularity of the segment, the
protection level, read, write or execute privileges,
the default size of the operands (16-bit or 32-bit),
and the type of segment. All of the attribute information
about a segment is contained in 12 bits in the
segment descriptor. Figure 4.4 shows the general
format of a descriptor. All segments on the 386 SX
Microprocessor have three attribute fields in common:
the P bit, the DPL field, and the S bit. The P
(Present) bit is 1 if the segment is loaded in physical
memory. If P=0 then any attempt to access this
segment causes a not-present exception (exception
11). The Descriptor Privilege Level, DPL, is a two-bit
field which specifies the protection level, 0-3,
associated with a segment.


[Figure 4.4 residue: byte address 0 holds SEGMENT LIMIT 15...0; remainder of the legend follows]
S: Segment Descriptor (0 = System Descriptor, 1 = Code or Data Segment Descriptor)
TYPE: Type of segment
A: Accessed bit
G: Granularity bit (1 = segment length is page granular, 0 = segment length is byte granular)
D: Default operation size, recognized in code segment descriptors only (1 = 32-bit segment, 0 = 16-bit segment)
0: Bit must be zero (0) for compatibility with future processors
AVL: Available field for user or OS


Figure 4.4. Segment Descriptors


The 386 SX Microprocessor has two main categories
of segments: system segments and non-system
segments (for code and data). The segment bit, S,
determines if a given segment is a system segment
or a code or data segment. If the S bit is 1 then the
segment is either a code or data segment; if it is 0
then the segment is a system segment.


Code and Data Descriptors (S=1)


Figure 4.5 shows the general format of a code and
data descriptor and Table 4.1 illustrates how the bits
in the Access Rights Byte are interpreted.


[Figure 4.5 residue: fields SEGMENT BASE 15...0, BASE 31...24, ACCESS RIGHTS BYTE]
D/B: 1 = default instruction attributes are 32 bits, 0 = default instruction attributes are 16 bits
AVL: Available field for user or OS
G: Granularity bit (1 = segment length is page granular, 0 = segment length is byte granular)
0: Bit must be zero (0) for compatibility with future processors


Figure 4.5. Code and Data Descriptors


Table 4.1. Access Rights Byte Definition for Code and Data Descriptors

Bit Position | Name | Function
7 | Present (P) | P = 1: Segment is mapped into physical memory.
  |             | P = 0: No mapping to physical memory exists, base and limit are not used.
6-5 | Descriptor Privilege Level (DPL) | Segment privilege attribute used in privilege tests.
4 | Segment Descriptor (S) | S = 1: Code or Data (includes stacks) segment descriptor.
  |                        | S = 0: System Segment Descriptor or Gate Descriptor.

If Data Segment (S = 1, E = 0):
3 | Executable (E) | E = 0: Descriptor type is data segment.
2 | Expansion Direction (ED) | ED = 0: Expand up segment, offsets must be ≤ limit.
  |                          | ED = 1: Expand down segment, offsets must be > limit.
1 | Writeable (W) | W = 0: Data segment may not be written into.
  |               | W = 1: Data segment may be written into.

If Code Segment (S = 1, E = 1):
3 | Executable (E) | E = 1: Descriptor type is code segment.
2 | Conforming (C) | C = 1: Code segment may only be executed when CPL ≥ DPL and CPL remains unchanged.
1 | Readable (R) | R = 0: Code segment may not be read.
  |              | R = 1: Code segment may be read.

0 | Accessed (A) | A = 0: Segment has not been accessed.
  |              | A = 1: Segment selector has been loaded into segment register or used by selector test instructions.


[Figure 4.6 residue: fields SEGMENT BASE 15...0, SEGMENT LIMIT 15...0]

Figure 4.6. System Descriptors


Code and data segments have several descriptor 
fields in common. The accessed bit, A, is set when- 
ever the processor accesses a descriptor. The gran- 
ularity bit, G, specifies if a segment length is byte- 
granular or page-granular. 
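The descriptor fields discussed above can be unpacked from the eight descriptor bytes. The following Python sketch is illustrative only: it assumes the usual 386 byte layout (limit 15...0, base 15...0, base 23...16, access rights byte, then G/D/0/AVL with limit 19...16, then base 31...24), and the `byte_limit` scaling for page-granular segments (4K units) is likewise an assumption not spelled out in this section. The function name is hypothetical.

```python
import struct

def decode_descriptor(raw):
    """Split an 8-byte code/data segment descriptor into its fields."""
    if len(raw) != 8:
        raise ValueError("a descriptor is exactly 8 bytes")
    limit_lo, base_lo, base_mid, access, flags, base_hi = struct.unpack("<HHBBBB", raw)
    limit = limit_lo | ((flags & 0x0F) << 16)           # 20-bit limit
    base = base_lo | (base_mid << 16) | (base_hi << 24)  # 32-bit base
    granular = bool(flags & 0x80)                        # G bit
    # Assumed interpretation: a page-granular limit counts 4K-byte units.
    byte_limit = ((limit << 12) | 0xFFF) if granular else limit
    return {
        "base": base,
        "limit": limit,
        "byte_limit": byte_limit,
        "present": bool(access & 0x80),  # P
        "dpl": (access >> 5) & 0x3,      # DPL
        "s": (access >> 4) & 0x1,        # S: 1 = code/data, 0 = system
        "type": (access >> 1) & 0x7,     # E plus ED/W or C/R
        "accessed": bool(access & 0x1),  # A
        "g": granular,
        "d": bool(flags & 0x40),         # default operand size
        "avl": (flags >> 4) & 0x1,
    }

# Example: a present, DPL 0, readable code segment with base 0 and a
# page-granular limit of FFFFFH.
fields = decode_descriptor(bytes.fromhex("ffff0000009acf00"))
print(hex(fields["base"]), hex(fields["limit"]), fields["dpl"])
```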


System Descriptor Formats (S=0)


System segments describe information about operating
system tables, tasks, and gates. Figure 4.6
shows the general format of system segment descriptors,
and the various types of system segments.
386 SX system descriptors (which are the same as
386 DX CPU system descriptors) contain a 32-bit
base linear address and a 20-bit segment limit.
80286 system descriptors have a 24-bit base address
and a 16-bit segment limit. 80286 system descriptors
are identified by the upper 16 bits being all
zero.


Differences Between 386™ SX Microprocessor
and 80286 Descriptors


In order to provide operating system compatibility
with the 80286, the 386 SX CPU supports all of the
80286 segment descriptors. The 80286 system segment
descriptors contain a 24-bit base address and
16-bit limit, while the 386 SX CPU system segment
descriptors have a 32-bit base address, a 20-bit limit
field, and a granularity bit. The word count field
specifies the number of 16-bit quantities to copy for
80286 call gates and 32-bit quantities for 386 SX
CPU call gates.


Selector Fields


A selector in Protected Mode has three fields: Local
or Global Descriptor Table indicator (TI), Descriptor
Entry Index (Index), and Requestor (the selector’s)
Privilege Level (RPL) as shown in Figure 4.7. The TI
bit selects either the Global Descriptor Table or the
Local Descriptor Table. The Index selects one of 8k
descriptors in the appropriate descriptor table. The
RPL bits allow high speed testing of the selector’s
privilege attributes.
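The three selector fields can be separated with shifts and masks. This Python sketch assumes the usual 386 selector layout (Index in bits 15...3, TI in bit 2, RPL in bits 1...0); the function names are hypothetical.

```python
def decode_selector(selector):
    """Split a 16-bit Protected Mode selector into Index, TI and RPL."""
    if not 0 <= selector <= 0xFFFF:
        raise ValueError("a selector is a 16-bit value")
    return {
        "index": selector >> 3,      # one of 8K descriptor slots
        "ti": (selector >> 2) & 1,   # 0 = GDT, 1 = LDT
        "rpl": selector & 0x3,       # requestor privilege level
    }

def descriptor_offset(selector):
    """Byte offset of the selected descriptor inside its table (8 bytes each)."""
    return (selector >> 3) * 8

print(decode_selector(0x001B))    # index 3 in the GDT, RPL 3
print(descriptor_offset(0x001B))  # 24
```

Because each descriptor is 8 bytes, masking off the low three bits of the selector yields the same offset as `descriptor_offset`.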


Segment Descriptor Cache 


In addition to the selector value, every segment reg- 
ister has a segment descriptor cache register asso- 
ciated with it. Whenever a segment register’s con- 
tents are changed, the 8-byte descriptor associated 
with that selector is automatically loaded (cached) 
on the chip. Once loaded, all references to that segment
use the cached descriptor information instead
of reaccessing the descriptor. The contents of the
descriptor cache are not visible to the programmer.
Since descriptor caches only change when a segment
register is changed, programs which modify
the descriptor tables must reload the appropriate
segment registers after changing a descriptor’s value.


Figure 4.7. Example Descriptor Selection


4.3 Protection


The 386 SX Microprocessor has four levels of protection
which are optimized to support a multi-tasking
operating system and to isolate and protect user
programs from each other and the operating system.
The privilege levels control the use of privileged
instructions, I/O instructions, and access to segments
and segment descriptors. The 386 SX Microprocessor
also offers an additional type of protection on a
page basis when paging is enabled.


The four-level hierarchical privilege system is an ex- 
tension of the user/supervisor privilege mode com- 
monly used by minicomputers. The user/supervisor 
mode is fully supported by the 386 SX Microproces- 
sor paging mechanism. The privilege levels (PL) are 
numbered 0 through 3. Level 0 is the most privileged 
level. 


RULES OF PRIVILEGE 


The 386 SX Microprocessor controls access to both 
data and procedures between levels of a task, ac- 
cording to the following rules. 


- Data stored in a segment with privilege level p
  can be accessed only by code executing at a
  privilege level at least as privileged as p.

- A code segment/procedure with privilege level p
  can only be called by a task executing at the
  same or a lesser privilege level than p.
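The two rules can be written as numeric comparisons, remembering that level 0 is the most privileged, so "at least as privileged" means numerically less than or equal. A minimal Python sketch with hypothetical function names; it models only these two rules and ignores gates, conforming segments and RPL:

```python
def can_access_data(cpl, data_pl):
    """Rule 1: data at level p is accessible only to code at least as privileged as p."""
    return cpl <= data_pl

def can_call_procedure(cpl, code_pl):
    """Rule 2: a procedure at level p is callable only from the same or a lesser privilege level."""
    return cpl >= code_pl

print(can_access_data(0, 3))     # operating system code reading application data
print(can_access_data(3, 0))     # application code reading operating system data
print(can_call_procedure(3, 1))  # application calling a more privileged routine
```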


PRIVILEGE LEVELS 


At any point in time, a task on the 386 SX Microproc- 
essor always executes at one of the four privilege 
levels. The Current Privilege Level (CPL) specifies 
what the task’s privilege level is. A task’s CPL may 
only be changed by control transfers through gate 
descriptors to a code segment with a different privi- 
lege level. Thus, an application program running at 
PL=3 may call an operating system routine at
PL=1 (via a gate) which would cause the task’s CPL
to be set to 1 until the operating system routine was 
finished. 


Selector Privilege (RPL)

The privilege level of a selector is specified by the
RPL field. The selector’s RPL is only used to establish
a less trusted privilege level than the current
privilege level of the task for the use of a segment.
This level is called the task’s effective privilege level
(EPL). The EPL is defined as being the least privileged
(numerically larger) level of a task’s CPL and a
selector’s RPL. The RPL is most commonly used to
verify that pointers passed to an operating system
procedure do not access data that is of higher privilege
than the procedure that originated the pointer.
Since the originator of a selector can specify any
RPL value, the Adjust RPL (ARPL) instruction is provided
to force the RPL bits to the originator’s CPL.


Table 4.2. Descriptor Types Used for Control Transfer

Control Transfer Types | Operation Types | Descriptor Referenced | Descriptor Table
Intersegment within the same privilege level | JMP, CALL, RET, IRET* | Code Segment | GDT/LDT
Intersegment to the same or higher privilege level; interrupt within task may change CPL | CALL | Call Gate | GDT/LDT
(same transfer type) | Interrupt instruction, Exception, External Interrupt | Trap or Interrupt Gate | IDT
Intersegment to a lower privilege level (changes task CPL) | RET, IRET* | Code Segment | GDT/LDT
Task Switch | CALL, JMP | Task Gate | GDT/LDT

*NT (Nested Task bit of flag register) = 0
**NT (Nested Task bit of flag register) = 1
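The effective privilege level defined under Selector Privilege (RPL) is simply the numerically larger of CPL and RPL. The Python sketch below illustrates that definition, together with a simplified model of what ARPL achieves. The function names are hypothetical, and `adjust_rpl` only models the end result on the selector value (it never makes a selector more privileged), not the instruction's operands or flag behavior.

```python
def effective_privilege_level(cpl, rpl):
    """EPL is the least privileged (numerically larger) of CPL and RPL."""
    return max(cpl, rpl)

def adjust_rpl(selector, originator_cpl):
    """Raise the selector's RPL field to the originator's CPL if it is lower."""
    rpl = selector & 0x3
    if rpl < originator_cpl:
        selector = (selector & ~0x3) | originator_cpl
    return selector

# A level 3 caller passes a selector that claims RPL 0.
print(hex(adjust_rpl(0x0008, 3)))
print(effective_privilege_level(0, 3))
```

With the RPL adjusted this way, an operating system routine running at CPL 0 uses the pointer with the caller's privilege rather than its own.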


I/O Privilege

The I/O privilege level (IOPL) lets the operating system
code executing at CPL=0 define the least privileged
level at which I/O instructions can be used. An
exception 13 (General Protection Violation) is generated
if an I/O instruction is attempted when the CPL
of the task is less privileged than the IOPL. The
IOPL is stored in bits 12 and 13 of the EFLAGS register.
The following instructions cause an exception
13 if the CPL is greater than IOPL: IN, INS, OUT,
OUTS, STI, CLI, LOCK prefix.
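The IOPL test is a two-bit field extraction followed by a comparison. This Python sketch takes IOPL as the two-bit EFLAGS field at bit positions 12 and 13 counted from bit 0 (mask 3000H), which is an assumption stated here rather than taken from a figure in this section; the function names are hypothetical.

```python
IOPL_SHIFT = 12
IOPL_MASK = 0x3

def iopl(eflags):
    """Extract the I/O privilege level from an EFLAGS value."""
    return (eflags >> IOPL_SHIFT) & IOPL_MASK

def io_instruction_allowed(cpl, eflags):
    """True if an IOPL-sensitive instruction may run; otherwise exception 13 results."""
    return cpl <= iopl(eflags)

print(iopl(0x3000))                       # IOPL 3: any level may perform I/O
print(io_instruction_allowed(3, 0x0000))  # IOPL 0: a level 3 task is refused
```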


Descriptor Access 


There are basically two types of segment accesses:
those involving code segments such as control
transfers, and those involving data accesses. Determining
the ability of a task to access a segment involves
the type of segment to be accessed, the instruction
used, the type of descriptor used and CPL,
RPL, and DPL as described above.


Any time an instruction loads a data segment register
(DS, ES, FS, GS) the 386 SX Microprocessor
makes protection validation checks. Selectors loaded
in the DS, ES, FS, GS registers must refer only to
data segments or readable code segments.


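Table 4.2 (continued)

Control Transfer Types | Operation Types | Descriptor Referenced | Descriptor Table
Task Switch | CALL, JMP | Task State Segment | GDT
Task Switch | IRET**, Interrupt instruction, Exception, External Interrupt | Task Gate | IDT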

Finally the privilege validation checks are performed. 
The CPL is compared to the EPL and if the EPL is 
more privileged than the CPL, an exception 13 (gen- 
eral protection fault) is generated. 


The rules regarding the stack segment are slightly
different than those involving data segments.
Instructions that load selectors into SS must refer to
data segment descriptors for writeable data segments.
The DPL and RPL must equal the CPL. All
other descriptor types or a privilege level violation
will cause an exception 13. A stack not present fault
causes an exception 12.


PRIVILEGE LEVEL TRANSFERS 


Inter-segment control transfers occur when a selector
is loaded in the CS register. For a typical system
most of these transfers are simply the result of a call
or a jump to another routine. There are five types of
control transfers which are summarized in Table 4.2.
Many of these transfers result in a privilege level
transfer. Changing privilege levels is done only by
control transfers, using gates, task switches, and
interrupt or trap gates.


Control transfers can only occur if the operation
which loaded the selector references the correct
descriptor type. Any violation of these descriptor usage
rules will cause an exception 13.


Figure 4.8. 386™ SX Microprocessor TSS and TSS Registers


Figure 4.9. Sample I/O Permission Bit Map


CALL GATES 


Gates provide protected indirect CALLs. One of the 
major uses of gates is to provide a secure method of 
privilege transfers within a task. Since the operating 
system defines all of the gates in a system, it can 
ensure that all gates only allow entry into a few trust- 
ed procedures. 


TASK SWITCHING 


A very important attribute of any multi-tasking/multi- 
user operating system is its ability to rapidly switch 
between tasks or processes. The 386 SX Microproc- 
essor directly supports this operation by providing a 
task switch instruction in hardware. The task switch 
operation saves the entire state of the machine (all 
of the registers, address space, and a link to the 
previous task), loads a new execution state, per- 
forms protection checks, and commences execution 
in the new task. Like transfer of control by gates, the 
task switch operation is invoked by executing an in- 
ter-segment JMP or CALL instruction which refers to 
a Task State Segment (TSS), or a task gate descrip- 
tor in the GDT or LDT. An INT n instruction, excep- 
tion, trap, or external interrupt may also invoke the 
task switch operation if there is a task gate descrip- 
tor in the associated IDT descriptor slot. 


The TSS descriptor points to a segment (see Figure
4.8) containing the entire execution state. A task
gate descriptor contains a TSS selector. The 386 SX
Microprocessor supports both 286 and 386 SX CPU
TSSs. The limit of a 386 SX Microprocessor TSS
must be greater than 64H (2BH for a 286 TSS), and
can be as large as 16 megabytes. In the additional
TSS space, the operating system is free to store
additional information such as the reason the task is
inactive, time the task has spent running, or open
files belonging to the task.


Each task must have a TSS associated with it. The
current TSS is identified by a special register in the
386 SX Microprocessor called the Task State Segment
Register (TR). This register contains a selector
referring to the task state segment descriptor that
defines the current TSS. A hidden base and limit
register associated with the TSS descriptor are loaded
whenever TR is loaded with a new selector. Returning
from a task is accomplished by the IRET instruction.
When IRET is executed, control is returned to
the task which was interrupted. The currently executing
task’s state is saved in the TSS and the old
task state is restored from its TSS.


Several bits in the flag register and machine status
word (CR0) give information about the state of a
task which is useful to the operating system. The
Nested Task bit, NT, controls the function of the
IRET instruction. If NT=0 the IRET instruction performs
the regular return. If NT=1 IRET performs a
task switch operation back to the previous task. The
NT bit is set or reset in the following fashion:


When a CALL or INT instruction initiates a task 
switch, the new TSS will be marked busy and 
the back link field of the new TSS set to the old 
TSS selector. The NT bit of the new task is set 
by CALL or INT initiated task switches. An in- 
terrupt that does not cause a task switch will 
clear NT (The NT bit will be restored after exe- 
cution of the interrupt handler). NT may also be 
set or cleared by POPF or IRET instructions. 


The 386 SX Microprocessor task state segment is
marked busy by changing the descriptor type field
from TYPE 9 to TYPE 0BH. A 286 TSS is marked
busy by changing the descriptor type field from
TYPE 1 to TYPE 3. Use of a selector that references
a busy task state segment causes an exception 13.
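Both type changes above (9 to 0BH, and 1 to 3) amount to setting the same bit, value 2, in the descriptor type field. A small Python sketch of that observation; the constant and function names are hypothetical:

```python
TSS_BUSY_BIT = 0x2
AVAILABLE_TSS_TYPES = {0x1: "286 TSS", 0x9: "386 SX TSS"}

def mark_busy(descriptor_type):
    """Return the busy form of an available TSS descriptor type."""
    if descriptor_type not in AVAILABLE_TSS_TYPES:
        raise ValueError("not an available TSS descriptor type")
    return descriptor_type | TSS_BUSY_BIT

def is_busy_tss(descriptor_type):
    """True for the busy TSS types (3 for a 286 TSS, 0BH for a 386 SX TSS)."""
    return descriptor_type in (0x3, 0xB)

print(hex(mark_busy(0x9)))
print(hex(mark_busy(0x1)))
```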


The VM (Virtual Mode) bit is used to indicate if a task
is a Virtual 8086 task. If VM=1 then the tasks will
use the Real Mode addressing mechanism. The virtual
8086 environment is only entered and exited by
a task switch.


The coprocessor’s state is not automatically saved
when a task switch occurs. The Task Switched Bit,
TS, in the CR0 register helps deal with the coprocessor’s
state in a multi-tasking environment. Whenever
the 386 SX Microprocessor switches task, it sets the
TS bit. The 386 SX Microprocessor detects the first
use of a processor extension instruction after a task
switch and causes the processor extension not
available exception 7. The exception handler for
exception 7 may then decide whether to save the state
of the coprocessor.


The T bit in the 386 SX Microprocessor TSS indicates
that the processor should generate a debug
exception when switching to a task. If T=1 then
upon entry to a new task a debug exception 1 will be
generated.


INITIALIZATION AND TRANSITION TO
PROTECTED MODE


Since the 386 SX Microprocessor begins executing
in Real Mode immediately after RESET it is necessary
to initialize the system tables and registers with
the appropriate values. The GDT and IDT registers
must refer to a valid GDT and IDT. The IDT should
be at least 256 bytes long, and the GDT must contain
descriptors for the initial code and data segments.


Protected Mode is enabled by loading CR0 with the PE
bit set. This can be accomplished by using the MOV
CR0, R/M instruction. After enabling Protected
Mode, the next instruction should execute an
intersegment JMP to load the CS register and flush the
instruction decode queue. The final step is to load all
of the data segment registers with the initial selector
values.
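As an illustration of the tables this sequence depends on, the Python sketch below builds the bytes of a minimal GDT (null descriptor, one code segment, one data segment) and shows the PE bit being set in a CR0 value. It is a sketch under stated assumptions, not a boot procedure: the flat base-0 segments, the access bytes 9AH and 92H, the flags value and all names are illustrative choices, and PE is taken to be bit 0 of CR0.

```python
import struct

CR0_PE = 0x1  # Protection Enable, assumed to be bit 0 of CR0

def encode_descriptor(base, limit, access, flags):
    """Pack base, 20-bit limit, access rights byte and G/D/0/AVL nibble into 8 bytes."""
    return struct.pack(
        "<HHBBBB",
        limit & 0xFFFF,                               # limit 15...0
        base & 0xFFFF,                                # base 15...0
        (base >> 16) & 0xFF,                          # base 23...16
        access,                                       # P, DPL, S, TYPE, A
        ((flags & 0xF) << 4) | ((limit >> 16) & 0xF),  # G, D, 0, AVL, limit 19...16
        (base >> 24) & 0xFF,                          # base 31...24
    )

gdt = b"".join([
    bytes(8),                                  # slot 0: null descriptor, not used
    encode_descriptor(0, 0xFFFFF, 0x9A, 0xC),  # slot 1: readable code segment, DPL 0
    encode_descriptor(0, 0xFFFFF, 0x92, 0xC),  # slot 2: writeable data segment, DPL 0
])

CODE_SELECTOR = 1 << 3  # index 1, TI = 0 (GDT), RPL = 0
DATA_SELECTOR = 2 << 3  # index 2, TI = 0 (GDT), RPL = 0
GDT_LIMIT = len(gdt) - 1  # limit value for the GDT register

def enable_protected_mode(cr0):
    """Return the CR0 value with the PE bit set."""
    return cr0 | CR0_PE

print(gdt.hex(), hex(CODE_SELECTOR), hex(DATA_SELECTOR), GDT_LIMIT)
```

In a real system the intersegment JMP would use `CODE_SELECTOR`, and the data segment registers would then be loaded with `DATA_SELECTOR`.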


An alternate approach to entering Protected Mode is
to use the built-in task switch to load all of the registers.
In this case the GDT would contain two TSS
descriptors in addition to the code and data descriptors
needed for the first task. The first JMP instruction
in Protected Mode would jump to the TSS, causing
a task switch and loading all of the registers with
the values stored in the TSS. The Task State Segment
Register should be initialized to point to a valid
TSS descriptor.


4.4 Paging


Paging is another type of memory management useful
for virtual memory multi-tasking operating systems.
Unlike segmentation, which modularizes programs
and data into variable length segments, paging
divides programs into multiple uniform size
pages. Pages bear no direct relation to the logical
structure of a program. While segment selectors can
be considered the logical ‘name’ of a program module
or data structure, a page most likely corresponds
to only a portion of a module or data structure.


PAGE ORGANIZATION 


The 386 SX Microprocessor uses two levels of tables
to translate the linear address (from the segmentation
unit) into a physical address. There are
three components to the paging mechanism of the
386 SX Microprocessor: the page directory, the
page tables, and the page itself (page frame). All
memory-resident elements of the 386 SX Microprocessor
paging mechanism are the same size, namely
4K bytes. A uniform size for all of the elements simplifies
memory allocation and reallocation schemes,
since there is no problem with memory fragmentation.
Figure 4.10 shows how the paging mechanism
works.


[Figure 4.10 residue: two-level paging scheme; labeled elements are linear address, directory, page table, control registers, and user memory]

Figure 4.10. Paging Mechanism


[Figure 4.11 residue: fields PAGE TABLE ADDRESS 31..12 and system software definable bits]

Figure 4.11. Page Directory Entry (Points to Page Table)


Figure 4.12. Page Table Entry (Points to Page)


Page Fault Register 


CR2 is the Page Fault Linear Address register. It 
holds the 32-bit linear address which caused the last 
Page Fault detected. 


Page Descriptor Base Register 


CR3 is the Page Directory Physical Base Address
Register. It contains the physical starting address of
the Page Directory (this value is truncated to a 24-bit
value associated with the 386 SX CPU’s 16 megabyte
physical memory limitation). The lower 12 bits
of CR3 are always zero to ensure that the Page
Directory is always page aligned. Loading it with a
MOV CR3, reg instruction causes the page table entry
cache to be flushed, as will a task switch through
a TSS which changes the value of CR3.


Page Directory 


The Page Directory is 4K bytes long and allows up to
1024 page directory entries. Each page directory entry
contains information about the page table and
the address of the next level of tables, the Page
Tables. The contents of a Page Directory Entry are
shown in figure 4.11. The upper 10 bits of the linear
address (A31-A22) are used as an index to select
the correct Page Directory Entry.


The page table address contains the upper 20 bits
of a 32-bit physical address that is used as the base
address for the next set of tables, the page tables.
The lower 12 bits of the page table address are zero
so that the page table addresses appear on 4 kbyte
boundaries. For a 386 DX CPU system the upper 20
bits will select one of 2^20 page tables, but for a
386 SX Microprocessor system the upper 20 bits
only select one of 2^12 page tables. Again, this is
because the 386 SX Microprocessor is limited to a
24-bit physical address and the upper 8 bits (A24-A31)
are truncated when the address is output on its
24 address pins.


Page Tables 


Each Page Table is 4K bytes long and allows up to
1024 Page Table Entries. Each page table entry contains
information about the Page Frame and its address.
The contents of a Page Table Entry are
shown in figure 4.12. The middle 10 bits of the linear
address (A21-A12) are used as an index to select
the correct Page Table Entry.


The Page Frame Address contains the upper 20 bits
of a 32-bit physical address that is used as the base
address for the Page Frame. The lower 12 bits of the
Page Frame Address are zero so that the Page
Frame addresses appear on 4 kbyte boundaries. For
a 386 DX CPU system the upper 20 bits will select
one of 2^20 Page Frames, but for a 386 SX Microprocessor
system the upper 20 bits only select one
of 2^12 Page Frames. Again, this is because the
386 SX Microprocessor is limited to a 24-bit physical
address space and the upper 8 bits (A24-A31) are
truncated when the address is output on its 24 address
pins.
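The index selection described for the Page Directory and Page Tables can be sketched as bit-field extraction from the 32-bit linear address, followed by truncation to the 24 address pins. This Python fragment is illustrative only and its function names are hypothetical.

```python
def split_linear_address(linear):
    """Return (directory index, table index, byte offset) for a 32-bit linear address."""
    directory_index = (linear >> 22) & 0x3FF  # A31-A22 select the Page Directory Entry
    table_index = (linear >> 12) & 0x3FF      # A21-A12 select the Page Table Entry
    offset = linear & 0xFFF                   # A11-A0 select the byte within the 4K page
    return directory_index, table_index, offset

def sx_physical_address(address32):
    """Drop A24-A31, as happens when an address is driven on the 24 address pins."""
    return address32 & 0xFFFFFF

print(split_linear_address(0x00403025))
print(hex(sx_physical_address(0x12345678)))
print((1 << 24) >> 12)  # number of 4K page frames in 16 megabytes, i.e. 2^12
```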


Page Directory/Table Entries 


The lower 12 bits of the Page Table Entries and
Page Directory Entries contain statistical information
about pages and page tables respectively. The P
(Present) bit indicates if a Page Directory or Page
Table entry can be used in address translation. If
P=1, the entry can be used for address translation.
If P=0, the entry cannot be used for translation, and
all of the other bits are available for use by the
software. For example, the remaining 31 bits could be
used to indicate where on disk the page is stored.


The A (Accessed) bit is set by the 386 SX CPU for
both types of entries before a read or write access
occurs to an address covered by the entry. The D
(Dirty) bit is set to 1 before a write to an address
covered by that page table entry occurs. The D bit is
undefined for Page Directory Entries. When the P, A
and D bits are updated by the 386 SX CPU, the processor
generates a Read-Modify-Write cycle which
locks the bus and prevents conflicts with other processors
or peripherals. Software which modifies
these bits should use the LOCK prefix to ensure the
integrity of the page tables in multi-master systems.


The 3 bits marked system software definable in Figures
4.11 and 4.12 are software definable.
System software writers are free to use these bits
for whatever purpose they wish.


PAGE LEVEL PROTECTION (R/W, U/S BITS)


The 386 SX Microprocessor provides a set of protection
attributes for paging systems. The paging
mechanism distinguishes between two levels of protection:
User, which corresponds to level 3 of the
segmentation based protection, and supervisor
which encompasses all of the other protection levels
(0, 1, 2). Programs executing at Level 0, 1 or 2 bypass
the page protection, although segmentation-based
protection is still enforced by the hardware.


The U/S and R/W bits are used to provide User/Supervisor
and Read/Write protection for individual
pages or for all pages covered by a Page Table
Directory Entry. The U/S and R/W bits in the second
level Page Table Entry apply only to the page described
by that entry, while the U/S and R/W bits in
the first level Page Directory Table apply to all pages
described by the page table pointed to by that directory
entry. The U/S and R/W bits for a given page
are obtained by taking the most restrictive of the
U/S and R/W from the Page Directory Table Entries
and the Page Table Entries and using these bits to
address the page.
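Taking "the most restrictive" of the two levels can be modeled as a bitwise AND of the protection bits from the directory entry and the table entry. The Python sketch below assumes the usual entry layout (P in bit 0, R/W in bit 1 with 1 = writeable, U/S in bit 2 with 1 = user accessible); those positions come from the entry format figures rather than this text, the names are hypothetical, and the Present check is deliberately left out.

```python
PTE_P = 0x1   # Present
PTE_RW = 0x2  # Read/Write: 1 = writeable
PTE_US = 0x4  # User/Supervisor: 1 = user accessible

def effective_protection(directory_entry, table_entry):
    """Most restrictive combination of the R/W and U/S bits of both levels."""
    return directory_entry & table_entry & (PTE_RW | PTE_US)

def page_access_allowed(cpl, is_write, directory_entry, table_entry):
    """Apply page-level protection; levels 0, 1 and 2 bypass it."""
    if cpl < 3:
        return True
    bits = effective_protection(directory_entry, table_entry)
    if not bits & PTE_US:
        return False  # supervisor-only page
    return (not is_write) or bool(bits & PTE_RW)

# Directory entry allows user writes, table entry is user read-only.
print(page_access_allowed(3, True, 0x7, 0x5))
print(page_access_allowed(3, False, 0x7, 0x5))
```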


TRANSLATION LOOKASIDE BUFFER 


The 386 SX Microprocessor paging hardware is designed
to support demand paged virtual memory
systems. However, performance would degrade
substantially if the processor was required to access
two levels of tables for every memory reference. To
solve this problem, the 386 SX Microprocessor
keeps a cache of the most recently accessed pages.
This cache is called the Translation Lookaside Buffer
(TLB). The TLB is a four-way set associative 32-entry
page table cache. It automatically keeps the most
commonly used page table entries in the processor.
The 32-entry TLB coupled with a 4K page size results
in coverage of 128K bytes of memory addresses.
For many common multi-tasking systems, the
TLB will have a hit rate of greater than 98%. This
means that the processor will only have to access
the two-level page structure for less than 2% of all
memory references.
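The coverage and hit-rate figures above are simple arithmetic, sketched here in Python. The names are hypothetical and the 98% value is the figure quoted in the text, not a measurement.

```python
TLB_ENTRIES = 32
TLB_WAYS = 4
PAGE_SIZE = 4096  # bytes

tlb_coverage = TLB_ENTRIES * PAGE_SIZE  # bytes of address space mapped at once
tlb_sets = TLB_ENTRIES // TLB_WAYS      # sets in a four-way set associative cache

def table_walks(references, hit_percent):
    """Memory references that miss the TLB and need the two-level table access."""
    return references * (100 - hit_percent) // 100

print(tlb_coverage // 1024, "K bytes covered")
print(tlb_sets, "sets")
print(table_walks(10000, 98), "table walks per 10000 references")
```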


PAGING OPERATION 


The paging hardware operates in the following fash- 
ion. The paging unit hardware receives a 32-bit lin- 
ear address from the segmentation unit. The upper 
20 linear address bits are compared with all 32 en- 
tries in the TLB to determine if there is a match. If 
there is a match (i.e. a TLB hit), then the 24-bit phys- 
ical address is calculated and is placed on the ad- 
dress bus. 


If the page table entry is not in the TLB, the 386 SX
Microprocessor will read the appropriate Page Directory
Entry. If P=1 on the Page Directory Entry, indicating
that the page table is in memory, then the 386
SX Microprocessor will read the appropriate
Page Table Entry and set the Access bit. If P=1 on
the Page Table Entry, indicating that the page is in
memory, the 386 SX Microprocessor will update the
Access and Dirty bits as needed and fetch the operand.
The upper 20 bits of the linear address, read
from the page table, will be stored in the TLB for
future accesses. If P=0 for either the Page Directory
Entry or the Page Table Entry, then the processor
will generate a page fault Exception 14.


The processor will also generate a Page Fault (Ex- 
ception 14) if the memory reference violated the 
page protection attributes. CR2 will hold the linear 
address which caused the page fault. Since Excep- 
tion 14 is classified as a fault, CS:EIP will point to the 
instruction causing the page-fault. The 16-bit error 
code pushed as part of the page fault handler will 
contain status bits which indicate the cause of the 
page fault. 


The 16-bit error code is used by the operating system
to determine how to handle the Page Fault. Figure
4.13 shows the format of the Page Fault error
code and the interpretation of the bits. Even though
the bits in the error code (U/S, W/R, and P) have
similar names as the bits in the Page Directory/Table
Entries, the interpretation of the error code bits is
different. Figure 4.14 indicates what type of access
caused the page fault.


Figure 4.13. Page Fault Error Code Format (bit 2 = U/S, bit 1 = W/R, bit 0 = P)


U/S: The U/S bit indicates whether the access
causing the fault occurred when the processor was
executing in User Mode (U/S = 1) or in Supervisor
mode (U/S = 0).


W/R: The W/R bit indicates whether the access
causing the fault was a Read (W/R = 0) or a Write
(W/R = 1).


P: The P bit indicates whether a page fault was
caused by a not-present page (P = 0), or by a page
level protection violation (P = 1).
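A page fault handler might decode the three status bits as follows. This Python sketch assumes P is bit 0, W/R is bit 1 and U/S is bit 2 of the error code, as in the error code format figure; the function names and the wording of the returned strings are hypothetical.

```python
def decode_page_fault(error_code):
    """Extract the P, W/R and U/S status bits from a page fault error code."""
    return {
        "protection_violation": bool(error_code & 0x1),  # P: 0 = not present, 1 = protection
        "write": bool(error_code & 0x2),                 # W/R: 0 = read, 1 = write
        "user": bool(error_code & 0x4),                  # U/S: 0 = supervisor, 1 = user
    }

def describe_page_fault(error_code):
    """Summarize the access type and the cause of the fault."""
    bits = decode_page_fault(error_code)
    who = "User" if bits["user"] else "Supervisor"
    what = "Write" if bits["write"] else "Read"
    why = "protection violation" if bits["protection_violation"] else "page not present"
    return f"{who} {what}, {why}"

print(describe_page_fault(0x7))
print(describe_page_fault(0x0))
```

An operating system would typically use the "page not present" case to bring the page in from disk, with CR2 supplying the faulting linear address.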


U/S | W/R | Access Type
0   | 0   | Supervisor* Read
0   | 1   | Supervisor Write
1   | 0   | User Read
1   | 1   | User Write

U = Undefined (applies to the unlabeled bits of the error code in Figure 4.13)

*Descriptor table access will fault with U/S = 0, even if
the program is executing at level 3.


Figure 4.14. Type of Access Causing Page Fault


OPERATING SYSTEM RESPONSIBILITIES


When the operating system enters or exits paging
mode (by setting or resetting bit 31 in the CR0 register)
a short JMP must be executed to flush the
386 SX Microprocessor’s prefetch queue. This ensures
that all instructions executed after the address
mode change will generate correct addresses.


The 386 SX Microprocessor takes care of the page 
address translation process, relieving the burden 
from an operating system in a demand-paged sys- 
tem. The operating system is responsible for setting 
up the initial page tables and handling any page 
faults. The operating system also is required to inval- 
idate (i.e. flush) the TLB when any changes are 
made to any of the page table entries. The operating 
system must reload CR3 to cause the TLB to be 
flushed. 


Setting up the tables is simply a matter of loading 
CR3 with the address of the Page Directory, and 
allocating space for the Page Directory and the 
Page Tables. The primary responsibility of the oper- 
ating system is to implement a swapping policy and 
handle all of the page faults. 


A final concern of the operating system is to ensure
that the TLB cache matches the information in the
paging tables. In particular, any time the operating
system sets the P (Present) bit of a page table entry
to zero, the TLB must be flushed by reloading CR3.
Operating systems may want to take advantage of
the fact that CR3 is stored as part of a TSS, to give
every task or group of tasks its own set of page
tables.


4.5 Virtual 8086 Environment 


The 386 SX Microprocessor allows the execution of 
8086 application programs in both Real Mode and in 
the Virtual 8086 Mode. The Virtual 8086 Mode al- 
lows the execution of 8086 applications, while still 
allowing the system designer to take full advantage 
of the 386 SX CPU’s protection mechanism. 


VIRTUAL 8086 ADDRESSING MECHANISM 


One of the major differences between 386 SX CPU 
Real and Protected modes is how the segment se- 
lectors are interpreted. When the processor is exe- 
cuting in Virtual 8086 Mode, the segment registers 
are used in a fashion identical to Real Mode. The 
contents of the segment register are shifted left 4 
bits and added to the offset to form the segment
base linear address. 
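
The shift-and-add address formation described above can be sketched in a few lines. This is an illustrative model only; the function name `v86_linear_address` is hypothetical and not part of any Intel tool.

```python
def v86_linear_address(segment, offset):
    """Form a linear address the way Real Mode and Virtual 8086 Mode do.

    The 16-bit segment register value is shifted left 4 bits and the
    16-bit offset is added to it.
    """
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    return (segment << 4) + offset


print(hex(v86_linear_address(0x1234, 0x0010)))  # 0x12350
```

Note that the largest result, from segment FFFFH and offset FFFFH, is 10FFEFH, which lies just above the one megabyte boundary.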


The 386 SX Microprocessor allows the operating 
system to specify which programs use the 8086 address
mechanism and which programs use Protected Mode
addressing on a per task basis. Through the use of
paging, the one megabyte address space of the Virtual
Mode task can be mapped to anywhere in the 4 gigabyte
linear address space of the 386 SX Microprocessor.
Like Real Mode, Virtual Mode addresses that exceed
one megabyte will cause an exception 13. However,
these restrictions
should not prove to be important, because most 
tasks running in Virtual 8086 Mode will simply be 
existing 8086 application programs. 


PAGING IN VIRTUAL MODE 


The paging hardware allows the concurrent running 
of multiple Virtual Mode tasks, and provides protec- 
tion and operating system isolation. Although it is 
not strictly necessary to have the paging hardware 
enabled to run Virtual Mode tasks, it is needed in 
order to run multiple Virtual Mode tasks or to relo- 
cate the address space of a Virtual Mode task to 
physical address space greater than one megabyte. 


The paging hardware allows the 20-bit linear ad- 
dress produced by a Virtual Mode program to be 
divided into as many as 256 pages. Each one of the 
pages can be located anywhere within the maximum 
16 megabyte physical address space of the 386 SX 
Microprocessor. In addition, since CR3 (the Page Di- 
rectory Base Register) is loaded by a task switch, 
each Virtual Mode task can use a different mapping 
scheme to map pages to different physical locations. 
Finally, the paging hardware allows the sharing of 
the 8086 operating system code between multiple 
8086 applications. 
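
The figure of 256 pages follows from dividing the one megabyte Virtual Mode address space into 4-Kbyte pages. A minimal sketch of that split is shown below; the names `PAGE_SIZE` and `split_v86_linear` are illustrative assumptions, not identifiers from this data sheet.

```python
PAGE_SIZE = 4096  # the paging unit assumed here is 4 Kbytes (2**12 bytes)


def split_v86_linear(linear):
    """Split a 20-bit linear address into (page number, offset in page)."""
    if not 0 <= linear < (1 << 20):
        raise ValueError("expected a 20-bit linear address")
    return linear >> 12, linear & (PAGE_SIZE - 1)


page, offset = split_v86_linear(0xB8000)
print(page, offset)  # 184 0
```

Each page number from 0 through 255 can then be mapped by the task's own page tables to a different physical location.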


PROTECTION AND I/O PERMISSION BIT MAP


All Virtual Mode programs execute at privilege level 
3. As such, Virtual Mode programs are subject to all 
of the protection checks defined in Protected Mode. 
This is different than Real Mode, which implicitly is 
executing at privilege level 0. Thus, an attempt to 
execute a privileged instruction in Virtual Mode will 
cause an exception 13 fault. 


The following are privileged instructions, which may 
be executed only at Privilege Level 0. Attempting to 
execute these instructions in Virtual 8086 Mode (or 
any time CPL > 0) causes an exception 13 fault:


LIDT;    MOV DRn,reg;    MOV reg,DRn;
LGDT;    MOV TRn,reg;    MOV reg,TRn;
LMSW;    MOV CRn,reg;    MOV reg,CRn;
CLTS;
HLT;


Several instructions, particularly those applying to
the multitasking and the protection model, are
available only in Protected Mode. Therefore, attempting
to execute the following instructions in Real Mode or
in Virtual 8086 Mode generates an exception 6 fault:


LTR;     STR;
LLDT;    SLDT;
LAR;     VERR;
LSL;     VERW;
ARPL;


The instructions which are IOPL sensitive in Protect- 
ed Mode are: 


IN;      STI;
OUT;     CLI;
INS;
OUTS;
REP INS;
REP OUTS;


In Virtual 8086 Mode the following instructions are 
IOPL-sensitive: 


INT n;   STI;
PUSHF;   CLI;
POPF;    IRET;


The PUSHF, POPF, and IRET instructions are
IOPL-sensitive in Virtual 8086 Mode only. This provision
allows the IF flag to be virtualized to the Virtual 8086
Mode program. The INT n software interrupt instruction
is also IOPL-sensitive in Virtual 8086 Mode.

Note that the INT 3, INTO, and BOUND instructions 
are not IOPL-sensitive in Virtual 8086 Mode. 


The I/O instructions that directly refer to addresses 
in the processor’s I/O space are IN, INS, OUT, and 
OUTS. The 386 SX Microprocessor has the ability to
selectively trap references to specific I/O addresses.
The structure that enables selective trapping is
the I/O Permission Bit Map in the TSS segment (see
Figures 4.8 and 4.9). The I/O permission map is a bit
vector. The size of the map and its location in the
TSS segment are variable. The processor locates
the I/O permission map by means of the I/O map
base field in the fixed portion of the TSS. The I/O
map base field is 16 bits wide and contains the offset
of the beginning of the I/O permission map.


In protected mode when an I/O instruction (IN, INS,
OUT or OUTS) is encountered, the processor first
checks whether CPL ≤ IOPL. If this condition is true,
the I/O operation may proceed. If not true, the
processor checks the I/O permission map (in Virtual
8086 Mode, the processor consults the map without
regard for the IOPL).


Each bit in the map corresponds to an I/O port byte
address; for example, the bit for port 41 is found at
I/O map base + 5, bit offset 1. The processor tests
all the bits that correspond to the I/O addresses
spanned by an I/O operation; for example, a double
word operation tests four bits corresponding to four
adjacent byte addresses. If any tested bit is set, the
processor signals a general protection exception. If
all the tested bits are zero, the I/O operation may
proceed.
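
The permission check described above can be modeled as follows. This is a simplified sketch, not the processor's actual microcode: the function name `io_access_allowed` is hypothetical, `bitmap` is assumed to hold the map bytes starting at the I/O map base, and ports beyond the supplied bytes are treated as having one-bits. The terminating byte of all 1's and the TSS limit check are not modeled.

```python
def io_access_allowed(bitmap, port, width, cpl, iopl, v86=False):
    """Return True if an I/O access of `width` bytes at `port` may proceed.

    bitmap : bytes of the I/O permission map, starting at the map base
    width  : 1, 2 or 4 (byte, word or double word operation)
    v86    : True when the task runs in Virtual 8086 Mode
    """
    # In protected mode, CPL <= IOPL permits the access outright.
    # In Virtual 8086 Mode the map is consulted regardless of IOPL.
    if not v86 and cpl <= iopl:
        return True
    for p in range(port, port + width):
        byte_index, bit = divmod(p, 8)
        if byte_index >= len(bitmap):
            return False  # ports outside the map behave as one-bits
        if bitmap[byte_index] & (1 << bit):
            return False  # a set bit raises a general protection exception
    return True


# Deny port 41 only: byte 5 of the map, bit offset 1.
io_map = bytearray(8)
io_map[41 // 8] |= 1 << (41 % 8)
print(io_access_allowed(bytes(io_map), 40, 1, cpl=3, iopl=0))  # True
print(io_access_allowed(bytes(io_map), 40, 2, cpl=3, iopl=0))  # False
```

The second call fails because a word access at port 40 also spans port 41, whose bit is set.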


It is not necessary for the I/O permission map to
represent all the I/O addresses. I/O addresses not
spanned by the map are treated as if they had
one-bits in the map. The I/O map base should be at
least one byte less than the TSS limit; the last byte
beyond the I/O mapping information must contain
all 1’s.


Because the I/O permission map is in the TSS seg- 
ment, different tasks can have different maps. Thus, 
the operating system can allocate ports to a task by
changing the I/O permission map in the task’s TSS. 


IMPORTANT IMPLEMENTATION NOTE: Beyond
the last byte of I/O mapping information in the I/O
permission bit map must be a byte containing all 1’s.
The byte of all 1’s must be within the limit of the
386 SX CPU TSS segment (see Figure 4.8).


Interrupt Handling


In order to fully support the emulation of an 8086 
machine, interrupts in Virtual 8086 Mode are han- 
dled in a unique fashion. When running in Virtual 
Mode all interrupts and exceptions involve a privi- 
lege change back to the host 386 SX Microproces- 
sor operating system. The 386 SX Microprocessor 
operating system determines if the interrupt comes 
from a Protected Mode application or from a Virtual 
Mode program by examining the VM bit in the 
EFLAGS image stored on the stack.


When a Virtual Mode program is interrupted and ex- 
ecution passes to the interrupt routine at level 0, the 
VM bit is cleared. However, the VM bit is still set in 
the EFLAGS image on the stack.


The 386 SX Microprocessor operating system in turn
handles the exception or interrupt and then returns
control to the 8086 program. The 386 SX Microprocessor
operating system may choose to let the 8086
operating system handle the interrupt or it may emulate
the function of the interrupt handler. For example,
many 8086 operating system calls are accessed
by PUSHing parameters on the stack, and then
executing an INT n instruction. If the IOPL is set to 0
then all INT n instructions will be intercepted by the
386 SX Microprocessor operating system.


A 386 SX Microprocessor operating system can
provide a Virtual 8086 Environment which is totally
transparent to the application software by intercepting
and then emulating 8086 operating system
calls, and intercepting IN and OUT instructions.


Entering and Leaving Virtual 8086 Mode 


Virtual 8086 mode is entered by executing a 32-bit 
IRET instruction at CPL=0 where the stack has a 1 
in the VM bit of its EFLAGS image, or a Task Switch 
(at any CPL) to a 386 SX Microprocessor task 
whose 386 SX CPU TSS has an EFLAGS image containing
a 1 in the VM bit position while the processor
is executing in the Protected Mode. POPF does not 
affect the VM bit but a PUSHF always pushes a 0 in 
the VM bit. 


The transition out of Virtual 8086 mode to protected 
mode occurs only on receipt of an interrupt or ex- 
ception. In Virtual 8086 mode, all interrupts and ex- 
ceptions vector through the protected mode IDT, 
and enter an interrupt handler in protected mode. As 
part of the interrupt processing the VM bit is cleared. 


Because the matching IRET must occur from level 0, 
Interrupt or Trap Gates used to field an interrupt or 
exception out of Virtual 8086 mode must perform an 
inter-level interrupt only to level 0. Interrupt or Trap 
Gates through conforming segments, or through 
segments with DPL>0, will raise a GP fault with the 
CS selector as the error code. 


Task Switches To/From Virtual 8086 Mode 


Tasks which can execute in Virtual 8086 mode must 
be described by a TSS with the 386 SX CPU format 
(type 9 or 11 descriptor). A task switch out of virtual 
8086 mode will operate exactly the same as any oth- 
er task switch out of a task with a 386 SX CPU TSS. 
All of the programmer visible state, including the 
EFLAGS register with the VM bit set to 1, is stored in 
the TSS. The segment registers in the TSS will con- 
tain 8086 segment base values rather than selec- 
tors. 


A task switch into a task described by a 386 SX CPU 
TSS will have an additional check to determine if the 
incoming task should be resumed in Virtual 8086 
mode. Tasks described by 286 format TSSs cannot 
be resumed in Virtual 8086 mode, so no check is 
required there (the FLAGS image in 286 format TSS 
has only the low order 16 FLAGS bits). Before load- 
ing the segment register images from a 386 SX CPU 
TSS, the FLAGS image is loaded, so that the seg- 
ment registers are loaded from the TSS image as 
8086 segment base values. The task is now ready to 
resume in Virtual 8086 mode.


Transitions Through Trap and Interrupt Gates, and IRET


A task switch is one way to enter or exit Virtual 8086 
mode. The other method is to exit through a Trap or 
Interrupt gate, as part of handling an interrupt, and 
to enter as part of executing an IRET instruction. 
The transition out must use a 386 SX CPU Interrupt
Gate (Type 14), or 386 SX CPU Trap Gate (Type
15), which must point to a non-conforming level 0
segment (DPL=0) in order to permit the trap han- 
dler to IRET back to the Virtual 8086 program. The 
Gate must point to a non-conforming level 0 seg- 
ment to perform a level switch to level 0 so that the 
matching IRET can change the VM bit. 386 SX CPU 
gates must be used since 286 gates save only the 
low 16 bits of the EFLAGS register (the VM bit will 
not be saved). Also, the 16-bit IRET used to termi- 
nate the 286 interrupt handler will pop only the lower 
16 bits from FLAGS, and will not affect the VM bit. 
The action taken for a 386 SX CPU Trap or Interrupt
gate if an interrupt occurs while the task is executing
in Virtual 8086 mode is given by the following
sequence:


1. Save the FLAGS register in a temp to push later. 
Turn off the VM, TF, and IF bits. 


2. Interrupt and Trap gates must perform a level 
switch from 3 (where the Virtual 8086 Mode pro- 
gram executes) to level 0 (so IRET can return). 


3. Push the 8086 segment register values onto the
new stack, in this order: GS, FS, DS, ES. These
are pushed as 32-bit quantities. Then load these 4
registers with null selectors (0).


4. Push the old 8086 stack pointer onto the new 
stack by pushing the SS register (as 32-bits), then 
pushing the 32-bit ESP register saved above. 


5. Push the 32-bit EFLAGS register saved in step 1. 


6. Push the old 8086 instruction pointer onto the new stack
by pushing the CS register (as 32-bits), then push- 
ing the 32-bit EIP register. 


7. Load up the new CS:EIP value from the interrupt 
gate, and begin execution of the interrupt routine 
in protected mode. 
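
The push order in steps 3 through 6 fixes the layout of the level 0 stack frame that the handler sees. The sketch below derives the byte offset of each saved value from the handler's ESP on entry. It is an illustration only: the names `PUSH_ORDER`, `v86_frame_offsets` and `build_frame` are hypothetical, and an exception error code, if one is pushed, is not modeled.

```python
# Order in which the values are pushed, per steps 3-6 above.
PUSH_ORDER = ["GS", "FS", "DS", "ES", "SS", "ESP", "EFLAGS", "CS", "EIP"]


def v86_frame_offsets():
    """Map each saved register to its byte offset from ESP on handler entry."""
    offsets = {}
    # The last value pushed sits at [ESP+0]; every slot is 32 bits wide.
    for slot, name in enumerate(reversed(PUSH_ORDER)):
        offsets[name] = 4 * slot
    return offsets


def build_frame(regs):
    """Return the frame as 32-bit values, lowest stack address first."""
    return [regs[name] & 0xFFFFFFFF for name in reversed(PUSH_ORDER)]


for name, off in sorted(v86_frame_offsets().items(), key=lambda kv: kv[1]):
    print(f"SS:[ESP+{off:2d}]  {name}")
```

Under this layout the EFLAGS image, whose VM bit tells the handler what kind of program was interrupted, lies at offset 8 from ESP.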


The transition out of V86 mode performs a level 
change and stack switch, in addition to changing 
back to protected mode. Also all of the 8086 seg- 
ment register images are stored on the stack (be- 
hind the SS:ESP image), and then loaded with null 
(0) selectors before entering the interrupt handler. 
This will permit the handler to safely save and re- 
store the DS, ES, FS, and GS registers as 286 selec- 
tors. This is needed so that interrupt handlers which 
don’t care about the mode of the interrupted pro- 
gram can use the same prologue and epilogue code 
for state saving regardless of whether or not a ‘native’
mode or Virtual 8086 Mode program was interrupted.
Restoring null selectors to these registers
before executing the IRET will not cause a trap in the
interrupt handler. Interrupt routines which expect or
return values in the segment registers will have to
obtain/return values from the 8086 register images
pushed onto the new stack. They will need to know
the mode of the interrupted program in order to
know where to find/return segment registers, and
also to know how to interpret segment register values.


The IRET instruction will perform the inverse of the
above sequence. Only the extended IRET instruction
(operand size = 32) can be used and must be
executed at level 0 to change the VM bit to 1.


1. If the NT bit in the FLAGS register is on, an
inter-task return is performed. The current state is
stored in the current TSS, and the link field in the
current TSS is used to locate the TSS for the
interrupted task which is to be resumed. Otherwise,
continue with the following sequence:


2. Read the FLAGS image from SS:8[ESP] into the 
FLAGS register. This will set VM to the value active
in the interrupted routine.


3. Pop off the instruction pointer CS:EIP. EIP is 
popped first, then a 32-bit word is popped which 
contains the CS value in the lower 16 bits. If 
VM=0, this CS load is done as a protected mode
segment load. If VM=1, this will be done as an 
8086 segment load. 


4. Increment the ESP register by 4 to bypass the
FLAGS image which was ‘popped’ in step 2.


5. If VM=1, load segment registers ES, DS, FS, and
GS from memory locations SS:[ESP+8],
SS:[ESP+12], SS:[ESP+16], and
SS:[ESP+20], respectively, where the new value
of ESP stored in step 4 is used. Since VM=1,
these are done as 8086 segment register loads.


Else if VM=0, check that the selectors in ES, DS, 
FS, and GS are valid in the interrupted routine. 
Null out invalid selectors to trap if an attempt is 
made to access through them. 


6. If RPL(CS)>CPL, pop the stack pointer SS:ESP 
from the stack. The ESP register is popped first, 
followed by 32-bits containing SS in the lower 16 
bits. If VM=0, SS is loaded as a protected mode 
segment register load. If VM=1, an 8086 seg- 
ment register load is used. 


7. Resume execution of the interrupted routine. The
VM bit in the FLAGS register (restored from the
interrupt routine’s stack image in step 2) determines
whether the processor resumes the interrupted
routine in Protected mode or Virtual 8086
Mode.


5.0 FUNCTIONAL DATA


The 386 SX Microprocessor features a straightforward
functional interface to the external hardware.
The 386 SX Microprocessor has separate parallel
buses for data and address. The data bus is 16-bits
in width, and bi-directional. The address bus outputs
24-bit address values using 23 address lines and
two byte enable signals.


The 386 SX Microprocessor has two selectable
address bus cycles: address pipelined and
non-address pipelined. The address pipelining option
allows as much time as possible for data access by
starting the pending bus cycle before the present
bus cycle is finished. A non-pipelined bus cycle
gives the highest bus performance by executing
every bus cycle in two processor CLK cycles. For
maximum design flexibility, the address pipelining option
is selectable on a cycle-by-cycle basis.


The processor’s bus cycle is the basic mechanism
for information transfer, either from system to
processor, or from processor to system. 386 SX
Microprocessor bus cycles perform data transfer in a
minimum of only two clock periods. The maximum
transfer bandwidth at 16 MHz is therefore 16 Mbytes/sec.
However, any bus cycle will be extended for
more than two clock periods if external hardware
withholds acknowledgement of the cycle.
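
The 16 Mbytes/sec figure follows from moving two bytes (one 16-bit word) every two clock periods at 16 MHz. The sketch below reproduces that arithmetic; the function name and the `wait_states` parameter are assumptions for illustration, with each wait state taken to add one clock period to the bus cycle.

```python
def max_bus_bandwidth(clock_hz, bus_bytes=2, clocks_per_cycle=2, wait_states=0):
    """Peak transfer rate in bytes per second for back-to-back bus cycles.

    clock_hz is the processor clock (CLK2 divided by two).
    """
    return clock_hz * bus_bytes / (clocks_per_cycle + wait_states)


print(max_bus_bandwidth(16_000_000))                 # 16000000.0
print(max_bus_bandwidth(16_000_000, wait_states=2))  # 8000000.0
```

Under these assumptions, two wait states on every cycle would halve the available bandwidth.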


The 386 SX Microprocessor can relinquish control of 
its local buses to allow mastership by other devices, 
such as direct memory access (DMA) channels. 
When relinquished, HLDA is the only output pin driv- 
en by the 386 SX Microprocessor, providing near- 
complete isolation of the processor from its system 
(all other output pins are in a float condition). 


5.1 Signal Description Overview 


Ahead is a brief description of the 386 SX Micro- 
processor input and output signals arranged by func- 
tional groups. Note the # symbol at the end of a 
signal name indicates the active, or asserted, state 
occurs when the signal is at a LOW voltage. When 
no # is present after the signal name, the signal is 
asserted when at the HIGH voltage level. 


Example signal: M/IO# - HIGH voltage indicates
                        Memory selected
                      - LOW voltage indicates
                        I/O selected


The signal descriptions sometimes refer to AC timing
parameters, such as ‘t25 Reset Setup Time’ and
‘t26 Reset Hold Time.’ The values of these parameters
can be found in Table 7.4.


CLOCK (CLK2)


CLK2 provides the fundamental timing for the 
386 SX Microprocessor. It is divided by two internally 
to generate the internal processor clock used for in- 
struction execution. The internal clock is comprised 
of two phases, ‘phase one‘ and ‘phase two‘. Each 
CLK2 period is a phase of the internal clock. Figure 
5.2 illustrates the relationship. If desired, the phase 
of the internal processor clock can be synchronized 
to a known phase by ensuring the falling edge of the 
RESET signal meets the applicable setup and hold
times t25 and t26.


DATA BUS (D15-D0)


These three-state bidirectional signals provide the 
general purpose data path between the 386 SX Mi- 
croprocessor and other devices. The data bus out- 
puts are active HIGH and will float during bus hold 
acknowledge. Data bus reads require that read-data 
setup and hold times t21 and t22 be met relative to
CLK2 for correct operation. 


Figure 5.1. Functional Signal Groups


Figure 5.2. CLK2 Signal and Internal Processor Clock


ADDRESS BUS (A23-A1, BHE#, BLE#)


These three-state outputs provide physical memory
addresses or I/O port addresses. A23-A16 are LOW
during I/O transfers except for I/O transfers
automatically generated by coprocessor instructions.
During coprocessor I/O transfers, A22-A16 are driven
LOW, and A23 is driven HIGH so that this address
line can be used by external logic to generate
the coprocessor select signal. Thus, the I/O address
driven by the 386 SX Microprocessor for coprocessor
commands is 8000F8H, the I/O addresses driven
by the 386 SX Microprocessor for coprocessor
data are 8000FCH or 8000FEH for cycles to the
387™ SX.
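
The coprocessor select decode described above depends only on the cycle being an I/O cycle with A23 HIGH. A minimal behavioral sketch of that decode is given below; it models external logic, and all names in it are hypothetical.

```python
COPROCESSOR_COMMAND_PORT = 0x8000F8
COPROCESSOR_DATA_PORTS = (0x8000FC, 0x8000FE)

A23 = 1 << 23  # highest address line


def is_coprocessor_cycle(address, m_io_n):
    """Return True for an I/O cycle aimed at the coprocessor.

    m_io_n is the level on the M/IO# pin: 1 selects memory, 0 selects I/O.
    """
    return m_io_n == 0 and bool(address & A23)


print(is_coprocessor_cycle(COPROCESSOR_COMMAND_PORT, 0))  # True
print(is_coprocessor_cycle(0x0000F8, 0))                  # False
```

A memory cycle to the same address (M/IO# HIGH) is not a coprocessor cycle, since A23 is also used for ordinary memory above 8 megabytes.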


The address bus is capable of addressing 16 megabytes
of physical memory space (000000H through
FFFFFFH), and 64 kilobytes of I/O address space
(000000H through 00FFFFH) for programmed I/O.
The address bus is active HIGH and will float during
bus hold acknowledge.


The Byte Enable outputs, BHE# and BLE#, directly
indicate which bytes of the 16-bit data bus are
involved with the current transfer. BHE# applies to
D15-D8 and BLE# applies to D7-D0. If both BHE#
and BLE# are asserted, then 16 bits of data are
being transferred. See Table 5.1 for a complete
decoding of these signals. The byte enables are active
LOW and will float during bus hold acknowledge.


BUS CYCLE DEFINITION SIGNALS (W/R#, D/C#, M/IO#, LOCK#)


These three-state outputs define the type of bus
cycle being performed: W/R# distinguishes between
write and read cycles, D/C# distinguishes between
data and control cycles, M/IO# distinguishes
between memory and I/O cycles, and LOCK#
distinguishes between locked and unlocked bus cycles.
All of these signals are active LOW and will float
during bus acknowledge.


The primary bus cycle definition signals are W/R#,
D/C# and M/IO#, since these are the signals driven
valid as ADS# (Address Status output) becomes
active. The LOCK# is driven valid at the same time
the bus cycle begins, which due to address pipelining,
could be after ADS# becomes active. Exact bus
cycle definitions, as a function of W/R#, D/C#, and
M/IO# are given in Table 5.2.


LOCK# indicates that other system bus masters are
not to gain control of the system bus while it is
active. LOCK# is activated on the CLK2 edge that
begins the first locked bus cycle (i.e., it is not active at
the same time as the other bus cycle definition pins)
and is deactivated when READY# is returned at the end
of the last bus cycle which is to be locked. The
beginning of a bus cycle is determined when READY#
is returned in a previous bus cycle and another is
pending (ADS# is active) or by the clock edge in
which ADS# is driven active if the bus was idle. This
means that it follows more closely with the write
data rules when it is valid, but may cause the bus to
be locked longer than desired. The LOCK# signal
may be explicitly activated by the LOCK prefix on
certain instructions. LOCK# is always asserted
when executing the XCHG instruction, during
descriptor updates, and during the interrupt
acknowledge sequence.


Table 5.1. Byte Enable Definitions


BHE#   BLE#   Function
 0      0     Word transfer
 0      1     Byte transfer on upper byte of the data bus, D15-D8
 1      0     Byte transfer on lower byte of the data bus, D7-D0
 1      1     Never occurs
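
The byte enable decoding can be expressed as a small lookup, shown here as a sketch of how external logic or a bus monitor might interpret the two pins. Both signals are active LOW, so a pin level of 0 means the corresponding byte lane is involved; the function name is hypothetical.

```python
def decode_byte_enables(bhe_n, ble_n):
    """Describe the transfer selected by the BHE# and BLE# pin levels."""
    upper = bhe_n == 0  # BHE# LOW: D15-D8 take part in the transfer
    lower = ble_n == 0  # BLE# LOW: D7-D0 take part in the transfer
    if upper and lower:
        return "word transfer (D15-D0)"
    if upper:
        return "byte transfer on upper byte (D15-D8)"
    if lower:
        return "byte transfer on lower byte (D7-D0)"
    return "never occurs"


print(decode_byte_enables(0, 1))  # byte transfer on upper byte (D15-D8)
```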


Table 5.2. Bus Cycle Definition


M/IO#  D/C#  W/R#  Bus Cycle Type                  Locked?
  0     0     0    Interrupt Acknowledge           Yes
  0     0     1    does not occur                  -
  0     1     0    I/O Data Read                   No
  0     1     1    I/O Data Write                  No
  1     0     0    Memory Code Read                No
  1     0     1    Halt: Address = 2,              No
                   BHE# = 1, BLE# = 0
                   Shutdown: Address = 0,
                   BHE# = 1, BLE# = 0
  1     1     0    Memory Data Read                Some Cycles
  1     1     1    Memory Data Write               Some Cycles


BUS CONTROL SIGNALS (ADS#, READY#, NA#)


The following signals allow the processor to indicate 
when a bus cycle has begun, and allow other system 
hardware to control address pipelining and bus cycle 
termination. 


Address Status (ADS#)


This three-state output indicates that a valid bus
cycle definition and address (W/R#, D/C#, M/IO#,
BHE#, BLE# and A23-A1) are being driven at the
386 SX Microprocessor pins. ADS# is an active
LOW output. Once ADS# is driven active, valid
address, byte enables, and definition signals will not
change. In addition, ADS# will remain active until its
associated bus cycle begins (when READY# is
returned for the previous bus cycle when running
pipelined bus cycles). When address pipelining is
utilized, maximum throughput is achieved by initiating
bus cycles when ADS# and READY# are active in
the same clock cycle. ADS# will float during bus
hold acknowledge. See sections Non-Pipelined
Address and Pipelined Address for additional
information on how ADS# is asserted for different bus
states.


Transfer Acknowledge (READY#)


This input indicates the current bus cycle is
complete, and the active bytes indicated by BHE# and
BLE# are accepted or provided. When READY# is
sampled active during a read cycle or interrupt
acknowledge cycle, the 386 SX Microprocessor latches
the input data and terminates the cycle. When
READY# is sampled active during a write cycle, the
processor terminates the bus cycle.


READY# is ignored on the first bus state of all bus
cycles, and sampled each bus state thereafter until
asserted. READY# must eventually be asserted to
acknowledge every bus cycle, including Halt Indication
and Shutdown Indication bus cycles. When being
sampled, READY# must always meet setup and
hold times t19 and t20 for correct operation.


Next Address Request (NA#) 


This is used to request address pipelining. This input
indicates the system is prepared to accept new
values of BHE#, BLE#, A23-A1, W/R#, D/C# and
M/IO# from the 386 SX Microprocessor even if the
end of the current cycle is not being acknowledged
on READY#. If this input is active when sampled,
the next address is driven onto the bus, provided the
next bus request is already pending internally. NA#
is ignored in CLK cycles in which ADS# or READY#
is activated. This signal is active LOW and must
satisfy setup and hold times t15 and t16 for correct
operation. See Pipelined Address and Read and
Write Cycles for additional information.


BUS ARBITRATION SIGNALS (HOLD, HLDA) 


This section describes the mechanism by which the 
processor relinquishes control of its local buses 
when requested by another bus master device. See 
Entering and Exiting Hold Acknowledge for addi- 
tional information. 


Bus Hold Request (HOLD) 


This input indicates some device other than the 
386 SX Microprocessor requires bus mastership. 
When control is granted, the 386 SX Microprocessor 
floats A23-A1, BHE#, BLE#, D15-D0, LOCK#,
M/IO#, D/C#, W/R# and ADS#, and then activates
HLDA, thus entering the bus hold acknowledge 
state. The local bus will remain granted to the re- 
questing master until HOLD becomes inactive. 
When HOLD becomes inactive, the 386 SX Micro- 
processor will deactivate HLDA and drive the local 
bus (at the same time), thus terminating the hold 
acknowledge condition. 


HOLD must remain asserted as long as any other
device is a local bus master. External pull-up resistors
may be required when in the hold acknowledge
state since none of the 386 SX Microprocessor floated
outputs have internal pull-up resistors. See
Resistor Recommendations for additional information.
HOLD is not recognized while RESET is active.
If RESET is asserted while HOLD is asserted, RESET
has priority and places the bus into an idle
state, rather than the hold acknowledge
(high-impedance) state.


HOLD is a level-sensitive, active HIGH, synchronous 
input. HOLD signals must always meet setup and 
hold times t23 and t24 for correct operation.


Bus Hold Acknowledge (HLDA) 


When active (HIGH), this output indicates the 386 
SX Microprocessor has relinquished control of its local
bus in response to an asserted HOLD signal, and
is in the bus Hold Acknowledge state. 


The Bus Hold Acknowledge state offers near-complete
signal isolation. In the Hold Acknowledge
state, HLDA is the only signal being driven by the
386 SX Microprocessor. The other output signals or
bidirectional signals (D15-D0, BHE#, BLE#, A23-A1,
W/R#, D/C#, M/IO#, LOCK# and ADS#) are
in a high-impedance state so the requesting bus
master may control them. These pins remain OFF
throughout the time that HLDA remains active (see
Table 5.3). Pull-up resistors may be desired on several
signals to avoid spurious activity when no bus
master is driving them. See Resistor Recommendations
for additional information.


When the HOLD signal is made inactive, the 386 SX
Microprocessor will deactivate HLDA and drive the
bus. One rising edge on the NMI input is remembered
for processing after the HOLD input is negated.


Table 5.3. Output Pin State During HOLD


Pin Value   Pin Names
    1       HLDA
  Float     LOCK#, M/IO#, D/C#, W/R#, ADS#,
            A23-A1, BHE#, BLE#, D15-D0


In addition to the normal usage of Hold Acknowl- 
edge with DMA controllers or master peripherals, 
the near-complete isolation has particular attractive- 
ness during system test when test equipment drives 
the system, and in hardware fault-tolerant applications.


HOLD Latencies


The maximum possible HOLD latency depends on
the software being executed. The actual HOLD
latency at any time depends on the current bus
activity, the state of the LOCK# signal (internal to the
CPU) activated by the LOCK prefix, and interrupts.
The 386 SX Microprocessor will not honor a HOLD
request until the current bus operation is complete.


The 386 SX Microprocessor breaks 32-bit data or
I/O accesses into 2 internally locked 16-bit bus
cycles; the LOCK# signal is not asserted. The 386 SX
Microprocessor breaks unaligned 16-bit or 32-bit
data or I/O accesses into 2 or 3 internally locked
16-bit bus cycles. Again, the LOCK# signal is not
asserted but a HOLD request will not be recognized
until the end of the entire transfer.


Wait states affect HOLD latency. The 386 SX Micro- 
processor will not honor a HOLD request until the 
end of the current bus operation, no matter how 
many wait states are required. Systems with DMA 
where data transfer is critical must ensure that
READY # returns sufficiently soon. 


COPROCESSOR INTERFACE SIGNALS
(PEREQ, BUSY#, ERROR#)


In the following sections are descriptions of signals
dedicated to the numeric coprocessor interface. In 
addition to the data bus, address bus, and bus cycle 
definition signals, these following signals control 
communication between the 386 SX Microprocessor 
and its 387™ SX processor extension. 


Coprocessor Request (PEREQ) 


When asserted (HIGH), this input signal indicates a 
coprocessor request for a data operand to be trans- 
ferred to/from memory by the 386 SX Microproces- 
sor. In response, the 386 SX Microprocessor trans- 
fers information between the coprocessor and 
memory. Because the 386 SX Microprocessor has 
internally stored the coprocessor opcode being exe- 
cuted, it performs the requested data transfer with 
the correct direction and memory address. 


PEREQ is a level-sensitive active HIGH asynchronous
signal. Setup and hold times, t29 and t30, relative
to the CLK2 signal must be met to guarantee
recognition at a particular clock edge. This signal is 
provided with a weak internal pull-down resistor of 
around 20 K-ohms to ground so that it will not float 
active when left unconnected. 


Coprocessor Busy (BUSY#)


When asserted (LOW), this input indicates the co- 
processor is still executing an instruction, and is not 
yet able to accept another. When the 386 SX Micro- 
processor encounters any coprocessor instruction 
which operates on the numerics stack (e.g. load, 
pop, or arithmetic operation), or the WAIT instruc- 
tion, this input is first automatically sampled until it is 
seen to be inactive. This sampling of the BUSY#
input prevents overrunning the execution of a previous
coprocessor instruction.


The FNINIT, FNSTENV, FNSAVE, FNSTSW,
FNSTCW and FNCLEX coprocessor instructions are
allowed to execute even if BUSY# is active, since
these instructions are used for coprocessor initialization
and exception-clearing.


BUSY# is an active LOW, level-sensitive asynchronous
signal. Setup and hold times, t29 and t30, relative
to the CLK2 signal must be met to guarantee
recognition at a particular clock edge. This pin is
provided with a weak internal pull-up resistor of around
20 K-ohms to Vcc so that it will not float active when
left unconnected.


BUSY# serves an additional function. If BUSY# is 
sampled LOW at the falling edge of RESET, the 386 
SX Microprocessor performs an internal self-test 
(see Bus Activity During and Following Reset). If 
BUSY# is sampled HIGH, no self-test is performed. 


Coprocessor Error (ERROR#) 


When asserted (LOW), this input signal indicates 
that the previous coprocessor instruction generated 
a coprocessor error of a type not masked by the 
coprocessor’s control register. This input is 
automatically sampled by the 386 SX Microprocessor when 
a coprocessor instruction is encountered, and if 
active, the 386 SX Microprocessor generates 
exception 16 to access the error-handling software. 


Several coprocessor instructions, generally those 
which clear the numeric error flags in the coproces- 
sor or save coprocessor state, do execute without 
the 386 SX Microprocessor generating exception 16 
even if ERROR# is active. These instructions are 
FNINIT, FNCLEX, FNSTSW, FNSTSW AX, 
FNSTCW, FNSTENV and FNSAVE. 


ERROR# is an active LOW, level-sensitive asyn- 
chronous signal. Setup and hold times, t29 and t30, 
relative to the CLK2 signal must be met to guarantee 
recognition at a particular clock edge. This pin is pro- 
vided with a weak internal pull-up resistor of around 
20 K-ohms to Vcc so that it will not float active when 
left unconnected. 


INTERRUPT SIGNALS (INTR, NMI, RESET) 


The following descriptions cover inputs that can in- 
terrupt or suspend execution of the processor’s cur- 
rent instruction stream. 


Maskable Interrupt Request (INTR) 


When asserted, this input indicates a request for 
interrupt service, which can be masked by the 386 SX 
CPU Flag Register IF bit. When the 386 SX 
Microprocessor responds to the INTR input, it performs 
two interrupt acknowledge bus cycles and, at the 
end of the second, latches an 8-bit interrupt vector 
on D7-D0 to identify the source of the interrupt. 


INTR is an active HIGH, level-sensitive asynchronous 
signal. Setup and hold times, t27 and t28, relative 
to the CLK2 signal must be met to guarantee 
recognition at a particular clock edge. To assure 
recognition of an INTR request, INTR should remain 
active until the first interrupt acknowledge bus cycle 
begins. INTR is sampled at the beginning of every 
instruction in the 386 SX Microprocessor’s Execution 
Unit. In order to be recognized at a particular 
instruction boundary, INTR must be active at least 
eight CLK2 clock periods before the beginning of the 
instruction. If recognized, the 386 SX Microprocessor 
will begin execution of the interrupt. 


Non-Maskable Interrupt Request (NMI) 


This input indicates a request for interrupt service 
which cannot be masked by software. The non- 
maskable interrupt request is always processed ac- 
cording to the pointer or gate in slot 2 of the interrupt 
table. Because of the fixed NMI slot assignment, no 
interrupt acknowledge cycles are performed when 
processing NMI. 


NMI is an active HIGH, rising edge-sensitive asyn- 
chronous signal. Setup and hold times, t27 and t28, 
relative to the CLK2 signal must be met to guarantee 
recognition at a particular clock edge. To assure rec- 
ognition of NMI, it must be inactive for at least eight 
CLK2 periods, and then be active for at least eight 
CLK2 periods before the beginning of the instruction 
boundary in the 386 SX Microprocessor’s Execution 
Unit. 


Once NMI processing has begun, no additional 
NMI’s are processed until after the next IRET in- 
struction, which is typically the end of the NMI serv- 
ice routine. If NMI is re-asserted prior to that time, 
however, one rising edge on NMI will be remem- 
bered for processing after executing the next IRET 
instruction. 
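

The NMI blocking rule above can be sketched as a small behavioral model. This is only an illustration of the described behavior (one remembered rising edge while a service routine is running), not a cycle-accurate model; the class and method names are hypothetical.

```python
class NmiLatch:
    """Toy model of NMI edge capture as described in the text (hypothetical names)."""

    def __init__(self):
        self.in_service = False  # True from NMI entry until the next IRET
        self.pending = False     # at most one rising edge is remembered
        self.serviced = 0        # number of times NMI processing has begun

    def rising_edge(self):
        """Call once for each rising edge seen on the NMI pin."""
        if self.in_service:
            self.pending = True  # further edges collapse into a single one
        else:
            self._enter()

    def iret(self):
        """Call when the processor executes IRET."""
        self.in_service = False
        if self.pending:
            self.pending = False
            self._enter()

    def _enter(self):
        self.in_service = True
        self.serviced += 1
```

With this model, three rising edges during one service routine lead to exactly one further NMI after the IRET, which matches the statement that only one edge is remembered.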


Interrupt Latency 


The time that elapses before an interrupt request is 
serviced (interrupt latency) varies according to sev- 
eral factors. This delay must be taken into account 
by the interrupt source. Any of the following factors 
can affect interrupt latency: 


1. If interrupts are masked, an INTR request will not 
   be recognized until interrupts are reenabled. 


2. If an NMI is currently being serviced, an incoming 
   NMI request will not be recognized until the 
   386 SX Microprocessor encounters the IRET 
   instruction. 


3. An interrupt request is recognized only on an 
   instruction boundary of the 386 SX Microprocessor’s 
   Execution Unit except for the following cases: 
   - Repeat string instructions can be interrupted 
     after each iteration. 
   - If the instruction loads the Stack Segment 
     register, an interrupt is not processed until after 
     the following instruction, which should load 
     ESP. This allows the entire stack pointer to be 
     loaded without interruption. 
   - If an instruction sets the interrupt flag (enabling 
     interrupts), an interrupt is not processed until 
     after the next instruction. 

   The longest latency occurs when the interrupt 
   request arrives while the 386 SX Microprocessor is 
   executing a long instruction such as multiplication, 
   division, or a task-switch in the protected mode. 


4. Saving the Flags register and CS:EIP registers. 


5. If the interrupt service routine requires a task 
   switch, time must be allowed for the task switch. 


6. If the interrupt service routine saves registers that 
   are not automatically saved by the 386 SX 
   Microprocessor. 


RESET 


This input signal suspends any operation in progress 
and places the 386 SX Microprocessor in a known 
reset state. The 386 SX Microprocessor is reset by 
asserting RESET for 15 or more CLK2 periods (80 or 
more CLK2 periods before requesting self-test). 
When RESET is active, all other input pins, except 
FLT#, are ignored, and all other bus pins are driven 
to an idle bus state as shown in Table 5.5. If RESET 
and HOLD are both active at a point in time, RESET 
takes priority even if the 386 SX Microprocessor was 
in a Hold Acknowledge state prior to RESET active. 


RESET is an active HIGH, level-sensitive synchro- 
nous signal. Setup and hold times, t25 and t26, must 
be met in order to assure proper operation of the 
386 SX Microprocessor. 


Table 5.5. Pin State (Bus Idle) During Reset 


Pin Name        Signal Level During Reset 
ADS# 
D15-D0 
BHE#, BLE# 
A23-A1 
W/R# 
D/C# 
M/IO# 
LOCK# 
HLDA 


5.2 Bus Transfer Mechanism 


All data transfers occur as a result of one or more 
bus cycles. Logical data operands of byte and word 
lengths may be transferred without restrictions on 
physical address alignment. Any byte boundary may 
be used, although two physical bus cycles are 
performed as required for unaligned operand transfers. 


The 386 SX Microprocessor address signals are de- 
signed to simplify external system hardware. Higher- 
order address bits are provided by A23-A1. BHE# 
and BLE# provide linear selects for the two bytes of 
the 16-bit data bus. 


Byte Enable outputs BHE# and BLE# are asserted 
when their associated data bus bytes are involved 
with the present bus cycle, as listed in Table 5.6. 


Table 5.6. Byte Enables and Associated Data 
and Operand Bytes 


Byte Enable Signal    Associated Data Bus Signals 
BLE#                  D7-D0   (byte 0 - least significant) 
BHE#                  D15-D8  (byte 1 - most significant) 


Each bus cycle is composed of at least two bus 
states. Each bus state requires one processor clock 
period. Additional bus states added to a single bus 
cycle are called wait states. See section 5.4 Bus 
Functional Description. 
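

As an informal illustration of Table 5.6 and of the unaligned-transfer rule in this section, the sketch below lists the bus cycles needed for a byte or word operand at a given physical address. The function name and the tuple layout are assumptions made for the example, the order of the two cycles for an unaligned word is not modelled, and address wraparound at the top of memory is handled by simple masking.

```python
def bus_cycles(addr, size):
    """Return a list of (word_address, BHE#, BLE#) tuples for one operand.

    word_address is the value carried on A23-A1 (bit 0 cleared).
    BHE# and BLE# are pin levels: 0 means asserted (active LOW), 1 means negated.
    Only byte (size 1) and word (size 2) operands are modelled.
    """
    if size not in (1, 2):
        raise ValueError("only byte and word operands are modelled")
    word = addr & 0xFFFFFE
    odd = addr & 1
    if size == 1:
        # An odd byte travels on D15-D8 (BHE#), an even byte on D7-D0 (BLE#).
        return [(word, 0, 1)] if odd else [(word, 1, 0)]
    if not odd:
        # Aligned word: both byte enables asserted in a single cycle.
        return [(word, 0, 0)]
    # Unaligned word: one cycle for the odd byte, one for the next even byte.
    return [(word, 0, 1), ((word + 2) & 0xFFFFFE, 1, 0)]
```

For example, `bus_cycles(0x1001, 2)` yields two cycles, while `bus_cycles(0x1000, 2)` yields one with both byte enables asserted.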


5.3 Memory and I/O Spaces 


Bus cycles may access physical memory space or 
I/O space. Peripheral devices in the system may 
either be memory-mapped, or I/O-mapped, or both. 
As shown in Figure 5.3, physical memory addresses 
range from 0000000H to 0FFFFFFH (16 megabytes) 
and I/O addresses from 000000H to 00FFFFH (64 
kilobytes). Note the I/O addresses used by the 
automatic I/O cycles for coprocessor communication are 
8000F8H to 8000FFH, beyond the address range of 
programmed I/O, to allow easy generation of a 
coprocessor chip select signal using the A23 and 
M/IO# signals. 
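

The coprocessor select described above reduces to a two-input decode. The following sketch checks that decode against the address ranges quoted in this section; the function and constant names are hypothetical, and real hardware would implement this as a gate rather than software.

```python
COPROCESSOR_IO = range(0x8000F8, 0x800100)  # 8000F8H to 8000FFH
PROGRAMMED_IO = range(0x000000, 0x010000)   # 000000H to 00FFFFH


def coprocessor_select(addr, m_io_n):
    """Return True when A23 is HIGH and M/IO# is LOW.

    addr is the 24-bit address; m_io_n is the M/IO# pin level
    (0 for an I/O cycle, 1 for a memory cycle).
    """
    a23 = (addr >> 23) & 1
    return a23 == 1 and m_io_n == 0
```

Because every programmed I/O address has A23 LOW, the decode never fires for ordinary I/O cycles, and memory cycles are excluded by M/IO# HIGH.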


5.4 Bus Functional Description 


The 386 SX Microprocessor has separate, parallel 
buses for data and address. The data bus is 16 bits 
in width, and bidirectional. The address bus provides 
a 24-bit value using 23 signals for the 23 upper-order 
address bits and 2 Byte Enable signals to directly 
indicate the active bytes. These buses are 
interpreted and controlled by several definition signals. 


The definition of each bus cycle is given by three 
signals: M/IO#, W/R# and D/C#. At the same 
time, a valid address is present on the byte enable 
signals, BHE# and BLE#, and the other address 
signals A23-A1. A status signal, ADS#, indicates 
when the 386 SX Microprocessor issues a new bus 
cycle definition and address. 


[Figure 5.3 (240187-18): memory map. The physical memory space spans 
000000H to FFFFFFH (16 Mbyte). In the I/O space, 000000H to 00FFFFH 
(64 kbyte) is the accessible programmed I/O space, 8000F8H to 8000FFH 
is used for coprocessor communication (see NOTE), and all other I/O 
addresses are not accessible.] 

NOTE: 
Since A23 is HIGH during automatic communication with coprocessor, A23 HIGH and M/IO# LOW can be used to 
easily generate a coprocessor select signal. 


Figure 5.3. Physical Memory and I/O Spaces 


[Figure 5.4 (240187-19): timing diagram of three consecutive 
non-pipelined read cycles, each made up of T1 and T2. Signals shown 
include CLK2, BHE#, BLE#, A1-A23, M/IO#, D/C#, W/R#, ADS#, READY#, 
LOCK# and D0-D15.] 

Fastest non-pipelined bus cycles consist of T1 and T2 


Figure 5.4. Fastest Read Cycles with Non-pipelined Address Timing 


Collectively, the address bus, data bus and all 
associated control signals are referred to simply as ‘the 
bus’. When active, the bus performs one of the bus 
cycles below: 


1. Read from memory space 
2. Locked read from memory space 
3. Write to memory space 
4. Locked write to memory space 
5. Read from I/O space (or coprocessor) 
6. Write to I/O space (or coprocessor) 
7. Interrupt acknowledge (always locked) 
8. Indicate halt, or indicate shutdown 


Table 5.2 shows the encoding of the bus cycle 
definition signals for each bus cycle. See Bus Cycle 
Definition Signals for additional information. 
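

For readers without Table 5.2 at hand, the sketch below decodes the three definition signals into a cycle type. The encoding used here is an assumption based on the usual 386-family assignment and is not taken from this section; Table 5.2 remains the authority. The function and dictionary names are hypothetical.

```python
# Keys are (M/IO#, D/C#, W/R#) pin levels: 1 = HIGH, 0 = LOW.
# Assumed encoding; verify against Table 5.2 before relying on it.
BUS_CYCLE_TYPES = {
    (0, 0, 0): "interrupt acknowledge",
    (0, 0, 1): "does not occur",
    (0, 1, 0): "I/O data read",
    (0, 1, 1): "I/O data write",
    (1, 0, 0): "memory code read",
    (1, 0, 1): "halt or shutdown",
    (1, 1, 0): "memory data read",
    (1, 1, 1): "memory data write",
}


def decode_bus_cycle(m_io, d_c, w_r, lock_n=1):
    """Return (cycle type, locked) for the sampled pin levels.

    lock_n is the LOCK# pin level; the cycle is locked when LOCK# is LOW.
    """
    return BUS_CYCLE_TYPES[(m_io, d_c, w_r)], lock_n == 0
```

External logic would normally sample these pins while ADS# is asserted, since that is when the definition is stated to be valid.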


When the 386 SX Microprocessor bus is not 
performing one of the activities listed above, it is either 
Idle or in the Hold Acknowledge state, which may be 
detected externally. The idle state can be identified 
by the 386 SX Microprocessor giving no further 
assertions on its address strobe output (ADS#) since 
the beginning of its most recent bus cycle, and the 
most recent bus cycle having been terminated. The 
hold acknowledge state is identified by the 386 SX 
Microprocessor asserting its hold acknowledge 
(HLDA) output. 


The shortest time unit of bus activity is a bus state. A 
bus state is one processor clock period (two CLK2 
periods) in duration. A complete data transfer occurs 
during a bus cycle, composed of two or more bus 
states. 


The fastest 386 SX Microprocessor bus cycle re- 
quires only two bus states. For example, three con- 
secutive bus read cycles, each consisting of two bus 
states, are shown by Figure 5.4. The bus states in 
each cycle are named T1 and T2. Any memory or 
I/O address may be accessed by such a two-state 
bus cycle, if the external hardware is fast enough. 


[Figure 5.5: timing diagram of three consecutive pipelined read 
cycles, each made up of T1P and T2P. Signals shown include CLK2, 
BHE#, BLE#, A1-A23, M/IO#, D/C#, W/R#, ADS#, NA#, READY#, LOCK# 
and D0-D15.] 

Fastest pipelined bus cycles consist of T1P and T2P 


Figure 5.5. Fastest Read Cycles with Pipelined Address Timing 


Every bus cycle continues until it is acknowledged 
by the external system hardware, using the 386 SX 
Microprocessor READY# input. Acknowledging the 
bus cycle at the end of the first T2 results in the 
shortest bus cycle, requiring only T1 and T2. If 
READY# is not immediately asserted however, T2 
states are repeated indefinitely until the READY# 
input is sampled active. 


The address pipelining option provides a choice of 
bus cycle timings. Pipelined or non-pipelined ad- 
dress timing is selectable on a cycle-by-cycle basis 
with the Next Address (NA#) input. 


When address pipelining is selected, the address 
(BHE#, BLE# and A23-A1) and definition (W/R#, 
D/C#, M/IO# and LOCK#) of the next cycle are 
available before the end of the current cycle. To 
signal their availability, the 386 SX Microprocessor 
address status output (ADS#) is asserted. Figure 5.5 
illustrates the fastest read cycles with pipelined 
address timing. 


Note from Figure 5.5 the fastest bus cycles using 
pipelined address require only two bus states, 
named T1P and T2P. Therefore cycles with pipe- 
lined address timing allow the same data bandwidth 
as non-pipelined cycles, but address-to-data access 
time is increased by one T-state time compared to 
that of a non-pipelined cycle. 


READ AND WRITE CYCLES 


Data transfers occur as a result of bus cycles, 
classified as read or write cycles. During read cycles, data 
is transferred from an external device to the 
processor. During write cycles, data is transferred from the 
processor to an external device. 


[Figure 5.6: timing diagram of four non-pipelined bus cycles with 
zero wait states: Cycle 1 (write), Cycle 2 (read), Cycle 3 (write) 
and Cycle 4 (read), with idle states after Cycles 3 and 4.] 

Idle states are shown here for diagram variety only. Write cycles are not always followed by an idle state. An active bus 
cycle can immediately follow the write cycle. 


Figure 5.6. Various Bus Cycles with Non-Pipelined Address (zero wait states) 


Two choices of address timing are dynamically 
selectable: non-pipelined or pipelined. After an idle bus 
state, the processor always uses non-pipelined 
address timing. However the NA# (Next Address) 
input may be asserted to select pipelined address 
timing for the next bus cycle. When pipelining is 
selected and the 386 SX Microprocessor has a bus 
request pending internally, the address and definition 
of the next cycle is made available even before the 
current bus cycle is acknowledged by READY#. 


Terminating a read or write cycle, like any bus cycle, 
requires acknowledging the cycle by asserting the 
READY# input. Until acknowledged, the processor 
inserts wait states into the bus cycle, to allow 
adjustment for the speed of any external device. External 
hardware, which has decoded the address and bus 
cycle type, asserts the READY# input at the 
appropriate time. 


At the end of the second bus state within the bus 
cycle, READY# is sampled. At that time, if external 
hardware acknowledges the bus cycle by asserting 
READY#, the bus cycle terminates as shown in 
Figure 5.6. If READY# is negated as in Figure 5.7, the 
386 SX Microprocessor executes another bus state 
(a wait state) and READY# is sampled again at the 
end of that state. This continues indefinitely until the 
cycle is acknowledged by READY# asserted. 
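

The effect of wait states on cycle length follows directly from the definitions above: a bus state is two CLK2 periods, and a cycle is two bus states plus one extra state per wait state. A minimal sketch of that arithmetic follows; the function names are hypothetical, and the 32 MHz CLK2 figure in the usage note is an assumed example rather than a value from this section.

```python
def bus_cycle_clk2_periods(wait_states):
    """CLK2 periods in one bus cycle: (2 states + wait states) * 2 CLK2 each."""
    if wait_states < 0:
        raise ValueError("wait_states must be non-negative")
    return 2 * (2 + wait_states)


def bus_cycle_ns(wait_states, clk2_mhz):
    """Duration of one bus cycle in nanoseconds for a given CLK2 frequency."""
    return bus_cycle_clk2_periods(wait_states) * 1000.0 / clk2_mhz
```

Assuming CLK2 runs at 32 MHz, `bus_cycle_ns(0, 32.0)` gives 125.0 ns for a zero-wait-state cycle, and each wait state adds two more CLK2 periods.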


When the current cycle is acknowledged, the 
386 SX Microprocessor terminates it. When a read 
cycle is acknowledged, the 386 SX Microprocessor 
latches the information present at its data pins. 
When a write cycle is acknowledged, the 386 SX 
CPU’s write data remains valid throughout phase 
one of the next bus state, to provide write data hold 
time. 


[Figure 5.7: timing diagram of non-pipelined bus cycles with 
various numbers of wait states, starting from an idle bus: Cycle 1 
(read), Cycle 2 (write) and Cycle 3 (read).] 

Idle states are shown here for diagram variety only. Write cycles are not always followed by an idle state. An active bus 
cycle can immediately follow the write cycle. 


Figure 5.7. Various Bus Cycles with Non-Pipelined Address (various number of wait states) 


Non-Pipelined Address 


Any bus cycle may be performed with non-pipelined 
address timing. For example, Figure 5.6 shows a 
mixture of read and write cycles with non-pipelined 
address timing. Figure 5.6 shows that the fastest 
possible cycles with non-pipelined address have two 
bus states per bus cycle. The states are named T1 
and T2. In phase one of T1, the address signals and 
bus cycle definition signals are driven valid and, to 
signal their availability, address strobe (ADS#) is 
simultaneously asserted. 


During read or write cycles, the data bus behaves as 
follows. If the cycle is a read, the 386 SX Microproc- 
essor floats its data signals to allow driving by the 
external device being addressed. The 386 SX Mi- 
croprocessor requires that all data bus pins be 
at a valid logic state (HIGH or LOW) at the end of 
each read cycle, when READY# is asserted. The 
system MUST be designed to meet this require- 
ment. If the cycle is a write, data signals are driven 
by the 386 SX Microprocessor beginning in phase 
two of T1 until phase one of the bus state following 
cycle acknowledgment. 


Figure 5.7 illustrates non-pipelined bus cycles with 
one wait state added to Cycles 2 and 3. READY# is 
sampled inactive at the end of the first T2 in Cycles 
2 and 3. Therefore Cycles 2 and 3 have T2 repeated 
again. At the end of the second T2, READY# is 
sampled active. 


When address pipelining is not used, the address 
and bus cycle definition remain valid during all wait 
states. When wait states are added and it is desir- 
able to maintain non-pipelined address timing, it is 
necessary to negate NA# during each T2 state ex- 
cept the last one, as shown in Figure 5.7 Cycles 2 
and 3. If NA# is sampled active during a T2 other 
than the last one, the next state would be T2I or T2P 
instead of another T2. 


When address pipelining is not used, the bus states 
and transitions are completely illustrated by Figure 
5.8. The bus transitions between four possible 
states, T1, T2, Ti, and Th. Bus cycles consist of T1 
and T2, with T2 being repeated for wait states. Oth- 
erwise the bus may be idle, Ti, or in the hold ac- 
knowledge state Th. 


[Figure 5.8: state diagram of the four bus states T1, T2, Ti and Th. 
Transitions are labeled with the RESET, HOLD, READY# and NA# inputs 
and with whether a bus request is pending.] 

Bus States: 
T1: first clock of a non-pipelined bus cycle (386 SX CPU drives new address and asserts ADS#). 
T2: subsequent clocks of a bus cycle when NA# has not been sampled asserted in the current bus cycle. 
Ti: idle state. 
Th: hold acknowledge state (386 SX CPU asserts HLDA). 
The fastest bus cycle consists of two states, T1 and T2. 

Four basic bus states describe bus operation when not using pipelined address. 


Figure 5.8. Bus States (not using pipelined address) 


Bus cycles always begin with T1. T1 always leads to 
T2. If a bus cycle is not acknowledged during T2 and 
NA# is inactive, T2 is repeated. When a cycle is 
acknowledged during T2, the following state will be 
T1 of the next bus cycle if a bus request is pending 
internally, or Ti if there is no bus request pending, or 
Th, if the HOLD input is being asserted. 
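

The transitions just described can be written as a small next-state function. This is a sketch for the non-pipelined case only (NA# is assumed to stay negated), the function name is hypothetical, and giving HOLD priority over a pending request when leaving T2, Ti or Th is an assumption drawn from the transition labels of Figure 5.8.

```python
def next_state_non_pipelined(state, ready, hold, request):
    """Return the next bus state when address pipelining is not used.

    state   : one of "T1", "T2", "Ti", "Th"
    ready   : True when READY# is sampled asserted
    hold    : True when the HOLD input is asserted
    request : True when a bus request is pending internally
    """
    if state not in ("T1", "T2", "Ti", "Th"):
        raise ValueError("unknown bus state: %r" % (state,))
    if state == "T1":
        return "T2"               # T1 always leads to T2
    if state == "T2" and not ready:
        return "T2"               # wait state: T2 is repeated
    # Cycle acknowledged in T2, or the bus is currently in Ti or Th.
    if hold:
        return "Th"
    return "T1" if request else "Ti"
```

Stepping this function with `ready=False` for one state reproduces a one-wait-state cycle (T1, T2, T2), as in Figure 5.7 Cycles 2 and 3.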


Use of pipelined address allows the 386 SX Micro- 
processor to enter three additional bus states not 
shown in Figure 5.8. Figure 5.12 is the complete bus 
state diagram, including pipelined address cycles. 


Pipelined Address 


Address pipelining is the option of requesting the 
address and the bus cycle definition of the next 
internally pending bus cycle before the current bus 
cycle is acknowledged with READY# asserted. 
ADS# is asserted by the 386 SX Microprocessor 
when the next address is issued. The address 
pipelining option is controlled on a cycle-by-cycle basis 
with the NA# input signal. 


Once a bus cycle is in progress and the current ad- 
dress has been valid for at least one entire bus 
state, the NA# input is sampled at the end of every 
phase one until the bus cycle is acknowledged. Dur- 
ing non-pipelined bus cycles NA# is sampled at the 
end of phase one in every T2. An example is Cycle 2 
in Figure 5.9, during which NA# is sampled at the 
end of phase one of every T2 (it was asserted once 
during the first T2 and has no further effect during 
that bus cycle). 


[Figure 5.9 (240187-24): timing diagram of a burst of four bus 
cycles following an idle state: Cycle 1 (non-pipelined write), 
Cycle 2 (non-pipelined read), Cycle 3 (pipelined write) and Cycle 4 
(pipelined read), followed by an idle state.] 

Following any idle bus state (Ti), addresses are non-pipelined. Within non-pipelined bus cycles, NA# is only sampled 
during wait states. Therefore, to begin address pipelining during a group of non-pipelined bus cycles requires a 
non-pipelined cycle with at least one wait state (Cycle 2 above). 


Figure 5.9. Transitioning to Pipelined Address During Burst of Bus Cycles 


If NA# is sampled active, the 386 SX Microprocessor 
is free to drive the address and bus cycle 
definition of the next bus cycle, and assert ADS#, as 
soon as it has a bus request internally pending. It 
may drive the next address as early as the next bus 
state, whether the current bus cycle is 
acknowledged at that time or not. 


Regarding the details of address pipelining, the 
386 SX Microprocessor has the following 
characteristics: 


1. The next address may appear as early as the bus 
   state after NA# was sampled active (see Figures 
   5.9 or 5.10). In that case, state T2P is entered 
   immediately. However, when there is not an 
   internal bus request already pending, the next address 
   will not be available immediately after NA# is 
   asserted and T2I is entered instead of T2P (see 
   Figure 5.11 Cycle 3). Provided the current bus cycle 
   isn’t yet acknowledged by READY# asserted, 
   T2P will be entered as soon as the 386 SX 
   Microprocessor does drive the next address. External 
   hardware should therefore observe the ADS# 
   output as confirmation the next address is 
   actually being driven on the bus. 


2. Any address which is validated by a pulse on the 
   ADS# output will remain stable on the address 
   pins for at least two processor clock periods. The 
   386 SX Microprocessor cannot produce a new 
   address more frequently than every two 
   processor clock periods (see Figures 5.9, 5.10, and 
   5.11). 


3. Only the address and bus cycle definition of the 
   very next bus cycle is available. The pipelining 
   capability cannot look further than one bus cycle 
   ahead (see Figure 5.11 Cycle 1). 


Following any idle bus state (Ti) the address is always non-pipelined and NA# is only sampled during wait states. To start 
address pipelining after an idle state requires a non-pipelined cycle with at least one wait state (Cycle 1 above). 

The pipelined cycles (2, 3, 4 above) are shown with various numbers of wait states. 


Figure 5.10. Fastest Transition to Pipelined Address Following Idle Bus State 


The complete bus state transition diagram, including 
operation with pipelined address, is given by Figure 
5.12. Note it is a superset of the diagram for 
non-pipelined address only, and the three additional bus 
states for pipelined address are drawn in bold. 


The fastest bus cycle with pipelined address 
consists of just two bus states, T1P and T2P (recall for 
non-pipelined address it is T1 and T2). T1P is the 
first bus state of a pipelined cycle. 


[Figure 5.11: timing diagram of four pipelined bus cycles: Cycle 1 
(write), Cycle 2 (read), Cycle 3 (write) and Cycle 4 (read).] 


Figure 5.11. Details of Address Pipelining During Cycles with Wait States 


[Figure 5.12 (240187-27): state diagram of the bus states T1, T2, 
T2I, T2P, T1P, Ti and Th.] 

Bus States: 
T1: first clock of a non-pipelined bus cycle (386 SX CPU drives new 
address and asserts ADS#). 
T2: subsequent clocks of a bus cycle when NA# has not been 
sampled asserted in the current bus cycle. 
T2I: subsequent clocks of a bus cycle when NA# has been 
sampled asserted in the current bus cycle but there is not yet 
an internal bus request pending (386 SX CPU will not drive new 
address or assert ADS#). 
T2P: subsequent clocks of a bus cycle when NA# has been 
sampled asserted in the current bus cycle and there is an 
internal bus request pending (386 SX CPU drives new address and 
asserts ADS#). 
T1P: first clock of a pipelined bus cycle. 
Ti: idle state. 
Th: hold acknowledge state (386 SX CPU asserts HLDA). 

Asserting NA# for pipelined address gives access to three 
more bus states: T2I, T2P and T1P. 

Using pipelined address, the fastest bus cycle consists of T1P 
and T2P. 


Figure 5.12. Complete Bus States (including pipelined address) 


Initiating and Maintaining Pipelined Address 


Using the state diagram Figure 5.12, observe the 
transitions from an idle state, Ti, to the beginning of 
a pipelined bus cycle T1P. From an idle state, Ti, the 
first bus cycle must begin with T1, and is therefore a 
non-pipelined bus cycle. The next bus cycle will be 
pipelined, however, provided NA# is asserted and 
the first bus cycle ends in a T2P state (the address 
for the next bus cycle is driven during T2P). The 
fastest path from an idle state to a bus cycle with 
pipelined address is shown below: 

Ti, Ti, Ti,     T1 - T2 - T2P,    T1P - T2P, 
idle            non-pipelined     pipelined 
states          cycle             cycle 


T1-T2-T2P are the states of the bus cycle that 
establish address pipelining for the next bus cycle, 
which begins with T1P. The same is true after a bus 
hold state, shown below: 

Th, Th, Th,         T1 - T2 - T2P,    T1P - T2P, 
hold acknowledge    non-pipelined     pipelined 
states              cycle             cycle 


The transition to pipelined address is shown 
functionally by Figure 5.10 Cycle 1. Note that Cycle 1 is 
used to transition into pipelined address timing for 
the subsequent Cycles 2, 3 and 4, which are 
pipelined. The NA# input is asserted at the appropriate 
time to select address pipelining for Cycles 2, 3 and 
4. 


Once a bus cycle is in progress and the current 
address has been valid for one entire bus state, the 
NA# input is sampled at the end of every phase one 
until the bus cycle is acknowledged. Sampling 
begins in T2 during Cycle 1 in Figure 5.10. Once NA# 
is sampled active during the current cycle, the 
386 SX Microprocessor is free to drive a new 
address and bus cycle definition on the bus as early as 
the next bus state. In Figure 5.10 Cycle 1 for 
example, the next address is driven during state T2P. 
Thus Cycle 1 makes the transition to pipelined 
address timing, since it begins with T1 but ends with 
T2P. Because the address for Cycle 2 is available 
before Cycle 2 begins, Cycle 2 is called a pipelined 
bus cycle, and it begins with T1P. Cycle 2 begins as 
soon as READY# asserted terminates Cycle 1. 


Examples of transition bus cycles are Figure 5.10 
Cycle 1 and Figure 5.9 Cycle 2. Figure 5.10 shows 
transition during the very first cycle after an idle bus 
state, which is the fastest possible transition into 
address pipelining. Figure 5.9 Cycle 2 shows a 
transition cycle occurring during a burst of bus cycles. In 
any case, a transition cycle is the same whenever it 
occurs: it consists at least of T1, T2 (NA# is 
asserted at that time), and T2P (provided the 386 SX 
Microprocessor has an internal bus request already 
pending, which it almost always has). T2P states are 
repeated if wait states are added to the cycle. 


Note that only three states (T1, T2 and T2P) are 
required in a bus cycle performing a transition from 
non-pipelined address into pipelined address timing, 
for example Figure 5.10 Cycle 1. Figure 5.10 Cycles 
2, 3 and 4 show that address pipelining can be 
maintained with two-state bus cycles consisting only of 
T1P and T2P. 


Once a pipelined bus cycle is in progress, pipelined 
timing is maintained for the next cycle by asserting 
NA# and detecting that the 386 SX Microprocessor 
enters T2P during the current bus cycle. The current 
bus cycle must end in state T2P for pipelining to be 
maintained in the next cycle. T2P is identified by the 
assertion of ADS#. Figures 5.9 and 5.10 however, 
each show pipelining ending after Cycle 4 because 
Cycle 4 ends in T2I. This indicates the 386 SX 
Microprocessor didn’t have an internal bus request prior 
to the acknowledgement of Cycle 4. If a cycle ends 
with a T2 or T2I, the next cycle will not be pipelined. 
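

The rules for entering and keeping pipelined address timing can be collected into a next-state function over all seven bus states. This is a sketch reconstructed from the state descriptions in this section, with hypothetical names. Two points are assumptions rather than statements from the text: HOLD is taken to prevent the processor from issuing a new address, and T1P is taken to lead to T2 when NA# is negated.

```python
STATES = ("T1", "T2", "T2I", "T2P", "T1P", "Ti", "Th")


def next_bus_state(state, ready, na, hold, request):
    """Return the next bus state, including pipelined address states.

    ready, na, hold : True when READY#, NA# or HOLD is sampled asserted
    request         : True when a bus request is pending internally
    """
    if state not in STATES:
        raise ValueError("unknown bus state: %r" % (state,))
    can_issue = request and not hold   # CPU may drive the next address
    if state == "T1":
        return "T2"
    if state == "T2P":
        # Next address already driven, so the following cycle is pipelined.
        return "T1P" if ready else "T2P"
    if state == "T1P":
        if not na:
            return "T2"
        return "T2P" if can_issue else "T2I"
    if state in ("T2", "T2I") and not ready:
        if state == "T2" and not na:
            return "T2"                # plain wait state
        return "T2P" if can_issue else "T2I"
    # Cycle acknowledged in T2 or T2I, or the bus is in Ti or Th.
    if hold:
        return "Th"
    return "T1" if request else "Ti"
```

Starting from Ti with a request pending, the inputs "NA# asserted in T2, READY# asserted in T2P" give the sequence Ti, T1, T2, T2P, T1P, which is the fastest path into pipelined timing described earlier. A cycle that ends in T2I returns to T1 or Ti, so the next cycle is not pipelined.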


Realistically, address pipelining is almost always 
maintained as long as NA# is sampled asserted. 
This is so because in the absence of any other 
request, a code prefetch request is always internally 
pending until the instruction decoder and code 
prefetch queue are completely full. Therefore, address 
pipelining is maintained for long bursts of bus cycles, 
if the bus is available (i.e., HOLD inactive) and NA# 
is sampled active in each of the bus cycles. 


INTERRUPT ACKNOWLEDGE (INTA) CYCLES 


In response to an interrupt request on the INTR in- 
put when interrupts are enabled, the 386 SX Micro- 
processor performs two interrupt acknowledge cy- 
cles. These bus cycles are similar to read cycles in 
that bus definition signals define the type of bus ac- 
tivity taking place, and each cycle continues until ac- 
knowledged by READY # sampled active. 


The state of A» distinguishes the first and second 
interrupt acknowledge cycles. The byte address 
driven during the first interrupt acknowledge cycle is 
4 (Ao3-A3, Ai, BLE# LOW, Ao and BHE# HIGH). 
The byte address driven during the second interrupt 
acknowledge cycle is 0 (Ao3—Aj, BLE# LOW, and 
BHE# HIGH). 
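
As a small illustration of how external logic could tell the two cycles apart, the sketch below derives the cycle number from A2 alone. The function names are hypothetical; only the byte addresses 4 and 0 come from the text.

```python
def inta_byte_address(cycle: int) -> int:
    """Byte address driven during interrupt acknowledge cycle 1 or 2."""
    if cycle not in (1, 2):
        raise ValueError("there are exactly two interrupt acknowledge cycles")
    return 4 if cycle == 1 else 0


def inta_cycle_from_a2(a2_high: bool) -> int:
    """A2 HIGH marks the first acknowledge cycle, A2 LOW the second."""
    return 1 if a2_high else 2


for cycle in (1, 2):
    address = inta_byte_address(cycle)
    a2 = bool((address >> 2) & 1)  # A2 is bit 2 of the byte address
    print(f"cycle {cycle}: address {address}, A2 = {int(a2)}, "
          f"decoded as cycle {inta_cycle_from_a2(a2)}")
```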


The LOCK# output is asserted from the beginning of the first interrupt
acknowledge cycle until the end of the second interrupt acknowledge cycle.
Four idle bus states, Ti, are inserted by the 386 SX Microprocessor between
the two interrupt acknowledge cycles for compatibility with spec TRHRL of
the 8259A Interrupt Controller.


During both interrupt acknowledge cycles, D15-D0 float. No data is read at
the end of the first interrupt acknowledge cycle. At the end of the second
interrupt acknowledge cycle, the 386 SX Microprocessor will read an
external interrupt vector from D7-D0 of the data bus. The vector indicates
the specific interrupt number (from 0-255) requiring service.


[Timing diagram for Figure 5.13: previous cycle, interrupt acknowledge
cycle 1, four idle bus states (Ti), interrupt acknowledge cycle 2. Signals
shown include CLK2, the processor clock, READY# and D0-D15; data is ignored
in cycle 1 and the vector is read in cycle 2.]
Interrupt Vector (0-255) is read on D0-D7 at end of second Interrupt Acknowledge bus cycle.
Because each Interrupt Acknowledge bus cycle is followed by idle bus states, asserting NA# has no practical effect.
Choose the approach which is simplest for your system hardware design.


Figure 5.13. Interrupt Acknowledge Cycles 
HALT INDICATION CYCLE


The execution unit halts as a result of executing a HLT instruction.
Signaling its entrance into the halt state, a halt indication cycle is
performed. The halt indication cycle is identified by the state of the bus
definition signals shown on page 40, Bus Cycle Definition Signals, and an
address of 2. The halt indication cycle must be acknowledged by READY#
asserted. A halted 386 SX Microprocessor resumes execution when INTR (if
interrupts are enabled), NMI or RESET is asserted.


Figure 5.14. Example Halt Indication Cycle from Non-Pipelined Cycle
SHUTDOWN INDICATION CYCLE


The 386 SX Microprocessor shuts down as a result of a protection fault
while attempting to process a double fault. Signaling its entrance into the
shutdown state, a shutdown indication cycle is performed. The shutdown
indication cycle is identified by the state of the bus definition signals
shown in Bus Cycle Definition Signals and an address of 0. The shutdown
indication cycle must be acknowledged by READY# asserted. A shutdown 386 SX
Microprocessor resumes execution when NMI or RESET is asserted.


ENTERING AND EXITING HOLD ACKNOWLEDGE


The bus hold acknowledge state, Th, is entered in response to the HOLD
input being asserted. In the bus hold acknowledge state, the 386 SX
Microprocessor floats all outputs or bidirectional signals, except for
HLDA. HLDA is asserted as long as the 386 SX Microprocessor remains in the
bus hold acknowledge state. In the bus hold acknowledge state, all inputs
except HOLD, FLT# and RESET are ignored.
Figure 5.15. Example Shutdown Indication Cycle from Non-Pipelined Cycle
Th may be entered from a bus idle state as in Figure 5.16 or after the
acknowledgement of the current physical bus cycle if the LOCK# signal is
not asserted, as in Figures 5.17 and 5.18.


Th is exited in response to the HOLD input being negated. The following
state will be Ti as in Figure 5.16 if no bus request is pending. The
following bus state will be T1 if a bus request is internally pending, as
in Figures 5.17 and 5.18. Th is also exited in response to RESET being
asserted.

If a rising edge occurs on the edge-triggered NMI input while in Th, the
event is remembered as a non-maskable interrupt 2 and is serviced when Th
is exited unless the 386 SX Microprocessor is reset before Th is exited.
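
The exit conditions from Th can be summarized as a small decision function. This is a sketch of the prose only; the function name and the "RESET" label for the reset state are hypothetical.

```python
def state_after_hold(hold_asserted: bool, reset_asserted: bool,
                     bus_request_pending: bool) -> str:
    """Return the bus state that follows a Th state.

    RESET takes priority over HOLD; while HOLD stays asserted the processor
    remains in Th; once HOLD is negated it moves to T1 if a bus request is
    internally pending and to Ti otherwise.
    """
    if reset_asserted:
        return "RESET"
    if hold_asserted:
        return "Th"
    return "T1" if bus_request_pending else "Ti"


print(state_after_hold(True, False, True))    # HOLD still asserted
print(state_after_hold(False, False, False))  # released, bus idle
print(state_after_hold(False, False, True))   # released, request pending
```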


[Timing diagram for Figure 5.16: idle (Ti) and hold acknowledge (Th)
states. Signals shown include CLK2, the processor clock, HOLD, HLDA, BHE#,
BLE#, A1-A23, M/IO#, D/C#, W/R#, LOCK# and D0-D15, which float during Th.]
RESET DURING HOLD ACKNOWLEDGE


RESET being asserted takes priority over HOLD being asserted. If RESET is
asserted while HOLD remains asserted, the 386 SX Microprocessor drives its
pins to defined states during reset, as in Table 5.5 Pin State During
Reset, and performs internal reset activity as usual.


If HOLD remains asserted when RESET is inactive, the 386 SX Microprocessor
enters the hold acknowledge state before performing its first bus cycle,
provided HOLD is still asserted when the 386 SX Microprocessor would
otherwise perform its first bus cycle.
NOTE:
For maximum design flexibility the 386 SX CPU has no internal pullup resistors on its outputs. Your design may require
an external pullup on ADS# and other outputs to keep them negated during float periods.


Figure 5.16. Requesting Hold from Idle Bus 
FLOAT


Activating the FLT# input floats all 386 SX bidirectional and output
signals, including HLDA. Asserting FLT# isolates the 386 SX from the
surrounding circuitry.


As the 386 SX is packaged in a surface mount PQFP, it cannot be removed
from the motherboard when In-Circuit Emulation (ICE) is needed. The FLT#
input allows the 386 SX to be electrically isolated from the surrounding
circuitry. This allows connection of an emulator to the 386 SX PQFP without
removing it from the PCB. This method of emulation is referred to as
ON-Circuit Emulation (ONCE).


ENTERING AND EXITING FLOAT 


FLT# is an asynchronous, active-low input. It is recognized on the rising
edge of CLK2. When recognized, it aborts the current bus cycle and floats
the outputs of the 386 SX (Figure 5.20). FLT# must be held low for a
minimum of 16 CLK2 cycles. Reset should be asserted and held asserted until
after FLT# is deasserted. This will ensure that the 386 SX will exit float
in a valid state.

Asserting the FLT# input unconditionally aborts the current bus cycle and
forces the 386 SX into the FLOAT mode. Since activating FLT#
unconditionally forces the 386 SX into FLOAT mode, the 386 SX is not
guaranteed to enter FLOAT in a valid state. After deactivating FLT#, the
386 SX is not guaranteed to exit FLOAT mode in a valid state. This is not a
problem as the FLT# pin is meant to be used only during ONCE. After exiting
FLOAT, the 386 SX must be reset to return it to a valid state. Reset should
be asserted before FLT# is deasserted. This will ensure that the 386 SX
will exit float in a valid state.


FLT# has an internal pull-up resistor, and if it is not used it should be
unconnected.


BUS ACTIVITY DURING AND FOLLOWING 
RESET 


RESET is the highest priority input signal, capable of interrupting any
processor activity when it is asserted. A bus cycle in progress can be
aborted at any stage, or idle states or bus hold acknowledge states
discontinued so that the reset state is established.

NOTE:
HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup and hold (t23 and t24) requirements
are met. This waveform is useful for determining Hold Acknowledge latency.


Figure 5.17. Requesting Hold from Active Bus (NA# inactive)


RESET should remain asserted for at least 15 CLK2 periods to ensure it is
recognized throughout the 386 SX Microprocessor, and at least 80 CLK2
periods if self-test is going to be requested at the falling edge. RESET
asserted pulses less than 15 CLK2 periods may not be recognized. RESET
pulses less than 80 CLK2 periods followed by a self-test may cause the
self-test to report a failure when no true failure exists.


Provided the RESET falling edge meets setup and hold times t25 and t26, the
internal processor clock phase is defined at that time as illustrated by
Figure 5.19 and Figure 7.7.
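
The RESET pulse widths above are given in CLK2 periods. As a rough aid, the following sketch converts them to time for a given CLK2 frequency. The helper name and the choice of example frequencies (32 MHz and 40 MHz CLK2, i.e. 16 MHz and 20 MHz parts) are assumptions made for illustration.

```python
MIN_RESET_PERIODS = 15           # RESET must span at least this many CLK2 periods
MIN_RESET_SELFTEST_PERIODS = 80  # required when self-test will be requested


def clk2_periods_to_ns(periods: int, clk2_hz: float) -> float:
    """Convert a count of CLK2 periods to nanoseconds."""
    return periods * 1e9 / clk2_hz


for clk2_mhz in (32, 40):
    clk2_hz = clk2_mhz * 1e6
    plain = clk2_periods_to_ns(MIN_RESET_PERIODS, clk2_hz)
    selftest = clk2_periods_to_ns(MIN_RESET_SELFTEST_PERIODS, clk2_hz)
    print(f"CLK2 = {clk2_mhz} MHz: RESET >= {plain:.2f} ns, "
          f"with self-test >= {selftest:.2f} ns")
```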
A self-test may be requested at the time RESET goes inactive by having the
BUSY# input at a LOW level as shown in Figure 5.19. The self-test requires
approximately (2^20 + 60) CLK2 periods to complete. The self-test duration
is not affected by the test results. Even if the self-test indicates a
problem, the 386 SX Microprocessor attempts to proceed with the reset
sequence afterwards.


After the RESET falling edge (and after the self-test if it was requested)
the 386 SX Microprocessor performs an internal initialization sequence for
approximately 350 to 450 CLK2 periods.


[Timing diagram for Figure 5.18: hold acknowledge (Th) followed by Cycle 2,
a non-pipelined read (T1, T2). Outputs float during hold acknowledge.]

NOTE:
HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup and hold (t23 and t24) requirements
are met. This waveform is useful for determining Hold Acknowledge latency.


Figure 5.18. Requesting Hold from Idle Bus (NA# active) 
[Timing diagram for Figure 5.19: RESET, internal initialization, Cycle 1.]

NOTES:
1. BUSY# should be held stable for 8 CLK2 periods before and after the CLK2 period in which RESET falling edge occurs.
2. If self-test is requested the outputs remain in their reset state as shown here.


Figure 5.20. Entering and Exiting FLT#
5.5 Self-test Signature


Upon completion of self-test (if self-test was requested by driving BUSY#
LOW at the falling edge of RESET) the EAX register will contain a signature
of 00000000H indicating the 386 SX Microprocessor passed its self-test of
microcode and major PLA contents with no problems detected. The passing
signature in EAX, 00000000H, applies to all revision levels. Any non-zero
signature indicates the unit is faulty.


5.6 Component and Revision Identifiers


To assist users, the 386 SX Microprocessor after reset holds a component
identifier and revision identifier in its DX register. The upper 8 bits of
DX hold 23H as identification of the 386 SX Microprocessor (the lower
nibble, 03H, refers to the Intel386 DX Architecture. The upper nibble, 02H,
refers to the second member of the Intel386 DX Family). The lower 8 bits of
DX hold an 8-bit unsigned binary number related to the component revision
level. The revision identifier will, in general, chronologically track
those component steppings which are intended to have certain improvements
or distinction from previous steppings. The 386 SX Microprocessor revision
identifier will track that of the 386 DX CPU where possible.


The revision identifier is intended to assist users to a practical extent.
However, the revision identifier value is not guaranteed to change with
every stepping revision, or to follow a completely uniform numerical
sequence, depending on the type or intention of revision, or manufacturing
materials required to be changed. Intel has sole discretion over these
characteristics of the component.

Table 5.7. Component and Revision Identifier History

Stepping    Revision Identifier
A0          04H
B           05H
C           08H
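
A minimal sketch of how reset-time software might split the DX value into its two fields is shown below. The function and dictionary names are hypothetical; the field layout and the three stepping codes are taken from the text and Table 5.7.

```python
# Revision identifiers listed in Table 5.7.
KNOWN_STEPPINGS = {0x04: "A0", 0x05: "B", 0x08: "C"}


def decode_dx(dx: int):
    """Split the reset value of DX into (component id, revision id, stepping).

    The upper 8 bits hold the component identifier (23H for the 386 SX) and
    the lower 8 bits hold the revision identifier. The stepping is None when
    the revision is not one of the values in Table 5.7.
    """
    component = (dx >> 8) & 0xFF
    revision = dx & 0xFF
    return component, revision, KNOWN_STEPPINGS.get(revision)


component, revision, stepping = decode_dx(0x2305)
print(f"component {component:02X}H, revision {revision:02X}H, stepping {stepping}")
```

Because the revision identifier is not guaranteed to follow a uniform sequence, unknown values are reported as None rather than guessed.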


5.7 Coprocessor Interfacing 


The 386 SX Microprocessor provides an automatic interface for the Intel
387 SX numeric floating-point coprocessor. The 387 SX coprocessor uses an
I/O mapped interface driven automatically by the 386 SX Microprocessor and
assisted by three dedicated signals: BUSY#, ERROR# and PEREQ.


As the 386 SX Microprocessor begins supporting a coprocessor instruction,
it tests the BUSY# and ERROR# signals to determine if the coprocessor can
accept its next instruction. Thus, the BUSY# and ERROR# inputs eliminate
the need for any 'preamble' bus cycles for communication between processor
and coprocessor. The 387 SX can be given its command opcode immediately.
The dedicated signals provide instruction synchronization, and eliminate
the need of using the WAIT opcode (9BH) for 387 SX instruction
synchronization (the WAIT opcode was required when the 8086 or 8088 was
used with the 8087 coprocessor).


Custom coprocessors can be included in 386 SX Microprocessor based systems
by memory-mapped or I/O-mapped interfaces. Such coprocessor interfaces
allow a completely custom protocol, and are not limited to a set of
coprocessor protocol 'primitives'. Instead, memory-mapped or I/O-mapped
interfaces may use all applicable instructions for high-speed coprocessor
communication. The BUSY# and ERROR# inputs of the 386 SX Microprocessor may
also be used for the custom coprocessor interface, if such hardware assist
is desired. These signals can be tested by the WAIT opcode (9BH). The WAIT
instruction will wait until the BUSY# input is inactive (interruptable by
an NMI or enabled INTR input), but generates an exception 16 fault if the
ERROR# pin is active when the BUSY# goes (or is) inactive. If the custom
coprocessor interface is memory-mapped, protection of the addresses used
for the interface can be provided with the 386 SX CPU's on-chip paging or
segmentation mechanisms. If the custom interface is I/O-mapped, protection
of the interface can be provided with the IOPL (I/O Privilege Level)
mechanism.


The 387 SX numeric coprocessor interface is I/O mapped as shown in Table
5.8. Note that the 387 SX coprocessor interface addresses are beyond the
0H-0FFFFH range for programmed I/O. When the 386 SX Microprocessor supports
the 387 SX coprocessor, the 386 SX Microprocessor automatically generates
bus cycles to the coprocessor interface addresses.

Table 5.8. Numeric Coprocessor Port Addresses

Address in 386 SX CPU I/O Space    387 SX Coprocessor Register
8000F8H                            Opcode Register
8000FCH/8000FEH*                   Operand Register

*Generated as 2nd bus cycle during Dword transfer.
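
To make the address map concrete, the sketch below classifies the three port addresses and shows the level of address line A2 for each. The names are hypothetical, and reading A2 as the register-select line is an assumption based on the statement in this section that one command input is wired directly to the 386 SX CPU A2 signal.

```python
COPROCESSOR_PORTS = {
    0x8000F8: "opcode register",
    0x8000FC: "operand register",
    0x8000FE: "operand register (2nd bus cycle of a Dword transfer)",
}

PROGRAMMED_IO_LIMIT = 0xFFFF  # top of the 0H-0FFFFH programmed I/O range


def describe_port(address: int):
    """Return (register name, A2 level, beyond programmed I/O range?)."""
    name = COPROCESSOR_PORTS[address]
    a2 = (address >> 2) & 1  # assumed to drive the register-select input
    return name, a2, address > PROGRAMMED_IO_LIMIT


for port in COPROCESSOR_PORTS:
    name, a2, beyond = describe_port(port)
    print(f"{port:06X}H: {name}, A2 = {a2}, beyond programmed I/O = {beyond}")
```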


To correctly map the 387 SX registers to the appropriate I/O addresses,
connect the CMD0 and CMD1 lines of the 387 SX as listed in Table 5.9.

Table 5.9. Connections for CMD0 and CMD1 Inputs for the 387 SX

Signal    Connection
CMD0      Connect directly to 386 SX CPU A2 signal
CMD1      Connect to ground
Software Testing for Coprocessor Presence


When software is used to test for coprocessor (387 SX) presence, it should
use only the following coprocessor opcodes: FINIT, FNINIT, FSTCW mem,
FSTSW mem and FSTSW AX. To use other coprocessor opcodes when a coprocessor
is known to be not present, first set EM = 1 in the 386 SX CPU's CR0
register.


6.0 PACKAGE THERMAL SPECIFICATIONS


The 386 SX Microprocessor is specified for operation when case temperature
is within the range of 0°C-100°C. The case temperature may be measured in
any environment, to determine whether the 386 SX Microprocessor is within
specified operating range. The case temperature should be measured at the
center of the top surface opposite the pins.


The ambient temperature is guaranteed as long as Tc is not violated. The
ambient temperature can be calculated from the θjc and θja from the
following equations:


Tj = Tc + P * θjc
Ta = Tj - P * θja


Values for θja and θjc are given in Table 6.1 for the 100 lead fine pitch.
θja is given at various airflows. Table 6.2 shows the maximum Ta allowable
(without exceeding Tc) at various airflows. Note that Ta can be improved
further by attaching 'fins' or a 'heat sink' to the package.
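
The two equations can be chained to estimate the maximum ambient temperature. The sketch below does this under stated assumptions: the thermal resistance values are placeholders rather than Table 6.1 data, and the power figure is simply 5V times an Icc of 230 mA.

```python
def junction_temperature(tc: float, power_w: float, theta_jc: float) -> float:
    """Tj = Tc + P * theta_jc"""
    return tc + power_w * theta_jc


def ambient_temperature(tj: float, power_w: float, theta_ja: float) -> float:
    """Ta = Tj - P * theta_ja"""
    return tj - power_w * theta_ja


# Hypothetical example values; substitute the Table 6.1 figures for the
# actual package and airflow.
THETA_JC = 7.0            # °C/W, placeholder
THETA_JA = 30.0           # °C/W, placeholder
POWER_W = 5.0 * 0.230     # assumed Vcc * Icc
TC_MAX = 100.0            # °C, upper limit of the specified case temperature

tj = junction_temperature(TC_MAX, POWER_W, THETA_JC)
ta = ambient_temperature(tj, POWER_W, THETA_JA)
print(f"Tj = {tj:.2f} °C, maximum Ta = {ta:.2f} °C")
```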
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7.0 ELECTRICAL SPECIFICATIONS 


The following sections describe recommended elec- 
trical connections for the 386 SX Microprocessor, 
and its electrical specifications. 


7.1 Power and Grounding 


The 386 SX Microprocessor is implemented in 
CHMOS IV technology and has modest power re- 
quirements. However, its high clock frequency and 
47 output buffers (address, data, control, and HLDA) 
can cause power surges as multiple output buffers 
drive new signal levels simultaneously. For clean on- 
chip power distribution at high frequency, 14 Vcc 
and 18 Vss pins separately feed functional units of 
the 386 SX Microprocessor. 


Power and ground connections must be made to all 
external Vcc and Vss pins of the 386 SX Microproc- 
essor. On the circuit board, all Vcc pins should be 
connected on a Vcc plane and all Vss pins should 
be connected on a GND plane. 


POWER DECOUPLING RECOMMENDATIONS 


Liberal decoupling capacitors should be placed near 
the 386 SX Microprocessor. The 386 SX Microproc- 
essor driving its 24-bit address bus and 16-bit data 
bus at high frequencies can cause transient power 
surges, particularly when driving large capacitive 
loads. Low inductance capacitors and interconnects 
are recommended for best high frequency electrical 
performance. Inductance can be reduced by short- 
ening circuit board traces between the 386 SX Mi- 
croprocessor and decoupling capacitors as much as 
possible. 


Table 6.1. Thermal Resistances (°C/Watt) θjc and θja

Package: 100 Lead Fine Pitch
Columns: θjc; θja versus Airflow, ft/min (m/sec): 0 (0), 400 (2.03),
600 (3.04), 800 (4.06), 1000 (5.07)
[Numeric entries are not legible in this copy.]


Table 6.2. Maximum Ta at various airflows

Package: Fine Pitch; Frequency: 20 MHz
Columns: Ta (°C) versus Airflow, ft/min (m/sec): 200 (1.01), 400 (2.03),
600 (3.04), 800 (4.06), 1000 (5.07)
[Numeric entries are not legible in this copy.]

NOTE:
The numbers in Table 6.2 were calculated using an Icc of 200 mA at 16 MHz and 230 mA at 20 MHz, which is representative
of the worst case Icc at Tc = 100°C with the outputs unloaded.
Table 7.1. Recommended Resistor Pull-ups to Vcc

Signal    Pull-up Value     Purpose
ADS#      20 K-Ohm ±10%     Lightly pull ADS# inactive during 386 SX CPU
                            hold acknowledge states
LOCK#     20 K-Ohm ±10%     Lightly pull LOCK# inactive during 386 SX CPU
                            hold acknowledge states

RESISTOR RECOMMENDATIONS 


The ERROR#, FLT# and BUSY# inputs have internal pull-up resistors of
approximately 20 K-Ohms and the PEREQ input has an internal pull-down
resistor of approximately 20 K-Ohms built into the 386 SX Microprocessor to
keep these signals inactive when the 387 SX is not present in the system
(or temporarily removed from its socket).


In typical designs, the external pull-up resistors shown in Table 7.1 are
recommended. However, a particular design may have reason to adjust the
resistor values recommended here, or alter the use of pull-up resistors in
other ways.


OTHER CONNECTION RECOMMENDATIONS 


For reliable operation, always connect unused inputs to an appropriate
signal level. N/C pins should always remain unconnected. Connection of N/C
pins to Vcc or Vss will result in component malfunction or incompatibility
with future steppings of the 386 SX Microprocessor.


Particularly when not using interrupts or bus hold (as when first
prototyping), prevent any chance of spurious activity by connecting these
associated inputs to GND:


Pin    Signal
40     INTR
38     NMI
4      HOLD


If not using address pipelining, connect pin 6, NA#, through a pull-up in
the range of 20 K-Ohms to Vcc.


7.2 Maximum Ratings 


Table 7.2. Maximum Ratings

Parameter                              Maximum Rating
Storage temperature                    -65°C to 150°C
Case temperature under bias            -65°C to 110°C
Supply voltage with respect to Vss     -0.5V to 6.5V
Voltage on other pins                  -0.5V to (Vcc + 0.5)V


Table 7.2 gives stress ratings only, and functional operation at the
maximums is not guaranteed. Functional operating conditions are given in
section 7.3, D.C. Specifications, and section 7.4, A.C. Specifications.


Extended exposure to the Maximum Ratings may affect device reliability.
Furthermore, although the 386 SX Microprocessor contains protective
circuitry to resist damage from static electric discharge, always take
precautions to avoid high static voltages or electric fields.
7.3 D.C. Specifications
Functional operating range: Vcc = 5V ±10%; Tcase = 0°C to 100°C


Table 7.3. 386 SX D.C. Characteristics
(386 SX: 20 MHz, 16 MHz, 12 MHz (LP only). Only the entries below are
legible in this copy.)

Output voltage test conditions are listed for two signal groups:
A23-A1, D15-D0, and BHE#, BLE#, W/R#, D/C#, M/IO#, LOCK#, ADS#, HLDA.

Output high voltage
  IOH = -1 mA:     A23-A1, D15-D0
  IOH = -0.2 mA:   A23-A1, D15-D0
  IOH = -0.9 mA:   BHE#, BLE#, W/R#, D/C#, M/IO#, LOCK#, ADS#, HLDA
  IOH = -0.18 mA:  BHE#, BLE#, W/R#, D/C#, M/IO#, LOCK#, ADS#, HLDA

Input leakage current (for all pins except PEREQ, BUSY#, FLT# and ERROR#)
  ±15 µA, 0V ≤ VIN ≤ Vcc
IIH  Input leakage current (PEREQ pin)
  200 µA, VIH = 2.4V, Note 1
IIL  Input leakage current (BUSY#, ERROR# and FLT# pins)
  -400 µA, VIL = 0.45V, Note 2
ILO  Output leakage current
  0.45V ≤ VOUT ≤ Vcc

Supply current
  CLK2 = 4 MHz: with 20, 16, or 12 MHz 386 SX (LP)
  CLK2 = 24 MHz: with 12 MHz 386 SX
  CLK2 = 32 MHz: with 16 MHz 386 SX
  CLK2 = 40 MHz: with 20 MHz 386 SX
  Icc typ values legible in this copy (row assignment unclear): 70 mA,
  140 mA, 175 mA and 20 mA, each with Note 3.

CIN  Input capacitance
     Output or I/O capacitance
     CLK2 capacitance

Tested at the minimum operating frequency of the part.


NOTES:
1. PEREQ input has an internal pull-down resistor.
2. BUSY#, FLT# and ERROR# inputs each have an internal pull-up resistor.
3. Icc max measurement at worst case load, frequency, Vcc and temperature.
4. Not 100% tested.
> > > > 
7.4 A.C. Specifications


The A.C. specifications given in Table 7.4 consist of output delays, input
setup requirements and input hold requirements. All A.C. specifications are
relative to the CLK2 rising edge crossing the 2.0V level.


A.C. spec measurement is defined by Figure 7.1. Inputs must be driven to
the voltage levels indicated by Figure 7.1 when A.C. specifications are
measured. Output delays are specified with minimum and maximum limits
measured as shown. The minimum delay times are hold times provided to
external circuitry. Input setup and hold times are specified as minimums,
defining the smallest acceptable sampling window. Within the sampling
window, a synchronous input signal must be stable for correct operation.


Outputs ADS#, W/R#, D/C#, M/IO#, LOCK#, BHE#, BLE#, A23-A1 and HLDA only
change at the beginning of phase one. D15-D0 (write cycles) only change at
the beginning of phase two. The READY#, HOLD, BUSY#, ERROR#, PEREQ, FLT#
and D15-D0 (read cycles) inputs are sampled at the beginning of phase one.
The NA#, INTR and NMI inputs are sampled at the beginning of phase two.


[Diagram for Figure 7.1: CLK2; outputs (A1-A23, BHE#, BLE#, ADS#, M/IO#,
D/C#, W/R#, LOCK#, HLDA); outputs (D0-D15); inputs (NA#, INTR, NMI); inputs
(READY#, HOLD, FLT#, ERROR#, BUSY#, PEREQ, D0-D15). Valid output and valid
input levels are measured at 1.5V.]

LEGEND
A: Maximum Output Delay Spec
B: Minimum Output Delay Spec
C: Minimum Input Setup Spec
D: Minimum Input Hold Spec


Figure 7.1. Drive Levels and Measurement Points for A.C. Specifications
A.C. SPECIFICATIONS TABLES
Functional operating range: Vcc = 5V ±10%; Tcase = 0°C to 100°C


Table 7.4. 386 SX A.C. Characteristics
(Original columns: Symbol, Parameter, 20 MHz 386 SX Min/Max, 16 MHz 386 SX
Min/Max, Unit, Figure, Notes. The numeric limits are not legible in this
copy; parameters and test conditions are listed below.)

Symbol   Parameter                                   Notes
         Operating Frequency                         Half CLK2 frequency
t1       CLK2 Period
t2a      CLK2 HIGH Time                              at 2V
t2b      CLK2 HIGH Time                              at (Vcc - 0.8)V (3)
t3a      CLK2 LOW Time                               at 2V
t3b      CLK2 LOW Time                               at 0.8V (3)
t4       CLK2 Fall Time                              (Vcc - 0.8)V to 0.8V (3)
t5       CLK2 Rise Time                              0.8V to (Vcc - 0.8)V
t6       A23-A1 Valid Delay                          CL = 120 pF (4)
t7       A23-A1 Float Delay                          (Note 1)
t8       BHE#, BLE#, LOCK# Valid Delay               CL = 75 pF (4)
t9       BHE#, BLE#, LOCK# Float Delay               (Note 1)
t10a     M/IO#, D/C# Valid Delay                     CL = 75 pF (4)
t10b     W/R#, ADS# Valid Delay                      CL = 75 pF (4)
t11      W/R#, M/IO#, D/C#, ADS# Float Delay         (Note 1)
t12      D15-D0 Write Data Valid Delay               CL = 120 pF (4)
t13      D15-D0 Write Data Float Delay               (Note 1)
t14      HLDA Valid Delay                            CL = 75 pF (4)
t15      NA# Setup Time
t16      NA# Hold Time
t19      READY# Setup Time
t20      READY# Hold Time
t21      D15-D0 Read Data Setup Time
t22      D15-D0 Read Data Hold Time
t23      HOLD Setup Time
t24      HOLD Hold Time
t25      RESET Setup Time
t26      RESET Hold Time
Functional operating range: Vcc = 5V ±10%; Tcase = 0°C to 100°C


Table 7.4. 386 SX A.C. Characteristics (Continued)

Symbol   Parameter                                   Notes
t27      NMI, INTR Setup Time                        (Note 2)
t28      NMI, INTR Hold Time                         (Note 2)
t29      PEREQ, ERROR#, BUSY#, FLT# Setup Time       (Note 2)
t30      PEREQ, ERROR#, BUSY#, FLT# Hold Time        (Note 2)

[Numeric limits are not legible in this copy.]


[A further A.C. characteristics table with a 16 MHz 386 SX column follows
in the original. Its caption and numeric limits are not legible in this
copy. The parameters that can be read, in order, are: Operating Frequency
(half CLK2 frequency); CLK2 HIGH Time; CLK2 LOW Time; CLK2 Fall Time,
(Vcc - 0.8)V to 0.8V (3); CLK2 Rise Time; A23-A1 Valid Delay; t7 A23-A1
Float Delay (Note 1); BHE#, BLE#, LOCK# Valid Delay; BHE#, BLE#, LOCK#
Float Delay; t10 M/IO#, D/C#, W/R#, ADS# Valid Delay; M/IO#, D/C#, W/R#,
ADS# Float Delay; t12 D15-D0 Write Data Valid Delay; D15-D0 Write Data
Float Delay (Note 1); NA# Hold Time. Legible load conditions are
CL = 120 pF (4) and CL = 75 pF (4).]
Functional operating range: Vcc = 5V ±10%; Tcase = 0°C to 100°C


Table 7.5. Low Power (LP) 386 SX A.C. Characteristics (Continued)

[Rows and numeric limits are not legible in this copy; the surviving
entries carry a reference to Figure 7.4 and to Note 2.]


NOTES:

1. Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not 100%
tested.

2. These inputs are allowed to be asynchronous to CLK2. The setup and hold specifications are given for testing purposes,
to assure recognition within a specific CLK2 period.

3. These are not tested. They are guaranteed by design characterization.

4. Tested with CL set at 50 pF and derated to support the indicated distributed capacitive load. See Figures 7.8 through 7.10
for the capacitive derating curve.


A.C. TEST LOADS

Figure 7.2. A.C. Test Loads


A.C. TIMING WAVEFORMS

Figure 7.3. CLK2 Waveform


Figure 7.4. A.C. Timing Waveforms—input Setup and Hold Timing | 




Figure 7.5. A.C. Timing Waveforms—Output Valid Delay Timing 
[Timing diagram for Figure 7.6: CLK2 over a Ti or T1 state, with minimum
and maximum float delays for BHE#, BLE#, LOCK#; W/R#, M/IO#, D/C#, ADS#;
A1-A23; and D0-D15, plus HLDA valid delay. Diagram note: the data float
timing also applies when a write cycle is followed by a read or idle.]


Figure 7.6. A.C. Timing Waveforms—Output Float Delay and HLDA Valid Delay Timing 




Figure 7.7. A.C. Timing Waveforms—RESET Setup and Hold Timing and Internal Phase 
[Plot: output valid delay versus CL (picofarads), 50 to 150 pF.]

Figure 7.8. Typical Output Valid Delay versus Load Capacitance at Maximum
Operating Temperature (CL = 120 pF)


[Plot: output valid delay versus CL (picofarads), 75 to 150 pF.]

Figure 7.9. Typical Output Valid Delay versus Load Capacitance at Maximum
Operating Temperature (CL = 75 pF)


[Plot: rise time (ns), 0.8V to 2.0V, versus CL (picofarads), 75 to 150 pF.]

Figure 7.10. Typical Output Rise Time versus Load Capacitance at Maximum
Operating Temperature


[Plot: typical Icc versus clock speed (MHz), 12 to 20 MHz.]

Figure 7.11. Typical Icc vs Frequency
Figure 7.13. Preliminary ICE-386 SX Emulator User Cable with OIB and PQFP Adapter

7.5 Designing for ICE-386 SX Emulator (Advanced Data)


The 386 SX CPU's in-circuit emulator product is the ICE-386 SX emulator. If
your ICE system is not equipped to use on-circuit emulation, use of the
emulator requires the target system to provide a socket that is compatible
with the ICE-386 SX emulator. The ICE-386 SX offers a 100-pin fine pitch
flat-pack probe for emulating user systems. The 100-pin fine pitch
flat-pack probe requires a socket, called the 100-pin PQFP, which is
available from 3M Textool (part number 2-0100-07243-000). The ICE-386 SX
emulator probe attaches to the target system via an adapter which replaces
the 386 SX CPU component in the target system. Because of the high
operating frequency of 386 SX CPU systems and of the ICE-386 SX emulator,
there is no buffering between the 386 SX CPU emulation processor in the
ICE-386 SX emulator probe and the target system. A direct result of the
non-buffered interconnect is that the ICE-386 SX emulator shares the
address and data bus with the user's system, and the RESET signal is
intercepted by the ICE emulator hardware. In order for the ICE-386 SX
emulator to be functional in the user's system without the Optional
Isolation Board (OIB) the designer must be aware of the following
conditions:


1. The bus controller must only enable data transceivers onto the data bus
during valid read cycles of the 386 SX CPU, other local devices or other
bus masters.


2. Before another bus master drives the local processor address bus, the
other master must gain control of the address bus by asserting HOLD and
receiving the HLDA response.


3. The emulation processor receives the RESET signal 2 or 4 CLK2 cycles
later than a 386 SX CPU would, and responds to RESET later. Correct
phase of the response is guaranteed.


In addition to the above considerations, the ICE-386 SX emulator processor
module has several electrical and mechanical characteristics that should
be taken into consideration when designing the 386 SX CPU system.


Capacitive Loading: ICE-386 SX adds up to 27 pF to each 386 SX CPU signal.


Drive Requirements: ICE-386 SX adds one FAST 
TTL load on the CLK2, control, address, and data 
lines. These loads are within the processor module 
and are driven by the 386 SX CPU emulation proces- 
sor, which has standard drive and loading capability 
listed in Tables 7.3 and 7.4. 


Power Requirements: For noise immunity and CMOS latch-up protection the
ICE-386 SX emulator processor module is powered by the user system. The
circuitry on the processor module draws up to 1.4A including the maximum
386 SX CPU Icc from the user 386 SX CPU socket.


386 SX CPU Location and Orientation: The 
ICE-386 SX emulator processor module may require 
lateral clearance. Figure 7.12 shows the clearance 
requirements of the iMP adapter. The optional isola- 
tion board (OIB), which provides extra electrical buff- 
ering and has the same lateral clearance require- 
ments as Figure 7.12, adds an additional 0.5 inches 
to the vertical clearance requirement. This is illus- 
trated in Figure 7.13. 


Optional Isolation Board (OIB) and the CLK2 
speed reduction: Due to the unbuffered probe de- 
sign, the ICE-386 SX emulator is susceptible to er- 
rors on the user’s bus. The OIB allows the ICE-386 
SX emulator to function in user systems with faults 
(shorted signals, etc.). After electrical verification the 
OIB may be removed. When the OIB is installed, the 
user system must have a maximum CLK2 frequency 
of 20 MHz. 
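
The loading, power, and clock figures above can be gathered into a quick pre-layout check. The following is a minimal sketch in Python, not an Intel tool: the function name, its parameters, and the idea of comparing against a designer-chosen load limit and socket current budget are assumptions made for illustration. Only the 27 pF, 1.4A, and 20 MHz figures come from the text.

```python
# Figures quoted in Section 7.5.
ICE_ADDED_CAPACITANCE_PF = 27.0   # added to each 386 SX CPU signal
ICE_MODULE_MAX_CURRENT_A = 1.4    # includes the maximum 386 SX CPU Icc
OIB_MAX_CLK2_MHZ = 20.0           # CLK2 limit while the OIB is installed


def check_ice_design(signal_load_pf, load_limit_pf,
                     socket_current_budget_a, clk2_mhz, oib_installed):
    """Return a list of problems found; an empty list means none found.

    All parameters are hypothetical inputs supplied by the designer:
    the existing load on a signal, the load limit the design targets,
    the current the CPU socket can supply, and the CLK2 frequency.
    """
    problems = []
    total_load = signal_load_pf + ICE_ADDED_CAPACITANCE_PF
    if total_load > load_limit_pf:
        problems.append("load of %.0f pF exceeds limit of %.0f pF"
                        % (total_load, load_limit_pf))
    if socket_current_budget_a < ICE_MODULE_MAX_CURRENT_A:
        problems.append("socket cannot supply 1.4A to the processor module")
    if oib_installed and clk2_mhz > OIB_MAX_CLK2_MHZ:
        problems.append("CLK2 must not exceed 20 MHz with the OIB installed")
    return problems
```

For example, `check_ice_design(50, 120, 2.0, 32, False)` reports no problems, while installing the OIB at the same CLK2 frequency would be flagged.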


8.0 DIFFERENCES BETWEEN THE 386 SX CPU AND THE 386 DX CPU


The following are the major differences between the 
386 SX CPU and the 386 DX CPU: 


1. The 386 SX CPU generates byte selects on BHE# and BLE# (like the 8086
and 80286) to distinguish the upper and lower bytes on its 16-bit data
bus. The 386 DX CPU uses four byte selects, BE0#-BE3#, to distinguish
between the different bytes on its 32-bit bus.


2. The 386 SX CPU has no bus sizing option. The 
386 DX CPU can select between either a 32-bit 
bus or a 16-bit bus by use of the BS16# input. 
The 386 SX CPU has a 16-bit bus size. 


3. The NA# pin operation in the 386 SX CPU is identical to that of the NA#
pin on the 386 DX CPU with one exception: the 386 DX CPU NA# pin cannot be
activated on 16-bit bus cycles (where BS16# is LOW in the 386 DX CPU case),
whereas NA# can be activated on any 386 SX CPU bus cycle.


4. The contents of all 386 SX CPU registers at reset are identical to the
contents of the 386 DX CPU registers at reset, except the DX register. The
DX register contains a component-stepping identifier at reset, i.e.

in 386 DX CPU after reset: DH = 3 indicates 386 DX CPU; DL = revision number;

in 386 SX CPU after reset: DH = 23H indicates 386 SX CPU; DL = revision number.

5. The 386 DX CPU uses A31 and M/IO# as selects for the numerics
coprocessor. The 386 SX CPU uses A23 and M/IO# as selects.


6. The 386 DX CPU prefetch unit fetches code in 
four-byte units. The 386 SX CPU prefetch unit 
reads two bytes as one unit (like the 80286). In 
BS16 mode, the 386 DX CPU takes two consecu- 
tive bus cycles to complete a prefetch request. If 
there is a data read or write request after the pre- 
fetch starts, the 386 DX CPU will fetch all four 
bytes before addressing the new request. 


7. Both 386 DX CPU and 386 SX CPU have the 
same logical address space. The only difference 
is that the 386 DX CPU has a 32-bit physical ad- 
dress space and the 386 SX CPU has a 24-bit 
physical address space. The 386 SX CPU has a 
physical memory address space of up to 16 
megabytes instead of the 4 gigabytes available to 
the 386 DX CPU. Therefore, in 386 SX CPU sys- 
tems, the operating system must be aware of this 
physical memory limit and should allocate memo- 
ry for applications programs within this limit. If a 
386 DX CPU system uses only the lower 16 
megabytes of physical address, then there will be 
no extra effort required to migrate 386 DX CPU 
software to the 386 SX CPU. Any application 
which uses more than 16 megabytes of memory 
can run on the 386 SX CPU if the operating sys- 
tem utilizes the 386 SX CPU’s paging mechanism. 
In spite of this difference in physical address 
space, the 386 SX CPU and 386 DX CPU can run 
the same operating systems and applications 
within their respective physical memory constraints.


8. The 386 SX has an input called FLT# which tri-states all bidirectional
and output pins, including HLDA#, when asserted. It is used with ON Circuit
Emulation (ONCE).
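
Two of the differences listed above lend themselves to a short worked example: the reset-time identifier in the DX register (item 4) and the size of the physical address space (item 7). The sketch below is illustrative only; the function names are hypothetical, and it assumes software has already saved the value DX held immediately after reset.

```python
def identify_cpu_from_reset_dx(dx):
    """Split a saved 16-bit reset value of DX into (component, revision).

    DH = 3 indicates a 386 DX CPU and DH = 23H indicates a 386 SX CPU,
    as stated in item 4; DL holds the revision number.
    """
    dh = (dx >> 8) & 0xFF
    dl = dx & 0xFF
    component = {0x03: "386 DX CPU", 0x23: "386 SX CPU"}.get(dh, "unknown")
    return component, dl


def physical_address_space_bytes(address_bits):
    """Number of addressable bytes for a given physical address width."""
    return 1 << address_bits


# 24 address bits give 16 megabytes; 32 address bits give 4 gigabytes.
SX_LIMIT = physical_address_space_bytes(24)
DX_LIMIT = physical_address_space_bytes(32)
```

An operating system that keeps its physical allocations below `SX_LIMIT` needs no extra effort on this point when moving between the two processors.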


9.0 INSTRUCTION SET 


This section describes the instruction set. Table 9.1 
lists all instructions along with instruction encoding 
diagrams and clock counts. Further details of the 
instruction encoding are then provided in the follow- 
ing sections, which completely describe the encod- 
ing structure and the definition of all fields occurring 
within instructions.


9.1 386 SX CPU Instruction Encoding 
and Clock Count Summary 


To calculate elapsed time for an instruction, multiply the instruction
clock count, as listed in Table 9.1 below, by the processor clock period
(e.g. 62.5 ns for a 386 SX Microprocessor operating at 16 MHz). The actual
clock count of a 386 SX Microprocessor program will average 5% more than
the calculated clock count due to instruction sequences which execute
faster than they can be fetched from memory.
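
The calculation described above is a single multiplication. A minimal sketch follows; the function names and the optional `fetch_overhead` parameter are assumptions for illustration, with 0.05 standing for the 5% program-level average mentioned in the text.

```python
def clock_period_ns(frequency_mhz):
    """Processor clock period in nanoseconds for a frequency in MHz."""
    return 1000.0 / frequency_mhz


def elapsed_time_ns(clock_count, frequency_mhz, fetch_overhead=0.0):
    """Elapsed time for a clock count taken from Table 9.1.

    fetch_overhead is a fraction added on top of the calculated count;
    pass 0.05 to apply the 5% average quoted for whole programs.
    """
    return clock_count * clock_period_ns(frequency_mhz) * (1.0 + fetch_overhead)
```

At 16 MHz the period is 62.5 ns, so a 2-clock instruction takes 125 ns under the assumptions listed in the next section.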


Instruction Clock Count Assumptions 


1. The instruction has been prefetched, decoded, 
and is ready for execution. 


2. Bus cycles do not require wait states. 


3. There are no local bus HOLD requests delaying 
processor access to the bus. 


4. No exceptions are detected during instruction ex- 
ecution. 


5. If an effective address is calculated, it does not use two general
register components. One register, scaling and displacement can be used
within the clock counts shown. However, if the effective address
calculation uses two general register components, add 1 clock to the
clock count shown.


Instruction Clock Count Notation 


1. If two clock counts are given, the smaller refers to a register operand
and the larger refers to a memory operand.


2. n = number of times repeated.


3. m = number of components in the next instruction executed, where the
entire displacement (if any) counts as one component, the entire immediate
data (if any) counts as one component, and all other bytes of the
instruction and prefix(es) each count as one component.
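
The definition of m can be made concrete with a small counting helper. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and "all other bytes" is taken to mean the opcode bytes plus any mod r/m and s-i-b bytes. The `7+m or 3` figure is the taken/not-taken entry that Table 9-1 gives for the conditional jumps.

```python
def count_components(prefix_bytes, opcode_bytes, has_modrm=False,
                     has_sib=False, has_displacement=False,
                     has_immediate=False):
    """Return m for one instruction, following the notation above."""
    # Prefix, opcode, mod r/m, and s-i-b bytes each count as one component.
    m = prefix_bytes + opcode_bytes + int(has_modrm) + int(has_sib)
    # The whole displacement and the whole immediate each count once,
    # regardless of how many bytes they occupy.
    m += int(has_displacement) + int(has_immediate)
    return m


def conditional_jump_clocks(taken, m):
    """Clock count for a conditional jump: 7+m if taken, 3 if not taken.

    m describes the instruction at the jump target, i.e. the next
    instruction executed.
    """
    return 7 + m if taken else 3
```

For a jump whose target is a short-form immediate-to-register MOV (one opcode byte plus immediate data), m is 2, so the taken jump costs 9 clocks.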


Misaligned or 32-Bit Operand Accesses 


— If an instruction accesses a misaligned 16-bit operand or a 32-bit
operand on an even address, add:
2* clocks for read or write
4** clocks for read and write


— If an instruction accesses a 32-bit operand on an odd address, add:
4* clocks for read or write
8** clocks for read and write


Wait States 


Wait states add 1 clock per wait state to instruction 
execution for each data access. 
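
The adjustments above (misaligned or 32-bit operands, wait states, and the extra clock for a two-register effective address from assumption 5) can be combined into one helper. This is a minimal sketch; the function names, the parameters, and the choice to apply the operand penalty only when the caller reports at least one data access are assumptions for illustration. The `*` and `**` markers in the list are table footnote references and are not modelled here.

```python
def operand_penalty(operand_bits, address, read_and_write=False):
    """Extra clocks for a misaligned 16-bit or any 32-bit memory operand."""
    odd = address % 2 == 1
    if operand_bits == 32:
        base = 4 if odd else 2      # 32-bit operand: odd vs even address
    elif operand_bits == 16 and odd:
        base = 2                    # misaligned 16-bit operand
    else:
        base = 0                    # byte or aligned 16-bit operand
    # A read-and-write access costs twice the read-or-write figure.
    return base * 2 if read_and_write else base


def adjusted_clock_count(base_clocks, operand_bits=16, address=0,
                         read_and_write=False, wait_states=0,
                         data_accesses=0, two_register_ea=False):
    """Apply the documented adjustments to a count from Table 9-1."""
    clocks = base_clocks
    if data_accesses > 0:
        clocks += operand_penalty(operand_bits, address, read_and_write)
    # Wait states add 1 clock per wait state for each data access.
    clocks += wait_states * data_accesses
    # Two general register components in the effective address add 1 clock.
    if two_register_ea:
        clocks += 1
    return clocks
```

As a usage example, a 7-clock memory form with a 32-bit operand at an odd address, two data accesses at one wait state each, and a two-register effective address comes to 7 + 4 + 2 + 1 = 14 clocks.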
Table 9-1. Instruction Set Clock Count Summary

Columns: INSTRUCTION | FORMAT | CLOCK COUNT (Real Address Mode or Virtual 8086 Mode | Protected Virtual Address Mode) | NOTES (Real Address Mode or Virtual 8086 Mode | Protected Virtual Address Mode)

GENERAL DATA TRANSFER
MOV = Move:
Register to Register/Memory 1000100w
Register/Memory to Register 1000101w
Immediate to Register/Memory 1100011w | mod 000 r/m | immediate data
Immediate to Register (short form) 1011 w reg | immediate data
Memory to Accumulator (short form) 1010000w | full displacement
Accumulator to Memory (short form) full displacement
Register/Memory to Segment Register 10001110 | mod sreg3 r/m
Segment Register to Register/Memory 10001100 | mod sreg3 r/m

MOVSX = Move With Sign Extension
Register From Register/Memory 00001111 | 1011111w

MOVZX = Move With Zero Extension
Register From Register/Memory 00001111 | 1011011w

PUSH = Push:
Register/Memory 11111111 | mod 110 r/m
Register (short form) 01010 reg
Segment Register (ES, CS, SS or DS) (short form) 000 sreg2 110
Segment Register (FS or GS) 00001111 | 10 sreg3 000
Immediate 011010s0 | immediate data
PUSHA = Push All

POP = Pop
Register/Memory 10001111 | mod 000 r/m
Register (short form) 01011 reg
Segment Register (ES, CS, SS or DS) (short form) 000 sreg2 111
Segment Register (FS or GS) 00001111 | 10 sreg3 001
POPA = Pop All 01100001

XCHG = Exchange
Register/Memory With Register 1000011w | mod reg r/m
Register With Accumulator (short form) 10010 reg

IN = Input from: (Clk Count, Virtual 8086 Mode)
Fixed Port 1110010w | port number 6*/26*
Variable Port 1110110w 7*/27*

OUT = Output to:
Fixed Port 1110011w | port number
Variable Port 1110111w

LEA = Load EA to Register 10001101 | mod reg r/m


SEGMENT CONTROL 


LDS = Load Pointer to DS 11000101 | mod reg r/m 26*/28*
LES = Load Pointer to ES 11000100 | mod reg r/m 26*/28*
LFS = Load Pointer to FS 00001111 | 10110100 | mod reg r/m 29*/31*
LGS = Load Pointer to GS 00001111 | 10110101 | mod reg r/m 26*/28*
LSS = Load Pointer to SS 00001111 | 10110010 26*/28*


FLAG CONTROL 
CLC = Clear Carry Flag 11111000 
CLD = Clear Direction Flag 11111100 


CLI = Clear Interrupt Enable Flag 11111010 


CLTS = Clear Task Switched Flag 00001111 00000110 


CMC = Complement Carry Flag 11110101 
LAHF = Load AH into Flag 10011111
POPF = Pop Flags 10011101
PUSHF = Push Flags 10011100
SAHF = Store AH into Flags 10011110
STC = Set Carry Flag 11111001
STD = Set Direction Flag 11111101
STI = Set Interrupt Enable Flag 11111011


ARITHMETIC 
ADD = Add 


Register to Register 000000dw | mod reg r/m
Register to Memory 0000000w | mod reg r/m
Memory to Register 0000001w | mod reg r/m
Immediate to Register/Memory 100000sw | mod 000 r/m | immediate data
Immediate to Accumulator (short form) 0000010w | immediate data

ADC = Add With Carry
Register to Register 000100dw | mod reg r/m
Register to Memory 0001000w | mod reg r/m
Memory to Register 0001001w | mod reg r/m
Immediate to Register/Memory 100000sw | mod 010 r/m | immediate data
Immediate to Accumulator (short form) 0001010w | immediate data

INC = Increment
Register/Memory 1111111w | mod 000 r/m
Register (short form) 01000 reg

SUB = Subtract
Register from Register 001010dw | mod reg r/m


ARITHMETIC (Continued) 
Register from Memory 0010100w 7**
Memory from Register 0010101w | mod reg r/m
Immediate from Register/Memory 100000sw | mod 101 r/m | immediate data 2/7** 2/7**
Immediate from Accumulator (short form) 0010110w | immediate data 2 2

SBB = Subtract with Borrow
Register from Register 000110dw | mod reg r/m 2 2
Register from Memory 0001100w | mod reg r/m
Memory from Register 0001101w | mod reg r/m 6* 6*
Immediate from Register/Memory immediate data 2/7** 2/7**
Immediate from Accumulator (short form) immediate data 2 2

DEC = Decrement
Register/Memory 2/6 2/6
Register (short form) 01001 reg 2 2

CMP = Compare
Register with Register 001110dw 2 2
Memory with Register 0011100w | mod reg r/m 5* 5*
Register with Memory 0011101w | mod reg r/m 6* 6*
Immediate with Register/Memory immediate data 2/5* 2/5*
Immediate with Accumulator (short form) immediate data 2 2

NEG = Change Sign 2/6* 2/6*
AAA = ASCII Adjust for Add 4 4
AAS = ASCII Adjust for Subtract 4 4
DAA = Decimal Adjust for Add 4 4
DAS = Decimal Adjust for Subtract 4 4


MUL = Multiply (unsigned)
Accumulator with Register/Memory
Multiplier-Byte 12-17/15-20* 12-17/15-20*
-Word 12-25/15-28* 12-25/15-28*
-Doubleword 12-41/17-46* 12-41/17-46*

IMUL = Integer Multiply (signed)
Accumulator with Register/Memory
Multiplier-Byte 12-17/15-20* 12-17/15-20*
-Word 12-25/15-28* 12-25/15-28*
-Doubleword 12-41/17-46* 12-41/17-46*

Register with Register/Memory 00001111 | 10101111
Multiplier-Byte 12-17/15-20* 12-17/15-20*
-Word 12-25/15-28* 12-25/15-28*
-Doubleword 12-41/17-46* 12-41/17-46*

Register/Memory with Immediate to Register 011010s1 | immediate data
-Word 13-26/14-27
-Doubleword 13-42/16-45


ARITHMETIC (Continued) 
DIV = Divide (Unsigned) 


Accumulator by Register/Memory 1111011w | mod 110 r/m


Divisor—Byte 
—Word 
—Doubleword 


IDIV = Integer Divide (Signed) 
Accumulator By Register/Memory 1111011w | mod 111 r/m


Divisor—Byte 
—Word 
—Doubleword 


AAD = ASCII Adjust for Divide 11010101 | 00001010
AAM = ASCII Adjust for Multiply 11010100 | 00001010
CBW = Convert Byte to Word 10011000
CWD = Convert Word to Double Word 10011001


LOGIC 


Shift Rotate Instructions
Not Through Carry (ROL, ROR, SAL, SAR, SHL, and SHR)
Register/Memory by 1 1101000w | mod TTT r/m
Register/Memory by CL 1101001w | mod TTT r/m
Register/Memory by Immediate Count 1100000w | mod TTT r/m | immed 8-bit data

Through Carry (RCL and RCR)
Register/Memory by 1 1101000w | mod TTT r/m
Register/Memory by CL 1101001w | mod TTT r/m
Register/Memory by Immediate Count 1100000w | mod TTT r/m | immed 8-bit data

TTT Instruction
000 ROL
001 ROR
010 RCL
011 RCR
100 SHL/SAL
101 SHR
111 SAR

SHLD = Shift Left Double
Register/Memory by Immediate 00001111 | 10100100 | mod reg r/m | immed 8-bit data
Register/Memory by CL 00001111 | 10100101 | mod reg r/m

SHRD = Shift Right Double
Register/Memory by Immediate 00001111 | 10101100 | mod reg r/m | immed 8-bit data
Register/Memory by CL 00001111 | 10101101 | mod reg r/m

AND = And
Register to Register 001000dw | mod reg r/m


LOGIC (Continued) 


Register to Memory 0010000w | mod reg r/m
Memory to Register 0010001w | mod reg r/m
Immediate to Register/Memory 1000000w | mod 100 r/m | immediate data
Immediate to Accumulator (Short Form) 0010010w | immediate data

TEST = And Function to Flags, No Result
Register/Memory and Register 1000010w | mod reg r/m
Immediate Data and Register/Memory 1111011w | mod 000 r/m | immediate data
Immediate Data and Accumulator (Short Form) 1010100w | immediate data

OR = Or
Register to Register 000010dw | mod reg r/m
Register to Memory 0000100w | mod reg r/m
Memory to Register 0000101w | mod reg r/m
Immediate to Register/Memory 1000000w | mod 001 r/m | immediate data
Immediate to Accumulator (Short Form) 0000110w | immediate data

XOR = Exclusive Or
Register to Register 001100dw | mod reg r/m
Register to Memory 0011000w | mod reg r/m
Memory to Register 0011001w | mod reg r/m
Immediate to Register/Memory 1000000w | mod 110 r/m | immediate data
Immediate to Accumulator (Short Form) 0011010w | immediate data

NOT = Invert Register/Memory 1111011w | mod 010 r/m


STRING MANIPULATION
CMPS = Compare Byte Word 1010011w
INS = Input Byte/Word from DX Port 0110110w 9*/29**
LODS = Load Byte/Word to AL/AX/EAX 1010110w 5*
MOVS = Move Byte Word 1010010w 7
OUTS = Output Byte/Word to DX Port 0110111w 8*/28*
SCAS = Scan Byte Word 1010111w
STOS = Store Byte/Word from AL/AX/EAX 1010101w
XLAT = Translate String 11010111

REPEATED STRING MANIPULATION
Repeated by Count in CX or ECX
REPE CMPS = Compare String (Find Non-Match) 11110011 | 1010011w


REPEATED STRING MANIPULATION (Continued)
REPNE CMPS = Compare String (Find Match) 11110010 | 1010011w 5+9n** 5+9n**
REP INS = Input String 11110010 | 0110110w 13+6n* 7+6n*/27+6n* s/t, h, m
REP LODS = Load String 11110010 | 1010110w 5+6n* h
REP MOVS = Move String 11110010 | 1010010w 7+4n** h
REP OUTS = Output String 11110010 | 0110111w 6+5n*/26+5n* s/t, h, m
REPE SCAS = Scan String (Find Non-AL/AX/EAX) 11110011 | 1010111w
REPNE SCAS = Scan String (Find AL/AX/EAX) 11110010 | 1010111w
REP STOS = Store String 11110010 | 1010101w


BIT MANIPULATION 


BSF = Scan Bit Forward 00001111 | 10111100 | mod reg r/m 10+3n**
BSR = Scan Bit Reverse 00001111 | 10111101 | mod reg r/m 10+3n**

BT = Test Bit
Register/Memory, Immediate 00001111 | 10111010 | mod 100 r/m | immed 8-bit data
Register/Memory, Register 00001111 | 10100011 | mod reg r/m

BTC = Test Bit and Complement
Register/Memory, Immediate 00001111 | 10111010 | mod 111 r/m | immed 8-bit data
Register/Memory, Register 00001111 | 10111011 | mod reg r/m

BTR = Test Bit and Reset
Register/Memory, Immediate 00001111 | 10111010 | mod 110 r/m | immed 8-bit data
Register/Memory, Register 00001111 | 10110011 | mod reg r/m

BTS = Test Bit and Set
Register/Memory, Immediate 00001111 | 10111010 | mod 101 r/m | immed 8-bit data
Register/Memory, Register 00001111 | 10101011 | mod reg r/m

CONTROL TRANSFER
CALL = Call
Direct Within Segment 11101000 | full displacement
Register/Memory Indirect Within Segment 11111111 | mod 010 r/m 7+m*/10+
Direct Intersegment 10011010 | unsigned full offset, selector


NOTE: 
+ Clock count shown applies if I/O permission allows I/O to the port in virtual 8086 mode. If the I/O bit map denies permission,
an exception 13 fault occurs; refer to clock counts for INT 3 instruction.


CONTROL TRANSFER (Continued)
Protected Mode Only (Direct Intersegment)
Via Call Gate to Same Privilege Level 64+m h, j, k, r
Via Call Gate to Different Privilege Level, (No Parameters) 98+m h, j, k, r
Via Call Gate to Different Privilege Level, (x Parameters) 106+8x+m h, j, k, r
From 286 Task to 286 TSS 285 h, j, k, r
From 286 Task to 386™ SX CPU TSS 310 h, j, k, r
From 286 Task to Virtual 8086 Task (386 SX CPU TSS) 229 h, j, k, r
From 386 SX CPU Task to 286 TSS 285 h, j, k, r
From 386 SX CPU Task to 386 SX CPU TSS 392 h, j, k, r
From 386 SX CPU Task to Virtual 8086 Task (386 SX CPU TSS) h, j, k, r

Indirect Intersegment 11111111 | mod 011 r/m h, j, k, r

Protected Mode Only (Indirect Intersegment)
Via Call Gate to Same Privilege Level h, j, k, r
Via Call Gate to Different Privilege Level, (No Parameters) h, j, k, r
Via Call Gate to Different Privilege Level, (x Parameters) h, j, k, r
From 286 Task to 286 TSS h, j, k, r
From 286 Task to 386 SX CPU TSS h, j, k, r
From 286 Task to Virtual 8086 Task (386 SX CPU TSS) h, j, k, r
From 386 SX CPU Task to 286 TSS h, j, k, r
From 386 SX CPU Task to 386 SX CPU TSS h, j, k, r
From 386 SX CPU Task to Virtual 8086 Task (386 SX CPU TSS) h, j, k, r

JMP = Unconditional Jump
11101011 | 8-bit displacement
Direct within Segment full displacement
Register/Memory Indirect Within Segment 11111111 | mod 100 r/m
Direct Intersegment 11101010 | unsigned full offset, selector j, k, r

Protected Mode Only (Direct Intersegment)
Via Call Gate to Same Privilege Level h, j, k, r
From 286 Task to 286 TSS h, j, k, r
From 286 Task to 386 SX CPU TSS h, j, k, r
From 286 Task to Virtual 8086 Task (386 SX CPU TSS) h, j, k, r
From 386 SX CPU Task to 286 TSS h, j, k, r
From 386 SX CPU Task to 386 SX CPU TSS h, j, k, r
From 386 SX CPU Task to Virtual 8086 Task (386 SX CPU TSS) h, j, k, r

Indirect Intersegment 11111111 | mod 101 r/m h, j, k, r

Protected Mode Only (Indirect Intersegment)
Via Call Gate to Same Privilege Level h, j, k, r
From 286 Task to 286 TSS h, j, k, r
From 286 Task to 386 SX CPU TSS h, j, k, r
From 286 Task to Virtual 8086 Task (386 SX CPU TSS) h, j, k, r
From 386 SX CPU Task to 286 TSS h, j, k, r
From 386 SX CPU Task to 386 SX CPU TSS h, j, k, r
From 386 SX CPU Task to Virtual 8086 Task (386 SX CPU TSS) h, j, k, r


CONTROL TRANSFER (Continued) 
RET = Return from CALL: 


Within Segment 11000011 


Within Segment Adding Immediate to SP 11000010 16-bit displ 


Intersegment 11001011 


Intersegment Adding Immediate to SP 11001010 16-bit displ 


Protected Mode Only (RET): 
to Different Privilege Level 
Intersegment 
Intersegment Adding Immediate to SP 


CONDITIONAL JUMPS
NOTE: Times Are Jump "Taken or Not Taken"

JO = Jump on Overflow
8-Bit Displacement 01110000 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10000000 | full displacement 7+m or 3 | 7+m or 3

JNO = Jump on Not Overflow
8-Bit Displacement 01110001 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10000001 | full displacement 7+m or 3 | 7+m or 3

JB/JNAE = Jump on Below/Not Above or Equal
8-Bit Displacement 01110010 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10000010 | full displacement 7+m or 3 | 7+m or 3

JNB/JAE = Jump on Not Below/Above or Equal
8-Bit Displacement 01110011 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10000011 | full displacement 7+m or 3 | 7+m or 3

JE/JZ = Jump on Equal/Zero
8-Bit Displacement 01110100 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10000100 | full displacement 7+m or 3 | 7+m or 3

JNE/JNZ = Jump on Not Equal/Not Zero
8-Bit Displacement 01110101 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10000101 | full displacement 7+m or 3 | 7+m or 3

JBE/JNA = Jump on Below or Equal/Not Above
8-Bit Displacement 01110110 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10000110 | full displacement 7+m or 3 | 7+m or 3

JNBE/JA = Jump on Not Below or Equal/Above
8-Bit Displacement 01110111 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10000111 | full displacement 7+m or 3 | 7+m or 3

JS = Jump on Sign
8-Bit Displacement 01111000 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10001000 | full displacement 7+m or 3 | 7+m or 3


CONDITIONAL JUMPS (Continued)
JNS = Jump on Not Sign
8-Bit Displacement 01111001 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10001001 | full displacement 7+m or 3 | 7+m or 3

JP/JPE = Jump on Parity/Parity Even
8-Bit Displacement 01111010 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10001010 | full displacement 7+m or 3 | 7+m or 3

JNP/JPO = Jump on Not Parity/Parity Odd
8-Bit Displacement 01111011 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10001011 | full displacement 7+m or 3 | 7+m or 3

JL/JNGE = Jump on Less/Not Greater or Equal
8-Bit Displacement 01111100 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10001100 | full displacement 7+m or 3 | 7+m or 3

JNL/JGE = Jump on Not Less/Greater or Equal
8-Bit Displacement 01111101 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10001101 | full displacement 7+m or 3 | 7+m or 3

JLE/JNG = Jump on Less or Equal/Not Greater
8-Bit Displacement 01111110 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10001110 | full displacement 7+m or 3 | 7+m or 3

JNLE/JG = Jump on Not Less or Equal/Greater
8-Bit Displacement 01111111 | 8-bit displ 7+m or 3 | 7+m or 3
Full Displacement 00001111 | 10001111 | full displacement 7+m or 3 | 7+m or 3

JCXZ = Jump on CX Zero 11100011 | 8-bit displ 9+m or 5 | 9+m or 5
JECXZ = Jump on ECX Zero 11100011 | 8-bit displ 9+m or 5 | 9+m or 5
(Address Size Prefix Differentiates JCXZ from JECXZ)

LOOP = Loop CX Times 11100010 | 8-bit displ 11+m
LOOPZ/LOOPE = Loop with Zero/Equal 11100001 | 8-bit displ
LOOPNZ/LOOPNE = Loop While Not Zero 11100000 | 8-bit displ


CONDITIONAL BYTE SET
NOTE: Times Are Register/Memory

SETO = Set Byte on Overflow
To Register/Memory 00001111 | 10010000 | mod 000 r/m

SETNO = Set Byte on Not Overflow
To Register/Memory 00001111 | 10010001 | mod 000 r/m

SETB/SETNAE = Set Byte on Below/Not Above or Equal
To Register/Memory 00001111 | 10010010 | mod 000 r/m


CONDITIONAL BYTE SET (Continued)
SETNB = Set Byte on Not Below/Above or Equal
To Register/Memory

SETE/SETZ = Set Byte on Equal/Zero
To Register/Memory 00001111

SETNE/SETNZ = Set Byte on Not Equal/Not Zero
To Register/Memory 00001111 | 10010101

SETBE/SETNA = Set Byte on Below or Equal/Not Above
To Register/Memory 00001111 | 10010110 | mod 000 r/m

SETNBE/SETA = Set Byte on Not Below or Equal/Above
To Register/Memory 00001111 | 10010111 | mod 000 r/m

SETS = Set Byte on Sign
To Register/Memory 00001111 | 10011000 | mod 000 r/m

SETNS = Set Byte on Not Sign
To Register/Memory 00001111

SETP/SETPE = Set Byte on Parity/Parity Even
To Register/Memory 00001111 | mod 000 r/m

SETNP/SETPO = Set Byte on Not Parity/Parity Odd
To Register/Memory 00001111 | 10011011 | mod 000 r/m

SETL/SETNGE = Set Byte on Less/Not Greater or Equal
To Register/Memory 00001111 | 10011100 | mod 000 r/m

SETNL/SETGE = Set Byte on Not Less/Greater or Equal
To Register/Memory 00001111 | 10011101 | mod 000 r/m

SETLE/SETNG = Set Byte on Less or Equal/Not Greater
To Register/Memory 00001111 | 10011110 | mod 000 r/m

SETNLE/SETG = Set Byte on Not Less or Equal/Greater
To Register/Memory 00001111 | 10011111 | mod 000 r/m

ENTER = Enter Procedure 11001000
L = 0
L = 1
L > 1

LEAVE = Leave Procedure 11001001




INTERRUPT INSTRUCTIONS
INT = Interrupt:
Type Specified
Type 3

INTO = Interrupt 4 if Overflow Flag Set
If OF = 1
If OF = 0

BOUND = Interrupt 5 if Detect Value Out of Range 01100010
If Out of Range e, g, h, j, k, r
If In Range e, g, h, j, k, r

Protected Mode Only (INT)

INT: Type Specified
Via Interrupt or Trap Gate to Same Privilege Level
Via Interrupt or Trap Gate to Different Privilege Level
From 286 Task to 286 TSS via Task Gate
From 286 Task to 386™ SX CPU TSS via Task Gate
From 286 Task to virt 8086 md via Task Gate
From 386™ SX CPU Task to 286 TSS via Task Gate
From 386™ SX CPU Task to 386™ SX CPU TSS via Task Gate
From 386™ SX CPU Task to virt 8086 md via Task Gate
From virt 8086 md to 286 TSS via Task Gate
From virt 8086 md to 386™ SX CPU TSS via Task Gate
From virt 8086 md to priv level 0 via Trap Gate or Interrupt Gate

INT: TYPE 3
Via Interrupt or Trap Gate to Same Privilege Level
Via Interrupt or Trap Gate to Different Privilege Level
From 286 Task to 286 TSS via Task Gate
From 286 Task to 386™ SX CPU TSS via Task Gate
From 286 Task to virt 8086 md via Task Gate
From 386™ SX CPU Task to 286 TSS via Task Gate
From 386™ SX CPU Task to 386™ SX CPU TSS via Task Gate
From 386™ SX CPU Task to virt 8086 md via Task Gate
From virt 8086 md to 286 TSS via Task Gate
From virt 8086 md to 386™ SX CPU TSS via Task Gate
From virt 8086 md to priv level 0 via Trap Gate or Interrupt Gate

INTO:
Via Interrupt or Trap Gate to Same Privilege Level
Via Interrupt or Trap Gate to Different Privilege Level
From 286 Task to 286 TSS via Task Gate
From 286 Task to 386™ SX CPU TSS via Task Gate
From 286 Task to virt 8086 md via Task Gate
From 386™ SX CPU Task to 286 TSS via Task Gate
From 386™ SX CPU Task to 386™ SX CPU TSS via Task Gate
From 386™ SX CPU Task to virt 8086 md via Task Gate
From virt 8086 md to 286 TSS via Task Gate
From virt 8086 md to 386™ SX CPU TSS via Task Gate
From virt 8086 md to priv level 0 via Trap Gate or Interrupt Gate


g, j, k, r


g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r


INTERRUPT INSTRUCTIONS (Continued) 
BOUND: 


Via Interrupt or Trap Gate 
to Same Privilege Level 
Via Interrupt or Trap Gate 
to Different Privilege Level 
From 286 Task to 286 TSS via Task Gate 
From 286 Task to 386™ SX CPU TSS via Task Gate 
From 286 Task to virt 8086 Mode via Task Gate
From 386 SX CPU Task to 286 TSS via Task Gate 
From 386 SX CPU Task to 386 SX CPU TSS via Task Gate 
From 386 SX CPU Task to virt 8086 Mode via Task Gate 
From virt 8086 Mode to 286 TSS via Task Gate 
From virt 8086 Mode to 386 SX CPU TSS via Task Gate 
From virt 8086 md to priv level 0 via Trap Gate or Interrupt Gate 


INTERRUPT RETURN 
IRET = Interrupt Return 11001111 


Protected Mode Only (IRET) 
To the Same Privilege Level (within task) 
To Different Privilege Level (within task) 
From 286 Task to 286 TSS 
From 286 Task to 386 SX CPU TSS 
From 286 Task to Virtual 8086 Task 
From 286 Task to Virtual 8086 Mode (within task) 
From 386 SX CPU Task to 286 TSS 
From 386 SX CPU Task to 386 SX CPU TSS
From 386 SX CPU Task to Virtual 8086 Task 
From 386 SX CPU Task to Virtual 8086 Mode (within task) 


PROCESSOR CONTROL 


HLT = HALT 11110100 


MOV = Move to and From Control/Debug/Test Registers 




g, j, k, r


g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r
g, j, k, r


g, h, j, k, r


g, h, j, k, r
g, h, j, k, r
h, j, k, r
h, j, k, r
h, j, k, r


h, j, k, r
h, j, k, r
h, j, k, r


CR0/CR2/CR3 from Register 00001111
Register from CR0-3 00001111
DR0-3 from Register 00001111
DR6-7 from Register 00001111
Register from DR6-7 00001111
Register from DR0-3 00001111
TR6-7 from Register 00001111
Register from TR6-7 00001111


NOP = No Operation 10010000 


WAIT = Wait until BUSY# pin is negated 10011011

INSTRUCTION | FORMAT | CLOCK COUNT | NOTES (Real Address Mode or Virtual 8086 Mode) | NOTES (Protected Virtual Address Mode)

PROCESSOR EXTENSION INSTRUCTIONS
Processor Extension Escape | 11011TTT mod LLL r/m | See 387 SX data sheet for clock counts | | h
(TTT and LLL bits are opcode information for coprocessor.)

PREFIX BYTES
Address Size Prefix | 01100111 | 0 | |
LOCK = Bus Lock Prefix | 11110000 | 0 | | m
Operand Size Prefix | 01100110 | 0 | |
Segment Override Prefix
  CS: | 00101110 | 0 | |
  DS: | 00111110 | 0 | |
  ES: | 00100110 | 0 | |
  FS: | 01100100 | 0 | |
  GS: | 01100101 | 0 | |
  SS: | 00110110 | 0 | |

PROTECTION CONTROL
ARPL = Adjust Requested Privilege Level
  From Register/Memory | 01100011 | 20/21** | a | h
LAR = Load Access Rights
  From Register/Memory | 00001111 00000010 | 15/16* | a | g, h, j, p
LGDT = Load Global Descriptor Table Register | 00001111 00000001 mod010 r/m | 11* | b, c | h, l
LIDT = Load Interrupt Descriptor Table Register | 00001111 00000001 mod011 r/m | 11* | b, c | h, l
LLDT = Load Local Descriptor Table Register
  to Register/Memory | 00001111 00000000 mod010 r/m | 20/24* | a | g, h, j, l
LMSW = Load Machine Status Word
  From Register/Memory | 00001111 00000001 mod110 r/m | 10/13* | b, c | h, l
LSL = Load Segment Limit
  From Register/Memory | 00001111 00000011 | | |
  Byte-Granular Limit | | 20/21* | a | g, h, j, p
  Page-Granular Limit | | 25/26* | a | g, h, j, p
LTR = Load Task Register
  From Register/Memory | 00001111 00000000 mod011 r/m | 23/27* | a | g, h, j, l
SGDT = Store Global Descriptor Table Register | 00001111 00000001 mod000 r/m | 9* | b, c | h
SIDT = Store Interrupt Descriptor Table Register | 00001111 00000001 mod001 r/m | 9* | b, c | h
SLDT = Store Local Descriptor Table Register
  To Register/Memory | 00001111 00000000 mod000 r/m | 2/2* | a | h



PROTECTION CONTROL (Continued) 


SMSW = Store Machine Status Word | 00001111 00000001 mod100 r/m

STR = Store Task Register
  To Register/Memory | 00001111 00000000 mod001 r/m

VERR = Verify Read Access
  Register/Memory | 00001111 00000000 mod100 r/m | | g, h, j, p
VERW = Verify Write Access | 00001111 00000000 mod101 r/m | | g, h, j, p


INSTRUCTION NOTES FOR TABLE 9-1 


Notes a through c apply to Real Address Mode only: 

a. This is a Protected Mode instruction. Attempted execution in Real Mode will result in exception 6 (invalid opcode). 

b. Exception 13 fault (general protection) will occur in Real Mode if an operand reference is made that partially or fully 
extends beyond the maximum CS, DS, ES, FS or GS limit, FFFFH. Exception 12 fault (stack segment limit violation or not 
present) will occur in Real Mode if an operand reference is made that partially or fully extends beyond the maximum SS limit. 
c. This instruction may be executed in Real Mode. In Real Mode, its purpose is primarily to initialize the CPU for Protected 
Mode. 


Notes d through g apply to Real Address Mode and Protected Virtual Address Mode: 
d. The 386 SX CPU uses an early-out multiply algorithm. The actual number of clocks depends on the position of the most 
significant bit in the operand (multiplier). 
Clock counts given are minimum to maximum. To calculate actual clocks use the following formula:
Actual Clock = if m <> 0 then $\max(\lceil \log_2 |m| \rceil, 3) + b$ clocks;
if m = 0 then $3 + b$ clocks
In this formula, m is the multiplier, and 
b = 9 for register to register, 
b = 12 for memory to register, 
b = 10 for register with immediate to register, 
b = 11 for memory with immediate to register. 
e. An exception may occur, depending on the value of the operand. 
f. LOCK# is automatically asserted, regardless of the presence or absence of the LOCK # prefix. 
g. LOCK# is asserted during descriptor table accesses. 
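
The early-out multiply formula in note d can be sketched in Python as follows. This is an illustrative sketch only: the function name `multiply_clocks` is hypothetical, and it assumes the brackets around the logarithm in the formula denote rounding up to the next integer.

```python
def multiply_clocks(m: int, b: int) -> int:
    """Clock count per note d: m is the multiplier, b the base for the operand form.

    b = 9 (register to register), 12 (memory to register),
    10 (register with immediate to register), 11 (memory with immediate to register).
    """
    if m == 0:
        return 3 + b
    # Integer ceil(log2(|m|)): (|m| - 1).bit_length() avoids floating point.
    # Assumption: the brackets in the printed formula mean "round up".
    ceil_log2 = (abs(m) - 1).bit_length()
    return max(ceil_log2, 3) + b


if __name__ == "__main__":
    for multiplier in (0, 1, 1000, 0x8000):
        print(multiplier, multiply_clocks(multiplier, 9))
```

Small multipliers hit the floor of 3 clocks plus b, which is why the table lists a minimum and a maximum rather than a single figure.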


Notes h through r apply to Protected Virtual Address Mode only: 
h. Exception 13 fault (general protection violation) will occur if the memory operand in CS, DS, ES, FS or GS cannot be used 
due to either a segment limit violation or access rights violation. If a stack limit is violated, an exception 12 (stack segment 
limit violation or not present) occurs. 
i. For segment load operations, the CPL, RPL, and DPL must agree with the privilege rules to avoid an exception 13 fault 
(general protection violation). The segment’s descriptor must indicate “present” or exception 11 (CS, DS, ES, FS, GS not 
present). If the SS register is loaded and a stack segment not present is detected, an exception 12 (stack segment limit 
violation or not present) occurs. 
j. All segment descriptor accesses in the GDT or LDT made by this instruction will automatically assert LOCK# to maintain 
descriptor integrity in multiprocessor systems. 
k. JMP, CALL, INT, RET and IRET instructions referring to another code segment will cause an exception 13 (general 
protection violation) if an applicable privilege rule is violated. 
l. An exception 13 fault occurs if CPL is greater than 0 (0 is the most privileged level).
m. An exception 13 fault occurs if CPL is greater than IOPL. 
n. The IF bit of the flag register is not updated if CPL is greater than IOPL. The IOPL and VM fields of the flag register are 
updated only if CPL = 0. 
o. The PE bit of the MSW (CR0) cannot be reset by this instruction. Use MOV into CR0 if desiring to reset the PE bit.
p. Any violation of privilege rules as applied to the selector operand does not cause a protection exception; rather, the zero 
flag is cleared. 
q. If the coprocessor’s memory operand violates a segment limit or segment access rights, an exception 13 fault (general 
protection exception) will occur before the ESC instruction is executed. An exception 12 fault (stack segment limit violation 
or not present) will occur if the stack limit is violated by the operand’s starting address. 
r. The destination of a JMP, CALL, INT, RET or IRET must be in the defined limit of a code segment or an exception 13 fault 
(general protection violation) will occur. 
s/t. The instruction will execute in s clocks if CPL < IOPL. If CPL > IOPL, the instruction will take t clocks. 
9.2 INSTRUCTION ENCODING


9.2.1 Overview


All instruction encodings are subsets of the general
instruction format shown in Figure 9-1. Instructions
consist of one or two primary opcode bytes, possibly
an address specifier consisting of the “mod r/m”
byte and “scaled index” byte, a displacement if
required, and an immediate data field if required.


Within the primary opcode or opcodes, smaller
encoding fields may be defined. These fields vary
according to the class of operation. The fields define
such information as direction of the operation, size
of the displacements, register encoding, or sign
extension.


Almost all instructions referring to an operand in
memory have an addressing mode byte following
the primary opcode byte(s). This byte, the mod r/m
byte, specifies the address mode to be used. Certain
encodings of the mod r/m byte indicate that a second
addressing byte, the scale-index-base byte, follows
the mod r/m byte to fully specify the addressing
mode.


Addressing modes can include a displacement
immediately following the mod r/m byte or scaled
index byte. If a displacement is present, the possible
sizes are 8, 16 or 32 bits.


If the instruction specifies an immediate operand, 
the immediate operand follows any displacement 
bytes. The immediate operand, if specified, is always 
the last field of the instruction. 


Figure 9-1 illustrates several of the fields that can
appear in an instruction, such as the mod field and
the r/m field, but the figure does not show all fields.
Several smaller fields also appear in certain
instructions, sometimes within the opcode bytes
themselves. Table 9-2 is a complete list of all fields
appearing in the instruction set. Following Table 9-2
are detailed tables for each field.


Fields, in byte order:
  opcode (one or two bytes; T represents an opcode bit)
  “mod r/m” byte: mod (bits 7-6), TTT (bits 5-3), r/m (bits 2-0)
  “s-i-b” byte: ss (bits 7-6), index (bits 5-3), base (bits 2-0)
  address displacement (4, 2, 1 bytes or none)
  immediate data (4, 2, 1 bytes or none)
The “mod r/m” and “s-i-b” bytes together form the register and address mode specifier.


Figure 9-1. General Instruction Format 


Table 9-2. Fields within Instructions

Field Name | Description | Number of Bits
w | Specifies if Data is Byte or Full Size (Full Size is either 16 or 32 Bits) | 1
d | Specifies Direction of Data Operation | 1
s | Specifies if an Immediate Data Field Must be Sign-Extended | 1
reg | General Register Specifier | 3
mod r/m | Address Mode Specifier (Effective Address can be a General Register) | 2 for mod; 3 for r/m
ss | Scale Factor for Scaled Index Address Mode | 2
index | General Register to be used as Index Register | 3
base | General Register to be used as Base Register | 3
sreg2 | Segment Register Specifier for CS, SS, DS, ES | 2
sreg3 | Segment Register Specifier for CS, SS, DS, ES, FS, GS | 3
tttn | For Conditional Instructions, Specifies a Condition Asserted or a Condition Negated | 4

Note: Table 9-1 shows encoding of individual instructions.
9.2.2 32-Bit Extensions of the
Instruction Set


With the 386 SX CPU, the 8086/80186/80286
instruction set is extended in two orthogonal
directions: 32-bit forms of all 16-bit instructions are
added to support the 32-bit data types, and 32-bit
addressing modes are made available for all
instructions referencing memory. This orthogonal
instruction set extension is accomplished by having a
Default (D) bit in the code segment descriptor, and by
having two prefixes to the instruction set.


Whether the instruction defaults to operations of 16
bits or 32 bits depends on the setting of the D bit in
the code segment descriptor, which gives the default
length (either 32 bits or 16 bits) for both operands
and effective addresses when executing that
code segment. In the Real Address Mode or Virtual
8086 Mode, no code segment descriptors are used,
but a D value of 0 is assumed internally by the
386 SX CPU when operating in those modes (for
16-bit default sizes compatible with the
8086/80186/80286).


Two prefixes, the Operand Size Prefix and the
Effective Address Size Prefix, individually override
the Default selection of operand size and effective
address size. These prefixes may precede any
opcode bytes and affect only the instruction they
precede. If necessary, one or both of the prefixes may
be placed before the opcode bytes. The presence of
the Operand Size Prefix or the Effective Address
Size Prefix toggles the operand size or the effective
address size, respectively, to the value “opposite”
from the Default setting. For example, if the default
operand size is for 32-bit data operations, then
presence of the Operand Size Prefix toggles the
instruction to 16-bit data operation. As another
example, if the default effective address size is 16 bits,
presence of the Effective Address Size Prefix toggles
the instruction to use 32-bit effective address
computations.
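
The toggling rule can be sketched in Python as follows. This is a minimal illustration, not a decoder: the function names are hypothetical, and the prefix byte values are taken from the PREFIX BYTES entries of Table 9-1 (Operand Size Prefix 01100110, Address Size Prefix 01100111).

```python
OPERAND_SIZE_PREFIX = 0b01100110  # 66H, from Table 9-1
ADDRESS_SIZE_PREFIX = 0b01100111  # 67H, from Table 9-1


def effective_size(d_bit: int, prefix_present: bool) -> int:
    """Return 16 or 32: the default selected by D, toggled if the prefix is present."""
    default = 32 if d_bit else 16
    if prefix_present:
        return 16 if default == 32 else 32
    return default


def sizes_for(prefixes: bytes, d_bit: int) -> tuple:
    """Return (operand size, effective address size) for one instruction.

    `prefixes` holds the prefix bytes preceding the opcode; their order
    does not matter. Use d_bit = 0 for Real Address Mode or Virtual 8086 Mode.
    """
    return (
        effective_size(d_bit, OPERAND_SIZE_PREFIX in prefixes),
        effective_size(d_bit, ADDRESS_SIZE_PREFIX in prefixes),
    )


if __name__ == "__main__":
    print(sizes_for(b"", 0))          # 16-bit defaults
    print(sizes_for(b"\x66", 0))      # 32-bit operand, 16-bit address
    print(sizes_for(b"\x67\x66", 1))  # both toggled down to 16 bits
```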


These 32-bit extensions are available in all modes, 
including the Real Address Mode or the Virtual 8086 
Mode. In these modes the default is always 16 bits, 
so prefixes are needed to specify 32-bit operands or 
addresses. For instructions with more than one pre- 
fix, the order of prefixes is unimportant. 


Unless specified otherwise, instructions with 8-bit
and 16-bit operands do not affect the contents of
the high-order bits of the extended registers.


9.2.3 Encoding of Instruction Fields 


Within the instruction are several fields indicating 
register selection, addressing mode and so on. The 
exact encodings of these fields are defined immedi- 
ately ahead. 
9.2.3.1 ENCODING OF OPERAND LENGTH (w)
FIELD


For any given instruction performing a data opera- 
tion, the instruction is executing as a 32-bit operation 
or a 16-bit operation. Within the constraints of the 
operation size, the w field encodes the operand size 
as either one byte or the full operation size, as 
shown in the table below. 


w Field | Operand Size During 16-Bit Data Operations | Operand Size During 32-Bit Data Operations
0 | 8 Bits | 8 Bits
1 | 16 Bits | 32 Bits
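
As a small illustration of the table above, the w field can be read as a selector between a byte and the full operation size. The helper name `operand_bits` is hypothetical.

```python
def operand_bits(w: int, operation_bits: int) -> int:
    """Operand size selected by the w field for a 16-bit or 32-bit data operation."""
    if operation_bits not in (16, 32):
        raise ValueError("operation size must be 16 or 32 bits")
    # w = 0 selects a byte; w = 1 selects the full operation size.
    return operation_bits if w else 8


if __name__ == "__main__":
    for bits in (16, 32):
        print(bits, operand_bits(0, bits), operand_bits(1, bits))
```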


9.2.3.2 ENCODING OF THE GENERAL
REGISTER (reg) FIELD


The general register is specified by the reg field,
which may appear in the primary opcode bytes, or as
the reg field of the “mod r/m” byte, or as the r/m
field of the “mod r/m” byte.


Encoding of reg Field When w Field
is not Present in Instruction

reg Field | Register Selected During 16-Bit Data Operations | Register Selected During 32-Bit Data Operations
000 | AX | EAX
001 | CX | ECX
010 | DX | EDX
011 | BX | EBX
100 | SP | ESP
101 | BP | EBP
110 | SI | ESI
111 | DI | EDI


Encoding of reg Field When w Field
is Present in Instruction

Register Specified by reg Field
During 16-Bit Data Operations:

reg | Function of w Field (when w = 0) | (when w = 1)
000 | AL | AX
001 | CL | CX
010 | DL | DX
011 | BL | BX
100 | AH | SP
101 | CH | BP
110 | DH | SI
111 | BH | DI

Register Specified by reg Field
During 32-Bit Data Operations:

reg | Function of w Field (when w = 0) | (when w = 1)
000 | AL | EAX
001 | CL | ECX
010 | DL | EDX
011 | BL | EBX
100 | AH | ESP
101 | CH | EBP
110 | DH | ESI
111 | BH | EDI
9.2.3.3 ENCODING OF THE SEGMENT
REGISTER (sreg) FIELD


The sreg field in certain instructions is a 2-bit field
allowing one of the four 80286 segment registers to
be specified. The sreg field in other instructions is a
3-bit field, allowing the 386 SX CPU FS and GS
segment registers to be specified.


2-Bit sreg2 Field

2-Bit sreg2 Field | Segment Register Selected
00 | ES
01 | CS
10 | SS
11 | DS


3-Bit sreg3 Field

3-Bit sreg3 Field | Segment Register Selected
000 | ES
001 | CS
010 | SS
011 | DS
100 | FS
101 | GS
110 | do not use
111 | do not use
9.2.3.4 ENCODING OF ADDRESS MODE


Except for special instructions, such as PUSH or
POP, where the addressing mode is pre-determined,
the addressing mode for the current instruction is
specified by addressing bytes following the primary
opcode. The primary addressing byte is the “mod
r/m” byte, and a second byte of addressing
information, the “s-i-b” (scale-index-base) byte, can be
specified.


The s-i-b byte (scale-index-base byte) is specified
when using 32-bit addressing mode and the “mod
r/m” byte has r/m = 100 and mod = 00, 01 or 10.
When the s-i-b byte is present, the 32-bit addressing
mode is a function of the mod, ss, index, and base
fields.
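
The rule for when an s-i-b byte follows can be sketched in Python. The helper names `split_modrm` and `has_sib` are hypothetical; the bit positions (mod in bits 7-6, the middle three bits in 5-3, r/m in 2-0) follow Figure 9-1.

```python
def split_modrm(byte: int) -> tuple:
    """Split a "mod r/m" byte into (mod, reg_or_opcode_ext, rm)."""
    mod = (byte >> 6) & 0b11
    reg = (byte >> 3) & 0b111  # register field or opcode extension (TTT)
    rm = byte & 0b111
    return mod, reg, rm


def has_sib(modrm: int, address_size: int) -> bool:
    """True when an s-i-b byte follows: 32-bit addressing, r/m = 100, mod = 00, 01 or 10."""
    mod, _, rm = split_modrm(modrm)
    return address_size == 32 and rm == 0b100 and mod != 0b11


if __name__ == "__main__":
    print(split_modrm(0x44), has_sib(0x44, 32), has_sib(0x44, 16))
```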


The primary addressing byte, the “mod r/m” byte,
also contains three bits (shown as TTT in Figure 9-1)
sometimes used as an extension of the primary
opcode. The three bits, however, may also be used as
a register field (reg).


When calculating an effective address, either 16-bit
addressing or 32-bit addressing is used. 16-bit
addressing uses 16-bit address components to
calculate the effective address while 32-bit addressing
uses 32-bit address components to calculate the
effective address. When 16-bit addressing is used, the
“mod r/m” byte is interpreted as a 16-bit addressing
mode specifier. When 32-bit addressing is used, the
“mod r/m” byte is interpreted as a 32-bit addressing
mode specifier.


Tables on the following three pages define all
encodings of all 16-bit addressing modes and 32-bit
addressing modes.
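Encoding of 16-bit Address Mode with “mod r/m” Byte

mod r/m | Effective Address
00 000 | DS:[BX + SI]
00 001 | DS:[BX + DI]
00 010 | SS:[BP + SI]
00 011 | SS:[BP + DI]
00 100 | DS:[SI]
00 101 | DS:[DI]
00 110 | DS:d16
00 111 | DS:[BX]

01 000 | DS:[BX + SI + d8]
01 001 | DS:[BX + DI + d8]
01 010 | SS:[BP + SI + d8]
01 011 | SS:[BP + DI + d8]
01 100 | DS:[SI + d8]
01 101 | DS:[DI + d8]
01 110 | SS:[BP + d8]
01 111 | DS:[BX + d8]

10 000 | DS:[BX + SI + d16]
10 001 | DS:[BX + DI + d16]
10 010 | SS:[BP + SI + d16]
10 011 | SS:[BP + DI + d16]
10 100 | DS:[SI + d16]
10 101 | DS:[DI + d16]
10 110 | SS:[BP + d16]
10 111 | DS:[BX + d16]

11 000 | register (see below)
11 001 | register (see below)
11 010 | register (see below)
11 011 | register (see below)
11 100 | register (see below)
11 101 | register (see below)
11 110 | register (see below)
11 111 | register (see below)


Register Specified by r/m
During 16-Bit Data Operations

mod r/m | Function of w Field (when w = 0) | (when w = 1)
11 000 | AL | AX
11 001 | CL | CX
11 010 | DL | DX
11 011 | BL | BX
11 100 | AH | SP
11 101 | CH | BP
11 110 | DH | SI
11 111 | BH | DI


Register Specified by r/m
During 32-Bit Data Operations

mod r/m | Function of w Field (when w = 0) | (when w = 1)
11 000 | AL | EAX
11 001 | CL | ECX
11 010 | DL | EDX
11 011 | BL | EBX
11 100 | AH | ESP
11 101 | CH | EBP
11 110 | DH | ESI
11 111 | BH | EDI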
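Encoding of 32-bit Address Mode with “mod r/m” byte (no “s-i-b” byte present):

mod r/m | Effective Address
00 000 | DS:[EAX]
00 001 | DS:[ECX]
00 010 | DS:[EDX]
00 011 | DS:[EBX]
00 100 | s-i-b is present
00 101 | DS:d32
00 110 | DS:[ESI]
00 111 | DS:[EDI]

01 000 | DS:[EAX + d8]
01 001 | DS:[ECX + d8]
01 010 | DS:[EDX + d8]
01 011 | DS:[EBX + d8]
01 100 | s-i-b is present
01 101 | SS:[EBP + d8]
01 110 | DS:[ESI + d8]
01 111 | DS:[EDI + d8]

10 000 | DS:[EAX + d32]
10 001 | DS:[ECX + d32]
10 010 | DS:[EDX + d32]
10 011 | DS:[EBX + d32]
10 100 | s-i-b is present
10 101 | SS:[EBP + d32]
10 110 | DS:[ESI + d32]
10 111 | DS:[EDI + d32]

11 000 | register (see below)
11 001 | register (see below)
11 010 | register (see below)
11 011 | register (see below)
11 100 | register (see below)
11 101 | register (see below)
11 110 | register (see below)
11 111 | register (see below)


Register Specified by reg or r/m
during 16-Bit Data Operations:

mod r/m | function of w field (when w = 0) | (when w = 1)
11 000 | AL | AX
11 001 | CL | CX
11 010 | DL | DX
11 011 | BL | BX
11 100 | AH | SP
11 101 | CH | BP
11 110 | DH | SI
11 111 | BH | DI


Register Specified by reg or r/m
during 32-Bit Data Operations:

mod r/m | function of w field (when w = 0) | (when w = 1)
11 000 | AL | EAX
11 001 | CL | ECX
11 010 | DL | EDX
11 011 | BL | EBX
11 100 | AH | ESP
11 101 | CH | EBP
11 110 | DH | ESI
11 111 | BH | EDI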
Encoding of 32-bit Address Mode (“mod r/m” byte and “s-i-b” byte present):

mod base | Effective Address
00 000 | DS:[EAX + (scaled index)]
00 001 | DS:[ECX + (scaled index)]
00 010 | DS:[EDX + (scaled index)]
00 011 | DS:[EBX + (scaled index)]
00 100 | SS:[ESP + (scaled index)]
00 101 | DS:[d32 + (scaled index)]
00 110 | DS:[ESI + (scaled index)]
00 111 | DS:[EDI + (scaled index)]

01 000 | DS:[EAX + (scaled index) + d8]
01 001 | DS:[ECX + (scaled index) + d8]
01 010 | DS:[EDX + (scaled index) + d8]
01 011 | DS:[EBX + (scaled index) + d8]
01 100 | SS:[ESP + (scaled index) + d8]
01 101 | SS:[EBP + (scaled index) + d8]
01 110 | DS:[ESI + (scaled index) + d8]
01 111 | DS:[EDI + (scaled index) + d8]

10 000 | DS:[EAX + (scaled index) + d32]
10 001 | DS:[ECX + (scaled index) + d32]
10 010 | DS:[EDX + (scaled index) + d32]
10 011 | DS:[EBX + (scaled index) + d32]
10 100 | SS:[ESP + (scaled index) + d32]
10 101 | SS:[EBP + (scaled index) + d32]
10 110 | DS:[ESI + (scaled index) + d32]
10 111 | DS:[EDI + (scaled index) + d32]

ss | Scale Factor
00 | x1
01 | x2
10 | x4
11 | x8

index | Index Register
000 | EAX
001 | ECX
010 | EDX
011 | EBX
100 | no index reg**
101 | EBP
110 | ESI
111 | EDI

**IMPORTANT NOTE:
When index field is 100, indicating “no index register,” then
ss field MUST equal 00. If index is 100 and ss does not
equal 00, the effective address is undefined.

NOTE:
Mod field in “mod r/m” byte; ss, index, base fields in
“s-i-b” byte.
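
The s-i-b tables can be exercised with a short Python sketch that turns a mod value and an s-i-b byte into the effective-address form listed above. The function name `decode_sib` and its string output format are illustrative choices, not part of the data sheet; raising an error for the undefined case (index = 100 with ss other than 00) is likewise a choice made for this sketch.

```python
BASE_REGS = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]
INDEX_REGS = ["EAX", "ECX", "EDX", "EBX", None, "EBP", "ESI", "EDI"]  # 100 = no index


def decode_sib(mod: int, sib: int) -> str:
    """Describe the effective address for mod (0, 1 or 2) and an s-i-b byte."""
    if mod not in (0, 1, 2):
        raise ValueError("an s-i-b byte is only present for mod = 00, 01 or 10")
    ss = (sib >> 6) & 0b11
    index = (sib >> 3) & 0b111
    base = sib & 0b111
    if index == 0b100 and ss != 0:
        raise ValueError("index = 100 requires ss = 00; effective address undefined")

    terms = []
    if mod == 0 and base == 0b101:
        segment = "DS"          # mod 00, base 101: d32 replaces the base register
        terms.append("d32")
    else:
        # ESP- and EBP-based forms default to the SS segment.
        segment = "SS" if base in (0b100, 0b101) else "DS"
        terms.append(BASE_REGS[base])
    if INDEX_REGS[index] is not None:
        terms.append(f"{INDEX_REGS[index]}*{1 << ss}")
    if mod == 1:
        terms.append("d8")
    elif mod == 2:
        terms.append("d32")
    return f"{segment}:[{' + '.join(terms)}]"


if __name__ == "__main__":
    print(decode_sib(0, 0x88))
    print(decode_sib(1, 0x24))
```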
9.2.3.5 ENCODING OF OPERATION DIRECTION
(d) FIELD


In many two-operand instructions the d field is
present to indicate which operand is considered the
source and which is the destination.


d | Direction of Operation

0 | Register/Memory <-- Register
  | “reg” Field Indicates Source Operand;
  | “mod r/m” or “mod ss index base” Indicates
  | Destination Operand

1 | Register <-- Register/Memory
  | “reg” Field Indicates Destination Operand;
  | “mod r/m” or “mod ss index base” Indicates
  | Source Operand
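
A minimal Python sketch of how the d field assigns operand roles. The function name `operand_roles` is hypothetical. The usage example assumes the register-to-register/memory MOV form, whose opcode byte has the layout 100010dw, so that d is bit 1 of the opcode.

```python
def operand_roles(d: int) -> dict:
    """Map the d field to which encoding field names the source and destination."""
    if d == 0:
        # Register/Memory <-- Register
        return {"source": "reg", "destination": "mod r/m"}
    # Register <-- Register/Memory
    return {"source": "mod r/m", "destination": "reg"}


if __name__ == "__main__":
    # Assumed example: MOV opcodes 89H (d = 0) and 8BH (d = 1), layout 100010dw.
    for opcode in (0x89, 0x8B):
        d = (opcode >> 1) & 1
        print(hex(opcode), operand_roles(d))
```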


9.2.3.6 ENCODING OF SIGN-EXTEND (s) FIELD


The s field occurs primarily in instructions with
immediate data fields. The s field has an effect only if
the size of the immediate data is 8 bits and is being
placed in a 16-bit or 32-bit destination.


s | Effect on Immediate Data8 | Effect on Immediate Data 16/32
0 | None | None
1 | Sign-Extend Data8 to Fill 16-Bit or 32-Bit Destination | None
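
The effect of s = 1 on an 8-bit immediate can be sketched in Python as a plain sign extension. The function name `sign_extend_imm8` is hypothetical; values are treated as unsigned bit patterns.

```python
def sign_extend_imm8(imm8: int, dest_bits: int) -> int:
    """Sign-extend an 8-bit immediate to fill a 16-bit or 32-bit destination."""
    if dest_bits not in (16, 32):
        raise ValueError("destination must be 16 or 32 bits")
    if not 0 <= imm8 <= 0xFF:
        raise ValueError("immediate must fit in 8 bits")
    if imm8 & 0x80:
        # Bit 7 set: fill every bit above bit 7 with ones.
        return imm8 | (((1 << dest_bits) - 1) & ~0xFF)
    return imm8


if __name__ == "__main__":
    print(hex(sign_extend_imm8(0xFE, 16)), hex(sign_extend_imm8(0xFE, 32)))
```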


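9.2.3.7 ENCODING OF CONDITIONAL TEST
(tttn) FIELD


For the conditional instructions (conditional jumps
and set on condition), tttn is encoded with n indicating
whether to use the condition (n = 0) or its negation (n = 1),
and ttt giving the condition to test.

Mnemonic | Condition | tttn
O | Overflow | 0000
NO | No Overflow | 0001
B/NAE | Below/Not Above or Equal | 0010
NB/AE | Not Below/Above or Equal | 0011
E/Z | Equal/Zero | 0100
NE/NZ | Not Equal/Not Zero | 0101
BE/NA | Below or Equal/Not Above | 0110
NBE/A | Not Below or Equal/Above | 0111
S | Sign | 1000
NS | Not Sign | 1001
P/PE | Parity/Parity Even | 1010
NP/PO | Not Parity/Parity Odd | 1011
L/NGE | Less Than/Not Greater or Equal | 1100
NL/GE | Not Less Than/Greater or Equal | 1101
LE/NG | Less Than or Equal/Not Greater Than | 1110
NLE/G | Not Less Than or Equal/Greater Than | 1111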
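
Splitting tttn into its ttt and n parts can be sketched in Python. The function name `decode_tttn` is hypothetical, and the condition names are the standard 8086-family condition encoding. The usage example assumes the short conditional jump opcodes, which carry tttn in their low four bits (for example 74H for jump-if-equal).

```python
# Condition tested for each ttt value (n = 0 form).
CONDITIONS = [
    "Overflow",        # ttt = 000
    "Below",           # ttt = 001
    "Equal/Zero",      # ttt = 010
    "Below or Equal",  # ttt = 011
    "Sign",            # ttt = 100
    "Parity",          # ttt = 101
    "Less Than",       # ttt = 110
    "Less or Equal",   # ttt = 111
]


def decode_tttn(tttn: int) -> tuple:
    """Return (condition name, negated) for a 4-bit tttn field."""
    ttt = (tttn >> 1) & 0b111
    n = tttn & 1
    return CONDITIONS[ttt], bool(n)


if __name__ == "__main__":
    # Assumed example: low nibble of the short conditional jump opcodes 74H and 75H.
    print(decode_tttn(0x74 & 0xF), decode_tttn(0x75 & 0xF))
```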


9.2.3.8 ENCODING OF CONTROL OR DEBUG
OR TEST REGISTER (eee) FIELD


For the loading and storing of the Control, Debug
and Test registers.


When Interpreted as Control Register Field

eee Code | Reg Name
000 | CR0
010 | CR2
011 | CR3

Do not use any other encoding


When Interpreted as Debug Register Field

eee Code | Reg Name
000 | DR0
001 | DR1
010 | DR2
011 | DR3
110 | DR6
111 | DR7

Do not use any other encoding


When Interpreted as Test Register Field

eee Code | Reg Name
110 | TR6
111 | TR7

Do not use any other encoding
DATA SHEET REVISION REVIEW


The following list represents key differences between this and the -001 version of the 386™ SX microprocessor
data sheet. Please review this summary carefully.


The section significantly revised since version -002 is:

Section 1.0   Figure 1.1 was modified to also give pin names. Table 1.1 was modified to list pin names
              in alphabetical order.


The sections significantly revised since version -003 are:

Section 7.3   Table 7.3 modified to show new Icc values at 16 MHz and 20 MHz.

Section 7.4   Added 20 MHz A.C. Specifications in Table 7.5. Modified capacitive derating information in
              Tables 7.8 through 7.11. Modified typical Icc vs. frequency in Table 7.12.


The sections significantly revised since version -004 are:

Section 5.4   Added Section on FLT#.

Section 7.3   Table 7.3 modified to show the FLT# function and Tcase at 100°C.

Section 7.4   Changed T 14 to 4 ns. Deleted Figure 7.10.


The sections significantly revised since version -005 are:

Section 1.0   Pin Description was modified to add ONCE description in Symbol FLT# section.

Section 7.3   Table 7.3 modified to show Low Power 386 SX Icc Max value and typical value at
              different frequency.

Section 7.4   Merged 20 MHz and 16 MHz standard 386 SX A.C. specification in Table 7.4.
              Added Low Power 386 SX 20 MHz, 16 MHz and 12 MHz A.C. Specification as Table 7.5.
387™ SX
MATH COPROCESSOR

■ Interfaces with 386™ SX Microprocessor
■ Expands 386 SX CPU Data Types to Include 32-, 64-, 80-Bit Floating Point, 32-, 64-Bit Integers and 18-Digit BCD Operands
■ High Performance 80-Bit Internal Architecture
■ Two to Three Times 8087/80287 Performance at Equivalent Clock Speed
■ Implements ANSI/IEEE Standard 754-1985 for Binary Floating-Point Arithmetic
■ Fully compatible with the 387™ Math Coprocessor. Implements all 387 NPX architectural enhancements over 8087 and 80287.
■ Upward Object-Code Compatible from 8087 and 80287
■ Directly Extends 386 SX CPU Instruction Set to Trigonometric, Logarithmic, Exponential, and Arithmetic Instructions for All Data Types
■ Full-Range Transcendental Operations for SINE, COSINE, TANGENT, ARCTANGENT, and LOGARITHM
■ Operates Independently of Real, Protected, and Virtual-8086 Modes of the 386 SX Microprocessor
■ Eight 80-Bit Numeric Registers, Usable as Individually Addressable General Registers or as a Register Stack
■ Available in a 68-pin PLCC Package (see Packaging Specs: Order #231369)


The Intel 387™ SX Math CoProcessor is an extension to the Intel 386™ microprocessor architecture. The
combination of the 387 SX with the 386™ SX Microprocessor dramatically increases the processing speed of
computer application software which utilizes mathematical operations. This makes an ideal computer
workstation platform for applications such as financial modeling and spreadsheets, CAD/CAM, or graphics.


The 387 SX Math CoProcessor adds over seventy mnemonics to the 386 SX Microprocessor instruction set. 
Specific 387 SX math operations include logarithmic, arithmetic, exponential, and trigonometric functions.
The 387 SX supports integer, extended integer, floating point and BCD data formats, and fully conforms to the 
ANSI/IEEE floating point standard. 


The 387 SX Math CoProcessor is object code compatible with the 387™ DX and upward object code compati- 
ble from the 80287 and 8087 Math CoProcessors. The 387 SX is manufactured with Intel’s CHMOS Il 
technology and packaged in a 68-pin PLCC package. A low power consumption option allows use in laptop or 
portable applications.


[Block diagram: BUS CONTROL LOGIC; DATA INTERFACE AND CONTROL UNIT (data bus interface, data alignment and operand checking); FLOATING POINT UNIT (status word, control word, tag word, exponent adder); internal data bus.]
Figure 1-1. 386™ SX Microprocessor and 387™ SX Math Coprocessor Register Set


1.0 FUNCTIONAL DESCRIPTION 


The 387™ SX Math Coprocessor Extension (NPX)
provides arithmetic instructions for a variety of nu- 
meric data types. It also executes numerous built-in 
transcendental functions (e.g. tangent, sine, cosine, 
and log functions). The 387 SX NPX effectively ex- 
tends the register and instruction set of its CPU for 
existing data types and adds several new data types 
as well. Figure 1-1 shows the model of registers visi- 
ble to 386™ SX Microprocessor and 387 SX Math 
Coprocessor applications programs. Essentially, the 
387 SX Math Coprocessor can be treated as an ad- 
ditional resource or an extension to the 386 SX Mi- 
croprocessor. The 386 SX Microprocessor together 
with a 387 SX NPX can be used as a single unified 
system, the 386 SX Microprocessor and 387 SX 
Math Coprocessor.


The 387 SX Numerics Coprocessor Extension works 
the same whether the CPU is executing in real-ad- 
dress mode, protected mode, or virtual-8086 mode. 
All references to memory for numerics data or status 
information are performed by the CPU, and there- 
fore obey the memory-management and protection 
rules of the CPU mode currently in effect. The 387 
SX Numerics Coprocessor Extension merely oper- 
ates on instructions and values passed to it by the 


CPU and therefore is not sensitive to the processing 
mode of the CPU. 


In real-address mode and virtual-8086 mode, the 
386 SX Microprocessor and 387 SX Math Coproces- 
sor is completely upward compatible with software 
for the 8086/8087 and 80286/80287 real-address 
mode systems.


In protected mode, the 386 SX Microprocessor and 
387 SX Math Coprocessor is completely upward 
compatible with software for the 80286/80287 pro- 
tected mode system. 


In all modes, the 386 SX Microprocessor and 387 
SX Math Coprocessor is completely compatible with 
software for the 386™ Microprocessor/387™ Math 
Coprocessor system. 


The only difference of operation that may appear
when 8086/8087 programs are ported to the pro- 
tected-mode 386 SX Microprocessor and 387 SX 
Math Coprocessor system (not using virtual-8086 
mode) is in the format of operands for the adminis- 
trative instructions FLDENV, FSTENV, FRSTOR, 
and FSAVE. These instructions are normally used
only by exception handlers and operating systems, 

not by applications programs. 
2.0 PROGRAMMING INTERFACE

The 387 SX NPX adds to a 386 SX Microprocessor


system additional data types, registers, instructions, 
and interrupts specifically designed to facilitate high- 
speed numerics processing. To use the 387 SX NPX 
requires no special programming tools, because all 
new instructions and data types are directly support- 
ed by the assembler and compilers for high-level 
languages. All 386 Microprocessor development 
tools that support 387 NPX programs can also be 
used to develop software for the 386 SX Microproc- 
essor and 387 SX Math Coprocessor. All 8086/8088 
development tools that support the 8087 can also 
be used to develop software for the 386 SX Micro- 
processor and 387 SX Math Coprocessor in real-ad- 
dress mode or virtual-8086 mode. All 80286 devel- 
opment tools that support the 80287 can also be 
used to develop software for the 386 SX Microproc- 
essor and 387 SX Math Coprocessor. 


The 387 SX NPX supports all 387 NPX instructions. 
The 386 SX Microprocessor and 387 SX Math Co- 
processor supports all the same programs and gives 
the same results as a 386 Microprocessor and 387
Math Coprocessor. 


All communication between the CPU and the NPX is 
transparent to applications software. The CPU auto- 
matically controls the NPX whenever a numerics in- 
struction is executed. All physical memory and virtu- 
al memory of the CPU are available for storage of 
the instructions and operands of programs that use 
the NPX. All memory addressing modes, including 
use of displacement, base register, index register, 
and scaling, are available for accessing numerics
operands. 


Section 7 at the end of this data sheet lists by class 
the instructions that the 387 SX NPX adds to the 
instruction set of a 386 SX Microprocessor system.
2.1 Data Types


Table 2-1 lists the seven data types that the NPX 
supports and presents the format for each type.
Operands are stored in memory with the least
significant digit at the lowest memory address. Programs
retrieve these values by generating the lowest ad- 
dress. For maximum system performance, all oper- 
ands should start at physical-memory addresses 
that correspond to the word size of the CPU; oper- 
ands may begin at any other addresses, but will re- 
quire extra memory cycles to access the entire oper- 
and. 


Internally, the NPX holds all numbers in the extend- 
ed-precision real format. Instructions that load oper- 
ands from memory automatically convert operands 
represented in memory as 16-, 32-, or 64-bit inte- 
gers, 32- or 64-bit floating-point numbers, or 18-digit 
packed BCD numbers into extended-precision real 
format. Instructions that store operands in memory
perform the inverse type conversion. 


2.2 Numeric Operands 


A typical NPX instruction accepts one or two oper- 
ands and produces one (or sometimes two) results. 
In two-operand instructions, one operand is the con- 
tents of an NPX register, while the other may be a 
memory location. The operands of some instructions 
are predefined; for example, FSQRT always takes 
the square root of the number in the top stack
element.
Table 2-1. 387™ SX NPX Data Type Representation in Memory

Most Significant Byte = HIGHEST ADDRESSED BYTE

Data Format | Range | Precision | Representation
16-bit integer | | | two's complement
32-bit integer | | | two's complement
Long Integer | ±10^18 | | two's complement
Packed BCD | | | sign and magnitude
Single Precision | | | sign, biased exponent, significand
Double Precision | | | sign, biased exponent, significand
Extended Precision | | | sign, biased exponent, significand (bit positions 79, 64, 63 and 0 marked)
NOTES:
(1) S = Sign bit (0 = positive, 1 = negative)
(2) d_n = Decimal digit (two per byte)
(3) X = Bits have no significance; NPX ignores when loading, zeros when storing
(4) ▲ = Position of implicit binary point
(5) I = Integer bit of significand; stored in temporary real, implicit in single and double precision
(6) Exponent Bias (normalized values):
Single: 127 (7FH)
Double: 1023 (3FFH)
Extended Real: 16383 (3FFFH)
(7) Packed BCD: $(-1)^S (D_{17} \ldots D_0)$
(8) Real: $(-1)^S (2^{E-BIAS}) (F_0 . F_1 \ldots)$
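
Note (8) can be checked with a short Python sketch that decodes a single-precision real stored in memory, using the bias of 127 from note (6) and the implicit integer bit from note (5). The function name `decode_single` is hypothetical, and the sketch handles normalized values only; the 1-bit sign, 8-bit exponent and 23-bit fraction split is the standard single-precision layout.

```python
import struct


def decode_single(raw: bytes) -> float:
    """Decode 4 bytes (least significant byte at the lowest address) per note (8)."""
    (bits,) = struct.unpack("<I", raw)
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF   # biased exponent E
    fraction = bits & 0x7FFFFF       # F1... ; the integer bit F0 is implicit
    if exponent in (0, 0xFF):
        raise ValueError("only normalized values are handled in this sketch")
    significand = 1 + fraction / 2 ** 23
    return (-1) ** sign * 2.0 ** (exponent - 127) * significand


if __name__ == "__main__":
    for value in (1.0, -2.5, 0.15625):
        print(value, decode_single(struct.pack("<f", value)))
```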




2.3 Register Set 


Figure 1-1 shows the 387 SX NPX register set.
When an NPX is present in a system, programmers
may use these registers in addition to the registers 
normally available on the CPU. 


2.3.1 DATA REGISTERS 


387 SX NPX computations use the NPX’s data registers.
These eight 80-bit registers provide the equivalent
capacity of twenty 32-bit registers. Each data register
is divided into “fields” corresponding to the NPX’s
extended-precision real data type.


The NPX register set can be accessed either as a 
stack, with instructions operating on the top one or 
two stack elements, or as individually addressable 
registers. The TOP field in the status word identifies 
the current top-of-stack register. A “push” operation 
decrements TOP by one and loads a value into the 
new top register. A “pop” operation stores the value 
from the current top register and then increments 
TOP by one. The NPX register stack grows “down” 
toward lower-addressed registers. 


Instructions may address the data registers either
implicitly or explicitly. Many instructions operate on
the register at the TOP of the stack. These instructions
implicitly address the register at which TOP points.
Other instructions allow the programmer to explicitly
specify which register to use. This explicit register
addressing is also relative to TOP.
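
The push, pop, and TOP-relative addressing rules above can be sketched as a small Python model. The class and method names are hypothetical, and the model deliberately ignores tags and stack overflow/underflow detection; it only shows how TOP moves and how ST(i) maps to a physical register.

```python
class RegisterStack:
    """Toy model of the eight NPX data registers addressed relative to TOP."""

    def __init__(self):
        self.regs = [None] * 8  # None stands for an empty register
        self.top = 0            # TOP is zero after FNINIT or RESET

    def physical(self, i=0):
        """Physical register number addressed by ST(i)."""
        return (self.top + i) & 7

    def push(self, value):
        """Decrement TOP, then load the new top register."""
        self.top = (self.top - 1) & 7
        self.regs[self.top] = value

    def pop(self):
        """Read the current top register, then increment TOP."""
        value = self.regs[self.top]
        self.regs[self.top] = None
        self.top = (self.top + 1) & 7
        return value

    def st(self, i=0):
        """Contents of ST(i)."""
        return self.regs[self.physical(i)]
```

Starting from TOP = 0, the first push wraps TOP to register 7, which is why the stack is said to grow toward lower-addressed registers.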
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2.3.2 TAG WORD 


The tag word marks the content of each numeric
data register, as Figure 2-1 shows. Each two-bit tag
represents one of the eight data registers. The principal
function of the tag word is to optimize the NPX’s
performance and stack handling by making it possible to
distinguish between empty and nonempty register
locations. It also enables exception handlers to identify
special values (e.g. NaNs or denormals) in the contents
of a stack location without the need to perform complex
decoding of the actual data.


2.3.3 STATUS WORD 


The 16-bit status word (in the status register) shown 
in Figure 2-2 reflects the overall state of the NPX. It 
may be read and inspected by programs. 


Bit 15, the B-bit (busy bit), is included for 8087
compatibility only. It always has the same value as the
ES bit (bit 7 of the status word); it does not indicate
the status of the BUSY# output of the NPX.


Bits 13-11 (TOP) point to the NPX register that is
the current top-of-stack.


The four numeric condition code bits (C3-C0) are
similar to the flags in a CPU; instructions that perform
arithmetic operations update these bits to reflect the
outcome. The effects of these instructions on the
condition code are summarized in Tables 2-2 through 2-5.


15                                                            0
TAG(7)  TAG(6)  TAG(5)  TAG(4)  TAG(3)  TAG(2)  TAG(1)  TAG(0)


NOTE: 


The index i of tag(i) is not top-relative. A program typically uses the “top” field of Status Word to determine which tag(i) 
field refers to logical top of stack. 
TAG VALUES: 

00 = Valid 

01 = Zero 

10 = QNaN, SNaN, Infinity, Denormal and Unsupported Formats 

11 = Empty 


Figure 2-1. Tag Word 
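
A short Python sketch of reading the tag word follows. It assumes that tag(i) occupies bits 2i+1 and 2i of the word (tag(0) in the two low-order bits), and the function names are hypothetical. The second helper applies the note above: the tag index is a physical register number, so the TOP field must be added to find the tag of a stack-relative register.

```python
TAG_NAMES = {
    0b00: "Valid",
    0b01: "Zero",
    0b10: "Special",  # QNaN, SNaN, Infinity, Denormal and Unsupported Formats
    0b11: "Empty",
}


def decode_tag_word(tag_word):
    """Return the tag names of physical registers 0..7."""
    return [TAG_NAMES[(tag_word >> (2 * i)) & 0b11] for i in range(8)]


def tag_of_st(tag_word, top, i=0):
    """Tag of ST(i), given the TOP field taken from the status word."""
    return decode_tag_word(tag_word)[(top + i) & 7]
```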
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TOP 
15                                                              0
B   C3   TOP (3 bits)   C2   C1   C0   ES   SF   PE   UE   OE   ZE   DE   IE


ERROR SUMMARY STATUS 
STACK FLAG 


EXCEPTION FLAGS: 
PRECISION 
UNDERFLOW 
OVERFLOW 
ZERO DIVIDE 
DENORMALIZED OPERAND 
INVALID OPERATION 
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BUSY 
TOP OF STACK POINTER 
CONDITION CODE 
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ES is set if any unmasked exception bit is set; cleared otherwise. See Table 2-2 for interpretation of condition code. 


TOP values: 
000 = Register 0 is Top of Stack 
001 = Register 1 is Top of Stack 


111 = Register 7 is Top of Stack 
For definitions of exceptions, refer to the section entitled “Exception Handling” 


Figure 2-2. Status Word 


Bit 7 is the error summary (ES) status bit. This bit is 
set if any unmasked exception bit is set; it is clear 
otherwise. If this bit is set, the ERROR# signal is 
asserted. 


Bit 6 is the stack flag (SF). This bit is used to
distinguish invalid operations due to stack overflow or
underflow from other kinds of invalid operations. When
SF is set, bit 9 (C1) distinguishes between stack
overflow (C1 = 1) and underflow (C1 = 0).


Figure 2-2 shows the six exception flags in bits 5-0 
of the status word. Bits 5-0 are set to indicate that 
the NPX has detected an exception while executing 
an instruction. A later section entitled ‘Exception 
Handling’ explains how they are set and used. 
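
The status word fields described above can be unpacked as in the following Python sketch. The text gives the positions of B (bit 15), TOP (bits 13-11), C1 (bit 9), ES (bit 7), SF (bit 6) and the exception flags (bits 5-0); the positions used here for C3 (bit 14), C2 (bit 10) and C0 (bit 8), the two-letter flag abbreviations other than IE and PE, and the function names are assumptions of this sketch.

```python
FLAG_NAMES = ["IE", "DE", "ZE", "OE", "UE", "PE"]  # bits 0..5


def decode_status_word(sw):
    """Split a 16-bit status word into its fields."""
    return {
        "B": (sw >> 15) & 1,
        "C3": (sw >> 14) & 1,
        "TOP": (sw >> 11) & 7,
        "C2": (sw >> 10) & 1,
        "C1": (sw >> 9) & 1,
        "C0": (sw >> 8) & 1,
        "ES": (sw >> 7) & 1,
        "SF": (sw >> 6) & 1,
        "flags": [name for bit, name in enumerate(FLAG_NAMES) if (sw >> bit) & 1],
    }


def stack_fault(sw):
    """Return 'overflow' or 'underflow' for a stack fault, else None."""
    fields = decode_status_word(sw)
    if fields["SF"] and "IE" in fields["flags"]:
        return "overflow" if fields["C1"] else "underflow"
    return None
```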


Note that when a new value is loaded into the status 
word by the FLDENV or FRSTOR instruction, the 
value of ES (bit 7) and its reflection in the B-bit (bit 
15) are not derived from the values loaded from 
memory but rather are dependent upon the values of 
the exception flags (bits 5-0) in the status word and 
their corresponding masks in the control word. If ES 
is set in such a case, the ERROR# output of the 
NPX is activated immediately. 
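
The rule for ES and the B-bit after FLDENV or FRSTOR can be written out as a small Python sketch. The function names are hypothetical; the logic simply recomputes ES from the exception flags and the control-word masks and mirrors it into bit 15, ignoring whatever values bits 7 and 15 had in the memory image.

```python
def effective_es(status_word, control_word):
    """ES is 1 when any exception flag (bits 5-0) is set and not masked."""
    flags = status_word & 0x3F
    masks = control_word & 0x3F
    return 1 if (flags & ~masks & 0x3F) != 0 else 0


def apply_loaded_status(loaded_sw, control_word):
    """Status word as seen after loading: ES (bit 7) and B (bit 15) are derived."""
    sw = loaded_sw & ~0x8080 & 0xFFFF
    if effective_es(loaded_sw, control_word):
        sw |= 0x8080
    return sw
```

With all exceptions masked (control word 037FH), a loaded image with ES and B set but no exception flags yields a status word with both bits clear.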
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Table 2-2. Condition Code Interpretation 


Instruction               C0 (S)      C3 (Z)      C1 (A)          C2 (C)

FPREM, FPREM1             Three least significant bits            Reduction
(see Table 2-3)           of quotient                             0 = complete
                          Q2          Q1          Q0              1 = incomplete
                                                  or O/U#

FCOM, FCOMP, FCOMPP,      Result of comparison    Zero            Operand is not
FTST, FUCOM, FUCOMP,      (see Table 2-4)         or O/U#         comparable
FUCOMPP, FICOM, FICOMP                                            (Table 2-4)

FXAM                      Operand class           Sign            Operand class
                          (see Table 2-5)         or O/U#         (Table 2-5)

FCHS, FABS, FXCH,         UNDEFINED               Zero            UNDEFINED
FINCSTP, FDECSTP,                                 or O/U#
Constant loads,
FXTRACT, FLD,
FILD, FBLD,
FSTP (ext real)

FIST, FBSTP,              UNDEFINED               Roundup         UNDEFINED
FRNDINT, FST,                                     or O/U#
FSTP, FADD, FMUL,
FDIV, FDIVR,
FSUB, FSUBR,
FSCALE, FSQRT,
FPATAN, F2XM1,
FYL2X, FYL2XP1

FPTAN, FSIN,              UNDEFINED               Roundup         Reduction
FCOS, FSINCOS                                     or O/U#,        0 = complete
                                                  undefined       1 = incomplete
                                                  if C2 = 1

FLDENV, FRSTOR            Each bit loaded from memory

FLDCW, FSTENV,            UNDEFINED
FSTCW, FSTSW,
FCLEX, FINIT,
FSAVE


O/U#       When both IE and SF bits of status word are set, indicating a stack exception, this bit
           distinguishes between stack overflow (C1 = 1) and underflow (C1 = 0).


Reduction  If FPREM or FPREM1 produces a remainder that is less than the modulus, reduction is
           complete. When reduction is incomplete the value at the top of the stack is a partial
           remainder, which can be used as input to further reduction. For FPTAN, FSIN, FCOS, and
           FSINCOS, the reduction bit is set if the operand at the top of the stack is too large.
           In this case the original operand remains at the top of the stack.


Roundup When the PE bit of the status word is set, this bit indicates whether the last rounding in the 
instruction was upward. 


UNDEFINED Do not rely on finding any specific value in these bits. 
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Table 2-3. Condition Code Interpretation after FPREM and FPREM1 Instructions 


Condition Code        Interpretation after FPREM and FPREM1
C2   C3   C1   C0

1    X    X    X      Incomplete Reduction:
                      further iteration required
                      for complete reduction

0    Q1   Q0   Q2     Complete Reduction:
                      C0, C3, C1 contain the three least
                      significant bits of the quotient (Q MOD 8)
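
A Python sketch of reading the quotient bits after FPREM or FPREM1 follows. It assumes the usual Intel assignment C0 = Q2, C3 = Q1, C1 = Q0 and the bit positions C3 = bit 14, C2 = bit 10, C1 = bit 9, C0 = bit 8 of the status word; the function name is hypothetical.

```python
def fprem_quotient_bits(status_word):
    """Return Q MOD 8 after a complete reduction, or None if C2 says incomplete."""
    c0 = (status_word >> 8) & 1
    c1 = (status_word >> 9) & 1
    c2 = (status_word >> 10) & 1
    c3 = (status_word >> 14) & 1
    if c2:
        return None  # partial remainder: execute FPREM again
    return (c0 << 2) | (c3 << 1) | c1
```

A caller would typically loop on the remainder instruction until this function stops returning None, then use the three quotient bits, for example to select an octant in argument reduction.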


Table 2-4. Condition Code Resulting from Comparison 


Order              C3   C2   C0

TOP > Operand      0    0    0
TOP < Operand      0    0    1
TOP = Operand      1    0    0
Unordered          1    1    1
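
The comparison encoding can be decoded from a stored status word as in this Python sketch. The (C3, C2, C0) values used are the standard ones for the compare instructions, the bit positions (C3 = bit 14, C2 = bit 10, C0 = bit 8) are assumed, and the function name is hypothetical.

```python
COMPARE_RESULTS = {
    (0, 0, 0): "TOP > Operand",
    (0, 0, 1): "TOP < Operand",
    (1, 0, 0): "TOP = Operand",
    (1, 1, 1): "Unordered",
}


def compare_result(status_word):
    """Map C3, C2, C0 of the status word to a comparison outcome (None if not listed)."""
    c3 = (status_word >> 14) & 1
    c2 = (status_word >> 10) & 1
    c0 = (status_word >> 8) & 1
    return COMPARE_RESULTS.get((c3, c2, c0))
```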


Table 2.5. Condition Code Defining Operand Class 


C3   C2   C1   C0     Value at TOP

0    0    0    0      + Unsupported
0    0    0    1      + NaN
0    0    1    0      - Unsupported
0    0    1    1      - NaN
0    1    0    0      + Normal
0    1    0    1      + Infinity
0    1    1    0      - Normal
0    1    1    1      - Infinity
1    0    0    0      + 0
1    0    0    1      + Empty
1    0    1    0      - 0
1    0    1    1      - Empty
1    1    0    0      + Denormal
1    1    1    0      - Denormal
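
The FXAM classification can be expressed compactly because C1 carries the sign and C3, C2, C0 carry the class. The Python sketch below assumes the standard FXAM encoding and the status-word bit positions C3 = bit 14, C2 = bit 10, C1 = bit 9, C0 = bit 8; the function name is hypothetical.

```python
FXAM_CLASSES = {
    (0, 0, 0): "Unsupported",
    (0, 0, 1): "NaN",
    (0, 1, 0): "Normal",
    (0, 1, 1): "Infinity",
    (1, 0, 0): "Zero",
    (1, 0, 1): "Empty",
    (1, 1, 0): "Denormal",
}


def fxam_class(status_word):
    """Return (sign, class) from the condition code left by FXAM."""
    c3 = (status_word >> 14) & 1
    c2 = (status_word >> 10) & 1
    c1 = (status_word >> 9) & 1
    c0 = (status_word >> 8) & 1
    sign = "-" if c1 else "+"
    return sign, FXAM_CLASSES.get((c3, c2, c0))
```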
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a Gees ————— RESERVED 
RESERVED*
ROUNDING CONTROL


RESERVED 


EXCEPTION MASKS: 
PRECISION 
UNDERFLOW 
OVERFLOW 
ZERO DIVIDE
DENORMALIZED OPERAND
INVALID OPERATION


Precision Control
00 = 24 bits (single precision)
01 = (reserved)
10 = 53 bits (double precision)
11 = 64 bits (extended precision)


2.3.4 CONTROL WORD 


The NPX provides several processing options that 
are selected by loading a control word from memory 
into the control register. Figure 2-3 shows the format 
and encoding of fields in the control word.


The low-order byte of this control word configures 
exception masking. Bits 5-0 of the control word 
contain individual masks for each of the six excep- 
tions that the NPX recognizes. 


The high-order byte of the control word configures 
the NPX operating mode, including precision, round- 
ing, and infinity control. 


• The “infinity control bit” (bit 12) is not meaningful
to the 387 SX NPX, and programs must ignore its
value. To maintain compatibility with the 8087
and 80287, this bit can be programmed; however,
regardless of its value, the 387 SX NPX always
treats infinity in the affine sense (-∞ < +∞).
This bit is initialized to zero both after a hardware
reset and after the FINIT instruction.
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PRECISION CONTROL 


*"0" AFTER RESET OR FINIT;
CHANGEABLE UPON LOADING THE
CONTROL WORD (CW). PROGRAMS
MUST IGNORE THIS BIT.


Rounding Control
00 = Round to nearest or even
01 = Round down (toward -∞)
10 = Round up (toward +∞)
11 = Chop (truncate toward zero)


Figure 2-3. Control Word 


• The rounding control (RC) bits (bits 11-10) provide
for directed rounding and true chop, as well as the
unbiased round to nearest even mode specified in the
IEEE standard. Rounding control affects only those
instructions that perform rounding at the end of the
operation (and thus can generate a precision exception);
namely, FST, FSTP, FIST, all arithmetic instructions
(except FPREM, FPREM1, FXTRACT, FABS, and FCHS),
and all transcendental instructions.


• The precision control (PC) bits (bits 9-8) can be
used to set the NPX internal operating precision
of the significand at less than the default of 64
bits (extended precision). This can be useful in
providing compatibility with early generation
arithmetic processors of smaller precision. PC affects
only the instructions ADD, SUB, DIV, MUL, and
SQRT. For all other instructions, either the precision
is determined by the opcode or extended precision is
used.
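
The control word fields described in this section can be decoded as in the following Python sketch. The field positions come from the text (masks in bits 5-0, PC in bits 9-8, RC in bits 11-10); the mask abbreviations other than IM and the function names are assumptions of the sketch.

```python
ROUNDING = {0: "nearest or even", 1: "down", 2: "up", 3: "chop"}
PRECISION_BITS = {0: 24, 1: None, 2: 53, 3: 64}  # 01 is reserved
MASK_NAMES = ["IM", "DM", "ZM", "OM", "UM", "PM"]  # bits 0..5


def decode_control_word(cw):
    """Split a control word into rounding mode, precision, and masked exceptions."""
    return {
        "rounding": ROUNDING[(cw >> 10) & 3],
        "precision_bits": PRECISION_BITS[(cw >> 8) & 3],
        "masked": [name for bit, name in enumerate(MASK_NAMES) if (cw >> bit) & 1],
    }


def with_rounding(cw, rc):
    """Return cw with the RC field (bits 11-10) replaced by rc (0..3)."""
    return (cw & ~0x0C00 & 0xFFFF) | ((rc & 3) << 10)
```

Decoding 037FH gives round to nearest, 64-bit precision, and all six exceptions masked. A program that needs truncation would read the control word with FSTCW, change only the RC field as `with_rounding` does, and reload it with FLDCW.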




2.3.5 INSTRUCTION AND DATA POINTERS 


Because the NPX operates in parallel with the CPU,
any exceptions detected by the NPX may be reported
after the CPU has executed the ESC instruction
that caused them. To allow identification of the failing
numeric instruction, the 386 SX Microprocessor and
387 SX Math Coprocessor contain registers that
aid in diagnosis. These registers supply the address
of the failing instruction and the address of its
numeric memory operand (if appropriate).


The instruction and data pointers are provided for 
user-written exception handlers. These registers are 
actually located in the CPU, but appear to be located 
in the NPX because they are accessed by the ESC 
instructions FLDENV, FSTENV, FSAVE, and 
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FRSTOR. Whenever the CPU executes a new ESC 
instruction, it saves the address of the instruction 
(including any prefixes that may be present), the ad- 
dress of the operand (if present), and the opcode. 


The instruction and data pointers appear in one of
four formats depending on the operating mode of
the CPU (protected mode or real-address mode)
and on the operand-size attribute in effect (32-bit
operand or 16-bit operand). (See Figures 2-4, 2-5,
2-6, and 2-7.) The ESC instructions FLDENV, FSTENV,
FSAVE, and FRSTOR are used to transfer these values
between the registers and memory. Note that the value
of the data pointer is undefined if the prior ESC
instruction did not have a memory operand.


32-BIT PROTECTED MODE FORMAT


31          23          15          7          0
RESERVED                CONTROL WORD
RESERVED                STATUS WORD
RESERVED                TAG WORD


Figure 2-4. Instruction and Data Pointer Image in Memory, 32-bit Protected-Mode Format
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_ 16-BIT PROTECTED MODE FORMAT 
15          7          0
INSTRUCTION POINTER 15..0


Figure 2-6. Instruction and Data Pointer Image in Memory, 32-bit Real-Mode Format 


32-BIT REAL-ADDRESS MODE FORMAT 
31 23 


16-BIT REAL-ADDRESS MODE AND VIRTUAL 8086 MODE FORMAT 


15                    0
CONTROL WORD
STATUS WORD
TAG WORD


Figure 2-7. Instruction and Data Pointer Image in Memory, 16-bit Real-Mode Format 
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Table 2-6. CPU Interrupt Vectors Reserved for NPX 


Interrupt   Cause of Interrupt

7           An ESC instruction was encountered when EM or TS of CPU control register zero (CR0) was
            set. EM = 1 indicates that software emulation of the instruction is required. When TS is set,
            either an ESC or WAIT instruction causes interrupt 7. This indicates that the current NPX
            context may not belong to the current task.

9           In a protected-mode system, an operand of a coprocessor instruction wrapped around an
            addressing limit (0FFFFH for expand-up segments, zero for expand-down segments) and
            spanned inaccessible addresses(a). The failing numerics instruction is not restartable. The
            address of the failing numerics instruction and data operand may be lost; an FSTENV does not
            return reliable addresses. The segment overrun exception should be handled by executing an
            FNINIT instruction (i.e. an FINIT without a preceding WAIT). The exception can be avoided by
            never allowing numerics operands to cross the end of a segment.

13          In a protected-mode system, the first word of a numeric operand is not entirely within the limit
            of its segment. The return address pushed onto the stack of the exception handler points at the
            ESC instruction that caused the exception, including any prefixes. The NPX has not executed
            this instruction; the instruction pointer and data pointer register refer to a previous, correctly
            executed instruction.

16          The previous numerics instruction caused an unmasked exception. The address of the faulty
            instruction and the address of its operand are stored in the instruction pointer and data pointer
            registers. Only ESC and WAIT instructions can cause this interrupt. The CPU return address
            pushed onto the stack of the exception handler points to a WAIT or ESC instruction (including
            prefixes). This instruction can be restarted after clearing the exception condition in the NPX.
            FNINIT, FNCLEX, FNSTSW, FNSTENV, and FNSAVE cannot cause this interrupt.


a. An operand may wrap around an addressing limit when the segment limit is near an addressing limit and the operand is
near the largest valid address in the segment. Because of the wrap-around, the beginning and ending addresses of such an
operand will be at opposite ends of the segment. There are two ways that such an operand may also span inaccessible
addresses: 1) if the segment limit is not equal to the addressing limit (e.g. addressing limit is FFFFH and segment limit is
FFFDH) the operand will span addresses that are not within the segment (e.g. an 8-byte operand that starts at valid offset
FFFCH will span addresses FFFC-FFFFH and 0000-0003H; however addresses FFFEH and FFFFH are not valid, because
they exceed the limit); 2) if the operand begins and ends in present and accessible segments but intermediate bytes of the
operand fall in a not-present page or in a segment or page to which the procedure does not have access rights.


2.4 Interrupt Description 


CPU interrupts are used to report exceptional condi- 
tions while executing numeric programs in either real 
or protected mode. Table 2-6 shows these interrupts 
and their functions. 


2.5 Exception Handling 


The NPX detects six different exception conditions 
that can occur during instruction execution. Table 2- 
7 lists the exception conditions in order of prece- 
dence, showing for each the cause and the default 
action taken by the NPX if the exception is masked 
by its corresponding mask bit in the control word. 


Any exception that is not masked by the control
word sets the corresponding exception flag of the
status word, sets the ES bit of the status word, and
asserts the ERROR# signal. When the CPU attempts
to execute another ESC instruction or WAIT,
exception 16 occurs. The exception condition must
be resolved via an interrupt service routine. The return
address pushed onto the CPU stack upon entry
to the service routine does not necessarily point to
the failing instruction nor to the following instruction.
The CPU saves the address of the floating-point
instruction that caused the exception and the address
of any memory operand required by that instruction.


2.6 Initialization 


After FNINIT or RESET, the control word contains
the value 037FH (all exceptions masked, precision
control 64 bits, rounding to nearest), the same values
as in an 80287 after RESET. For compatibility with
the 8087 and 80287, the bit that used to indicate
infinity control (bit 12) is set to zero; however,
regardless of its setting, infinity is treated in the affine
sense. After FNINIT or RESET, the status word is
initialized as follows:


• All exceptions are set to zero.


• Stack TOP is zero, so that after the first push the
stack top will be register seven (111B).


• The condition code C3-C0 is undefined.
• The B-bit is zero.
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_ Table 2-7. Exceptions 


Exception       Cause                                          Default Action
                                                               (if exception is masked)

Invalid         Operation on a signalling NaN, unsupported     Result is a quiet NaN,
Operation       format, indeterminate form (0 × ∞, 0/0,        integer indefinite, or
                (+∞) + (-∞), etc.), or stack overflow/         BCD indefinite
                underflow (SF is also set)

Denormalized    At least one of the operands is denormalized,  Normal processing
Operand         i.e., it has the smallest exponent but a       continues
                nonzero significand.

Zero Divisor    The divisor is zero while the dividend is a    Result is ∞
                noninfinite, nonzero number

Overflow        The result is too large in magnitude to fit    Result is largest finite
                in the specified format                        value or ∞

Underflow       The true result is nonzero but too small to    Result is denormalized
                be represented in the specified format, and,   or zero
                if underflow exception is masked,
                denormalization causes the loss of accuracy.

Inexact Result  The true result is not exactly representable   Normal processing
(Precision)     in the specified format (e.g. 1/3); the        continues
                result is rounded according to the rounding
                mode.

The tag word contains FFFFH (all stack locations
are empty).


The 386 SX Microprocessor and 387 SX Math Coprocessor
initialization software must execute an
FNINIT instruction (i.e. an FINIT without a preceding
WAIT) after RESET. The FNINIT is not strictly required
for the 80287 software, but Intel recommends
its use to help ensure upward compatibility with other
processors. After a hardware RESET, the ERROR#
output is asserted to indicate that a 387 SX
NPX is present. To accomplish this, the IE and ES
bits of the status word are set, and the IM bit in the
control word is cleared. After FNINIT, the status
word and the control word have the same values as
in an 80287 after RESET.


2.7 8087 and 80287 Compatibility 


This section summarizes the differences between 
the 387 SX NPX and the 80287. Any migration from 
the 8087 directly to the 387 SX NPX must also take 
into account the differences between the 8087 and 
the 80287 as listed in Appendix A. 


Many changes have been designed into the 387 SX
NPX to directly support the IEEE standard in hardware.
These changes result in increased performance by
eliminating the need for software that supports the
standard.


The true result is not exactly representable in the 
specified format (e.g. 1/3); the result is rounded 


Normal processing 
continues 


2.7.1 GENERAL DIFFERENCES


The 387 SX NPX supports only affine closure for
infinity arithmetic, not projective closure.


Operands for FSCALE and FPATAN are no longer
restricted in range (except for ±∞); F2XM1 and
FPTAN accept a wider range of operands.


Rounding control is in effect for FLD constant.


Software cannot change entries of the tag word to
values (other than empty) that differ from actual
register contents.


After reset, FINIT, and incomplete FPREM, the 387
SX NPX resets to zero the condition code bits C3-C0
of the status word.


In conformance with the IEEE standard, the 387 SX
NPX does not support the special data formats
pseudozero, pseudo-NaN, pseudoinfinity, and unnormal.


The denormal exception has a different purpose on
the 387 SX NPX. A system that uses the denormal
exception handler solely to normalize denormal
operands should instead mask the denormal exception
on the 387 SX NPX. The 387 SX NPX automatically
normalizes denormal operands when the denormal
exception is masked.
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2.7.2 EXCEPTIONS 


A number of differences exist due to changes in the 
IEEE standard and to functional improvements to 
the architecture of the 387 SX NPX: 


1. When the overflow or underflow exception is 
masked, the 387 SX NPX differs from the 80287 
in rounding when overflow or underflow occurs. 
The 387 SX NPX produces results that are con- 
sistent with the rounding mode. 


2. When the underflow exception is masked, the
387 SX NPX sets its underflow flag only if there
is also a loss of accuracy during denormalization.


3. Fewer invalid-operation exceptions occur due to
denormal operands, because the instructions
FSQRT, FDIV, FPREM, and conversions to BCD
or to integer normalize denormal operands before
proceeding.


4. The FSQRT, FBSTP, and FPREM instructions 
may cause underflow, because they support de- 
normal operands. 


5. The denormal exception can occur during the 
transcendental instructions and the FXTRACT 
instruction. 


6. The denormal exception no longer takes prece- 
dence over all other exceptions. 


7. When the denormal exception is masked, the 
387 SX NPX automatically normalizes denormal 
operands. The 8087/80287 performs unnormal 
arithmetic, which might produce an unnormal re- 
sult. 


8. When the operand is zero, the FXTRACT instruction
reports a zero-divide exception and
leaves -∞ in ST(1).


9. The status word has a new bit (SF) that signals 
when invalid-operation exceptions are due to 
stack underflow or overflow. 


10. FLD extended precision no longer reports denormal
exceptions, because the instruction is not
numeric.


11. FLD single/double precision when the operand 
is denormal converts the number to extended 
precision and signals the denormalized operand 
exception. When loading a signalling NaN, FLD 
single/double precision signals an invalid-oper- 
and exception. 


12. The 387 SX NPX only generates quiet NaNs (as
on the 80287); however, the 387 SX NPX distinguishes
between quiet NaNs and signaling
NaNs. Signaling NaNs trigger exceptions when
they are used as operands; quiet NaNs do not
(except for FCOM, FIST, and FBSTP, which also
raise IE for quiet NaNs).


13. When stack overflow occurs during FPTAN and
overflow is masked, both ST(0) and ST(1) contain
quiet NaNs. The 80287/8087 leaves the
original operand in ST(1) intact.


14. When the scaling factor is ±∞, the FSCALE
(ST(0), ST(1)) instruction behaves as follows
(ST(0) and ST(1) contain the scaled and scaling
operands respectively):


• FSCALE(0, ∞) generates the invalid operation
exception.


• FSCALE(finite, -∞) generates zero with the
same sign as the scaled operand.


• FSCALE(finite, +∞) generates ∞ with the
same sign as the scaled operand.


The 8087/80287 returns zero in the first case
and raises the invalid-operation exception in the
other cases.
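
The three FSCALE cases in item 14 can be restated as a small Python sketch. The function name is hypothetical, the invalid-operation case is represented by returning None instead of raising an NPX exception, and the sketch assumes that the first bullet refers to a scaling factor of +∞ and that the scaled operand is finite.

```python
import math


def fscale_infinite_factor(scaled, factor):
    """Result of FSCALE on the 387 SX NPX when the scaling factor is +inf or -inf."""
    if not math.isinf(factor):
        raise ValueError("this sketch only covers an infinite scaling factor")
    if not math.isfinite(scaled):
        raise ValueError("this sketch only covers a finite scaled operand")
    if factor > 0:
        if scaled == 0:
            return None  # FSCALE(0, +inf): invalid operation
        return math.copysign(math.inf, scaled)
    return math.copysign(0.0, scaled)  # FSCALE(finite, -inf): signed zero
```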


15. The 387 SX NPX returns signed infinity/zero as 
the unmasked response to massive overflow/ 
underflow. The 8087 and 80287 support a limit- 
ed range for the scaling factor; within this range 
either massive overflow/underflow do not occur 
or undefined results are produced. 


3.0 HARDWARE INTERFACE 


In the following description of hardware interface,
the # symbol at the end of a signal name indicates
that the active or asserted state occurs when the
signal is at a low voltage. When no # is present after
the signal name, the signal is asserted when at the
high voltage level.


3.1 Signal Description 


In the following signal descriptions, the 387 SX NPX 
pins are grouped by function as shown by Table 3-1. 
Table 3-1 lists every pin by its identifier, gives a brief 
description of its function, and lists some of its char- 
acteristics (Refer to Figure 5-1 and Table 5-1 for pin 
configuration). 


INTERFACE SYNCHRONOUS 


ASYNCHRONOUS 


NUMERIC 
CORE 


386™ SX CPU          387™ SX NPX


NUMCLK2


Figure 3.1. Asynchronous Operation 
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Table 3-1. Pin Summary 


Pin Name     Function                           Active    Input/    Referenced
                                                State     Output    To...

Execution Control

CPUCLK2      386™ SX Microprocessor CLocK 2               I
NUMCLK2      NPX CLocK 2                                  I
CKM          NPX ClocKing Mode                            I
RESETIN      System reset                       High      I         CPUCLK2

PEREQ        Processor Extension REQuest        High      O         STEN/CPUCLK2
BUSY#        Busy status                        Low       O         STEN/CPUCLK2
ERROR#       Error status                       Low       O         STEN/NUMCLK2

Bus Interface

D15-D0       Data pins                          High      I/O       CPUCLK2
W/R#         Write/Read bus cycle               Hi/Lo     I         CPUCLK2
ADS#         ADdress Strobe                     Low       I         CPUCLK2
READY#       Bus ready input                    Low       I         CPUCLK2
READYO#      Ready output                       Low       O         STEN/CPUCLK2

STEN         STatus ENable                      High      I         CPUCLK2
NPS1#        NPX select #1                      Low       I         CPUCLK2
NPS2         NPX select #2                      High      I         CPUCLK2
CMD0#        CoMmanD                            Low       I         CPUCLK2

Power and Ground

Vcc          System power
Vss          System ground


All output signals are tristate; they leave floating 
state only when STEN is active. The output buffers 
of the bidirectional data pins D15-—D0 are also tri- 
state; they leave floating state only during cycles 
when the NPX is selected (i.e. when STEN, NPS1 #, 
and NPS2 are all active). 


3.1.1 386™ SX CPU CLOCK 2 (CPUCLK2) 


This input uses the CLK2 signal of the CPU to time
the bus control logic. Several other NPX signals are
referenced to the rising edge of this signal. When
CKM = 1 (synchronous mode) this pin also clocks
the data interface and control unit and the floating-point
unit of the NPX. This pin requires MOS-level
input. The signal on this pin is divided by two to
produce the internal clock signal CLK.


3.1.2 387™ SX NPX CLOCK 2 (NUMCLK2)


When CKM = 0 (asynchronous mode) this pin provides
the clock for the data interface and control unit
and the floating-point unit of the NPX. In this case,
the ratio of the frequency of NUMCLK2 to the
frequency of CPUCLK2 must lie within the range 10:16
to 14:10. When CKM = 1 (synchronous mode) signals
on this pin are ignored; CPUCLK2 is used instead
for the data interface and control unit and the
floating-point unit. This pin requires MOS-level input.
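
As a quick check of the asynchronous-mode constraint, the following Python sketch tests whether a pair of clock frequencies falls inside the 10:16 to 14:10 window. The function name is hypothetical, frequencies are taken as integers in hertz, and the sketch treats both ends of the range as allowed, which the text does not state explicitly.

```python
from fractions import Fraction

MIN_RATIO = Fraction(10, 16)  # NUMCLK2 : CPUCLK2 lower bound
MAX_RATIO = Fraction(14, 10)  # NUMCLK2 : CPUCLK2 upper bound


def numclk2_ratio_ok(numclk2_hz, cpuclk2_hz):
    """True if NUMCLK2/CPUCLK2 lies within 10:16 .. 14:10 (bounds included)."""
    ratio = Fraction(numclk2_hz, cpuclk2_hz)
    return MIN_RATIO <= ratio <= MAX_RATIO
```

With a 32 MHz CPUCLK2, for example, this accepts NUMCLK2 frequencies from 20 MHz to 44.8 MHz.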


3.1.3 CLOCKING MODE (CKM) 
This pin is a strapping option. When it is strapped to
Vcc (HIGH), the NPX operates in synchronous
mode; when strapped to Vss (LOW), the NPX operates
in asynchronous mode. These modes relate to
clocking of the data interface and control unit and
the floating-point unit only; the bus control logic
always operates synchronously with respect to the
CPU.


3.1.4 SYSTEM RESET (RESETIN)


A LOW to HIGH transition on this pin causes the 
NPX to terminate its present activity and to enter a 
dormant state. RESETIN must remain active (HIGH) 
for at least 40 NUMCLK2 periods. 


The HIGH to LOW transitions of RESETIN must be
synchronous with CPUCLK2, so that the phase of
the internal clock of the bus control logic (which is
the CPUCLK2 divided by two) is the same as the
phase of the internal clock of the CPU. After RESETIN
goes LOW, at least 50 NUMCLK2 periods must
pass before the first NPX instruction is written into
the NPX. This pin should be connected to the CPU
RESET pin. Table 3-2 shows the status of the output
pins during the reset sequence. After a reset, all
output pins return to their inactive states.
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Table 3-2. Output Pin Status during Reset 


Pin Value        Pin Name

HIGH             READYO#, BUSY#
LOW              PEREQ, ERROR#
Tri-State OFF    D15-D0


3.1.5 PROCESSOR EXTENSION REQUEST 
(PEREQ) 


When active, this pin signals to the CPU that the
NPX is ready for data transfer to/from its data FIFO.
When all data is written to or read from the data
FIFO, PEREQ is deactivated. This signal always
goes inactive before BUSY# goes inactive. This signal
is referenced to CPUCLK2. It should be connected
to the CPU PEREQ input.


3.1.6 BUSY STATUS (BUSY #) 


When active, this pin signals to the CPU that the 
NPX is currently executing an instruction. This signal 
is referenced to CPUCLK2. It should be connected 
to the CPU BUSY # pin. 


3.1.7 ERROR STATUS (ERROR #) 


This pin reflects the ES bit of the status register.
When active, it indicates that an unmasked exception
has occurred. This signal can be changed to
inactive state only by the following instructions
(without a preceding WAIT): FNINIT, FNCLEX,
FNSTENV, FNSAVE, FLDCW, FLDENV, and
FRSTOR. This pin is referenced to CPUCLK2. It
should be connected to the ERROR# pin of the
CPU.


3.1.8 DATA PINS (D15-D0) 


These bidirectional pins are used to transfer data 
and opcodes between the CPU and NPX. They are 
normally connected directly to the corresponding 
CPU data pins. HIGH state indicates a value of one. 
DO is the least significant data bit. Timings are refer- 
enced to CPUCLK2. 


3.1.9 WRITE/READ BUS CYCLE (W/R#) 


This signal indicates to the NPX whether the CPU 
bus cycle in progress is a read or a write cycle. This 
pin should be connected directly to the CPU’s 
W/R# pin. HIGH indicates a write cycle; LOW a 
read cycle. This input is ignored if any of the signals 
STEN, NPS1 #, or NPS2 is inactive. Setup and hold 
times are referenced to CPUCLK2. 
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3.1.10 ADDRESS STROBE (ADS #) 


This input, in conjunction with the READY # input, 
indicates when the NPX bus-control logic may sam- 
ple W/R# and the chip-select signals. Setup and 
hold times are referenced to CPUCLK2. This pin 
should be connected to the ADS# pin of the CPU. 


3.1.11 BUS READY INPUT (READY #) 


This input indicates to the NPX when a CPU bus 
cycle is to be terminated. It is used by the bus-con- 
trol logic to trace bus activities. Bus cycles can be 
extended indefinitely until terminated by READY #. 
This input should be connected to the same signal 
that drives the CPU’s READY # input. Setup and 
hold times are referenced to CPUCLK2. 


3.1.12 READY OUTPUT (READYO #) 


This pin is activated at such a time that write cycles
are terminated after two clocks (except FLDENV
and FRSTOR) and read cycles after three clocks. In
configurations where no extra wait states are required,
this pin must directly or indirectly drive the
READY# input of the CPU. Refer to the section entitled
“Bus Operation” for details. This pin is activated
only during bus cycles that select the NPX. This signal
is referenced to CPUCLK2.


3.1.13 STATUS ENABLE (STEN) 


This pin serves as a chip select for the NPX. When
inactive, this pin forces the BUSY#, PEREQ, ERROR#,
and READYO# outputs into floating state.
D15-D0 are normally floating; they leave floating
state only if STEN is active and additional conditions
are met. STEN also causes the chip to recognize its
other chip-select inputs. STEN makes it easier to do
on-board testing (using the overdrive method) of
other chips in systems containing the NPX. STEN
should be pulled up with a resistor so that it can be
pulled down when testing. In boards that do not use
on-board testing, STEN should be connected to
Vcc. Setup and hold times are relative to CPUCLK2.
Note that STEN must maintain the same setup and
hold times as NPS1#, NPS2, and CMD0# (i.e. if
STEN changes state during an NPX bus cycle, it
must change state during the same CLK period as
the NPS1#, NPS2, and CMD0# signals).


3.1.14 NPX SELECT 1 (NPS1#) 


When active (along with STEN and NPS2) in the first 
period of a CPU bus cycle, this signal indicates that 
the purpose of the bus cycle is to communicate with 
the NPX. This pin should be connected directly to
the M/IO# pin of the CPU, so that the NPX is selected
only when the CPU performs I/O cycles. Setup
and hold times are referenced to CPUCLK2.


3.1.15 NPX SELECT 2 (NPS2)


When active (along with STEN and NPS1#) in the
first period of a CPU bus cycle, this signal indicates
that the purpose of the bus cycle is to communicate
with the NPX. This pin should be connected directly
to the A23 pin of the CPU, so that the NPX is selected
only when the CPU issues one of the I/O addresses
reserved for the NPX (8000F8H, 8000FCH,
or 8000FEH, which is treated as 8000FCH by the
NPX). Setup and hold times are referenced to
CPUCLK2.


3.1.16 COMMAND (CMD0#)


During a write cycle, this signal indicates whether an
opcode (CMD0# active) or data (CMD0# inactive)
is being sent to the NPX. During a read cycle, it indicates
whether the control or status register (CMD0#
active) or a data register (CMD0# inactive) is being
read. CMD0# should be connected directly to the
A2 output of the CPU. Setup and hold times are
referenced to CPUCLK2.


3.1.17 SYSTEM POWER (Vcc)


System power provides the +5V DC supply input.
All Vcc pins should be tied together on the circuit
board and local decoupling capacitors should be
used between Vcc and Vss.


3.1.18 SYSTEM GROUND (Vss)


All Vss pins should be tied together on the circuit
board and local decoupling capacitors should be
used between Vcc and Vss.


3.2 System Configuration 


The 387 SX Math Coprocessor is designed to inter- 
face with the 386 SX Microprocessor as shown by 
Figure 3-1. A dedicated communication protocol 
makes possible high-speed transfer of opcodes and 
operands between the CPU and NPX. The 387 SX 
NPX is designed so that no additional components 
are required for interface with the CPU. Most control 
pins of the NPX are connected directly to pins of the 
CPU.


Figure 3-2. 386™ SX CPU and 387™ SX NPX System Configuration


The interface between the NPX and the CPU has
these characteristics:


• The NPX shares the local bus of the 386 SX
  Microprocessor.


• The CPU and NPX share the same reset signals.
  They may also share the same clock input;
  however, for greatest performance, an external
  oscillator may be needed.


• The corresponding BUSY#, ERROR#, and PEREQ
  pins are connected together.


• The NPX NPS1# and NPS2 inputs are connected
  to the latched CPU M/IO# and A23 outputs
  respectively. For coprocessor cycles, M/IO# is
  always LOW and A23 always HIGH.


• The NPX input CMD0# is connected to the latched
  A2 output. The 386 SX Microprocessor generates
  address 8000F8H when writing a command and
  address 8000FCH or 8000FEH (treated as
  8000FCH by the 387 SX NPX) when writing or
  reading data. It does not generate any other
  addresses during NPX bus cycles.
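

The address decoding described above can be illustrated with a short sketch. It is only an illustration in Python, not part of the device specification: the function name `npx_select_pins` and the dictionary layout are hypothetical, and STEN is assumed to be active. The sketch derives the levels that NPS1#, NPS2, and CMD0# would see when they are wired to M/IO#, A23, and A2 as recommended.

```python
# Illustrative sketch only; names are hypothetical, STEN is assumed active.
# NPS1# <- M/IO#, NPS2 <- A23, CMD0# <- A2, as recommended in the text.

NPX_PORTS = (0x8000F8, 0x8000FC, 0x8000FE)  # addresses reserved for the NPX


def npx_select_pins(address: int, m_io: int) -> dict:
    """Return the pin levels the NPX would see for one CPU bus cycle.

    address -- 24-bit address driven by the CPU
    m_io    -- level of M/IO# (0 = I/O cycle, 1 = memory cycle)
    """
    a23 = (address >> 23) & 1   # drives NPS2 (active HIGH)
    a2 = (address >> 2) & 1     # drives CMD0# (active LOW)
    return {
        "NPS1#": m_io,
        "NPS2": a23,
        "CMD0#": a2,
        # The NPX is selected only for I/O cycles with A23 set.
        "selected": m_io == 0 and a23 == 1,
    }


if __name__ == "__main__":
    for port in NPX_PORTS:
        print(hex(port), npx_select_pins(port, m_io=0))
```

With this wiring, 8000F8H yields CMD0# LOW (opcode or control/status transfer), while 8000FCH and 8000FEH both yield CMD0# HIGH (data transfer), which is why the NPX treats 8000FEH as 8000FCH.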


3.3 Processor Architecture 


As shown by the block diagram on the front page,
the 387 SX NPX is internally divided into three
sections: the bus control logic (BCL), the data interface
and control unit, and the floating point unit (FPU).


The FPU (with the support of the control unit which 
contains the sequencer and other support units) ex- 
ecutes all numerics instructions. The data interface 
and control unit is responsible for the data flow to 
and from the FPU and the control registers, for re- 
ceiving the instructions, decoding them, and se- 
quencing the microinstructions, and for handling 
some of the administrative instructions. The BCL is 
responsible for CPU bus tracking and interface. The 
BCL is the only unit in the NPX that must run syn- 
chronously with the CPU; the rest of the NPX can 
run asynchronously with respect to the CPU. 


3.3.1 BUS CONTROL LOGIC 


The BCL communicates solely with the CPU using
I/O bus cycles. The BCL appears to the CPU as a
special peripheral device. It is special in two
respects: the CPU initiates I/O automatically when it
encounters ESC instructions, and the CPU uses
reserved I/O addresses to communicate with the BCL.
The BCL does not communicate directly with memory.
The CPU performs all memory access, transferring
input operands from memory to the NPX and
transferring outputs from the NPX to memory.


3.3.2 DATA INTERFACE AND CONTROL UNIT 


The data interface and control unit latches the data 
and, subject to BCL control, directs the data to the 
FIFO or the instruction decoder. The instruction de- 
coder decodes the ESC instructions sent to it by the 
CPU and generates controls that direct the data flow 
in the FIFO. It also triggers the microinstruction se- 
quencer that controls execution of each instruction. 
If the ESC instruction is FINIT, FCLEX, FSTSW,
FSTSW AX, FSTCW, FSETPM, or FRSTPM, the
control unit executes it independently of the FPU and the
sequencer. The data interface and control unit
generates the BUSY#, PEREQ, and ERROR#
signals that synchronize NPX activities with
the CPU.


3.3.3 FLOATING-POINT UNIT 


The FPU executes all instructions that involve the 
register stack, including arithmetic, logical, transcen- 
dental, constant, and data transfer instructions. The 
data path in the FPU is 84 bits wide (68 significant 
bits, 15 exponent bits, and a sign bit) which allows 
internal operand transfers to be performed at very 
high speeds. 


3.4 Bus Cycles 


The pins STEN, NPS1#, NPS2, CMD0#, and W/R#
identify bus cycles for the NPX. Table 3-3 defines
the types of NPX bus cycles.


Table 3-3. Bus Cycle Definition

STEN | NPS1# | NPS2 | CMD0# | W/R# | Bus Cycle Type
  0  |   x   |  x   |   x   |  x   | NPX not selected and all outputs in floating state
  1  |   1   |  x   |   x   |  x   | NPX not selected
  1  |   x   |  0   |   x   |  x   | NPX not selected
  1  |   0   |  1   |   0   |  0   | CW or SW read from NPX
  1  |   0   |  1   |   0   |  1   | Opcode write to NPX
  1  |   0   |  1   |   1   |  0   | Data read from NPX
  1  |   0   |  1   |   1   |  1   | Data write to NPX

x = don't care


3.4.1 387™ SX NPX ADDRESSING


The NPS1#, NPS2, and CMD0# signals allow the
NPX to identify which bus cycles are intended for the
NPX. The NPX responds to I/O cycles when the I/O
address is 8000F8H, 8000FCH or 8000FEH (treated
as 8000FCH by the 387 SX NPX). The NPX
responds to I/O cycles when bit 23 of the I/O address
is set. In other words, the NPX acts as an I/O device
in a reserved I/O address space.


Because A23 is used to select the 387 SX Numerics
Coprocessor Extension for data transfers, it is not
possible for a program running on the CPU to
address the NPX with an I/O instruction. Only ESC
instructions cause the CPU to communicate with the
NPX.
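

As a rough illustration of how the five identification pins combine, the following Python sketch classifies a bus cycle from the pin levels sampled in the first CLK period. It is a sketch based on the pin descriptions in Section 3.1 (STEN active HIGH, NPS1# and CMD0# active LOW, NPS2 active HIGH, W/R# HIGH for writes); the function name `classify_npx_cycle` is hypothetical.

```python
# Illustrative sketch; the function name is hypothetical. Pin polarities follow
# the pin descriptions: STEN and NPS2 are active HIGH, NPS1# and CMD0# are
# active LOW, and W/R# is HIGH for a write cycle.

def classify_npx_cycle(sten: int, nps1_n: int, nps2: int,
                       cmd0_n: int, w_r: int) -> str:
    """Classify a bus cycle from the pin levels seen in the first CLK period."""
    if sten == 0:
        return "NPX not selected and all outputs in floating state"
    if nps1_n == 1 or nps2 == 0:
        return "NPX not selected"
    if cmd0_n == 0:
        # CMD0# active: opcode on writes, control/status word on reads.
        return "Opcode write to NPX" if w_r else "CW or SW read from NPX"
    # CMD0# inactive: data transfer.
    return "Data write to NPX" if w_r else "Data read from NPX"


if __name__ == "__main__":
    print(classify_npx_cycle(1, 0, 1, 0, 1))  # opcode write
    print(classify_npx_cycle(1, 0, 1, 1, 0))  # data read
```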


3.4.2 CPU/NPX SYNCHRONIZATION 


The pins BUSY #, PEREQ, and ERROR# are used 
for various aspects of synchronization between the 
CPU and the NPX. 


BUSY# is used to synchronize instruction transfer
from the CPU to the NPX. When the NPX recognizes
an ESC instruction, it asserts BUSY#. For most ESC
instructions, the CPU waits for the NPX to deassert
BUSY# before sending the new opcode.


The NPX uses the PEREQ pin of the CPU to signal 
that the NPX is ready for data transfer to or from its 
data FIFO. The NPX does not directly access mem- 
ory; rather, the CPU provides memory access serv- 
ices for the NPX. (For this reason, memory access 
on behalf of the NPX always obeys the protection 
rules applicable to the current CPU mode.) Once the 
CPU initiates an NPX instruction that has operands, 
the CPU waits for PEREQ signals that indicate when 
the NPX is ready for operand transfer. Once all oper- 
ands have been transferred (or if the instruction has 
no operands) the CPU continues program execution 
while the NPX executes the ESC instruction. 


In 8086/8087 systems, WAIT instructions may be
required to achieve synchronization of both commands
and operands. In the 386 SX Microprocessor
and 387 SX Math Coprocessor systems, however,
WAIT instructions are required only for operand
synchronization; namely, after NPX stores to memory
(except FSTSW and FSTCW) or loads from memory.
(In 80286/80287 systems, WAIT is required before
FLDENV and FRSTOR; with the 386 SX Microprocessor
and 387 SX Math Coprocessor, WAIT is not
required in these cases.) Used this way, WAIT
ensures that the value has already been written or read
by the NPX before the CPU reads or changes the
value.


Once it has started to execute a numerics instruction
and has transferred the operands from the CPU, the 
NPX can process the instruction in parallel with and 
independent of the host CPU. When the NPX de- 
tects an exception, it asserts the ERROR# signal, 
which causes a CPU interrupt. 


3.4.3 SYNCHRONOUS OR ASYNCHRONOUS MODES


The internal logic of the NPX (the FPU) can operate
either directly from the CPU clock (synchronous
mode) or from a separate clock (asynchronous
mode). The two configurations are distinguished by
the CKM pin. In either case, the bus control logic
(BCL) of the NPX is synchronized with the CPU
clock. Use of asynchronous mode allows the CPU
and the FPU section of the NPX to run at different
speeds. In this case, the ratio of the frequency of
NUMCLK2 to the frequency of CPUCLK2 must lie
within the range 10:16 to 14:10. Use of synchronous
mode eliminates one clock generator from the board
design.
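

The ratio constraint for asynchronous mode can be checked with a few lines of arithmetic. The following Python sketch is illustrative only; the function name `async_ratio_ok` is hypothetical, and exact fractions are used so that the boundary values 10:16 and 14:10 are not affected by floating-point rounding.

```python
from fractions import Fraction

# Limits from the text: NUMCLK2 / CPUCLK2 must lie within 10:16 to 14:10.
RATIO_MIN = Fraction(10, 16)
RATIO_MAX = Fraction(14, 10)


def async_ratio_ok(numclk2_hz: int, cpuclk2_hz: int) -> bool:
    """Return True if the NUMCLK2:CPUCLK2 frequency ratio is within range."""
    ratio = Fraction(numclk2_hz, cpuclk2_hz)
    return RATIO_MIN <= ratio <= RATIO_MAX


if __name__ == "__main__":
    # Example: a 40 MHz NUMCLK2 paired with a 32 MHz CPUCLK2 (ratio 1.25).
    print(async_ratio_ok(40_000_000, 32_000_000))
```

For example, with CPUCLK2 at 32 MHz the permitted NUMCLK2 range works out to 20 MHz through 44.8 MHz.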


3.4.4 AUTOMATIC BUS CYCLE TERMINATION 


In configurations where no extra wait states are
required, READYO# can drive the CPU’s READY#
input. If this pin is used, it should be connected to
the logic that ORs all READY outputs from peripherals
on the CPU bus. READYO# is asserted by the
NPX only during I/O cycles that select the NPX.
Refer to Section 4.0 “Bus Operation” for details.


4.0 BUS OPERATION 


With respect to bus interface, the 387 SX NPX is
fully synchronous with the CPU. Both operate at the
same rate, because each generates its internal CLK
signal by dividing CPUCLK2 by two. Furthermore,
both internal CLK signals are in phase, because they
are synchronized by the same RESETIN signal.


A bus cycle for the NPX starts when the CPU acti- 
vates ADS# and drives new values on the address 
and cycle-definition lines. The NPX examines the ad- 
dress and cycle-definition lines in the same CLK pe- 
riod during which ADS # is activated. This CLK peri- 
od is considered the first CLK of the bus cycle. 
During this first CLK period, the NPX also examines
the W/R# input signal to determine whether the cycle
is a read or a write cycle and examines the
CMD0# input to determine whether an opcode, operand,
or control/status register transfer is to occur.


The 387 SX NPX supports both pipelined (i.e.
overlapped) and nonpipelined bus cycles. A nonpipelined
cycle is one for which the CPU asserts ADS# when
no other NPX bus cycle is in progress. A pipelined
bus cycle is one for which the CPU asserts ADS#
and provides valid next-address and control signals
before the prior NPX cycle terminates. The CPU may
do this as early as the second CLK period after
asserting ADS# for the prior cycle. Pipelining increases
the availability of the bus by at least one CLK
period. The 387 SX NPX supports pipelined bus
cycles in order to optimize address pipelining by the
CPU for memory cycles.


Bus operation is described in terms of an abstract 
state machine. Figure 4-1 illustrates the states and 
state transitions for NPX bus cycles: 


• Ti is the idle state. This is the state of the bus
  logic after RESET, the state to which bus logic
  returns after every nonpipelined bus cycle, and
  the state to which bus logic returns after a series
  of pipelined cycles.


• Trs is the READY#-sensitive state. Different
  types of bus cycles may require a minimum of
  one or two successive Trs states. The bus logic
  remains in Trs state until READY# is sensed, at
  which point the bus cycle terminates. Any number
  of wait states may be implemented by delaying
  READY#, thereby causing additional successive
  Trs states.


• Tp is the first state for every pipelined bus cycle.
  This state is not used by nonpipelined cycles.


Note that the bus logic tracks bus state regardless 
of the values on the chip/port select pins. 


Figure 4-1. Bus State Diagram


The READYO# output of the NPX indicates when
an NPX bus cycle may be terminated if no extra wait
states are required. For all write cycles (except
those for the instructions FLDENV and FRSTOR),
READYO# is always asserted during the first Trs
state, regardless of the number of wait states. For all
read cycles and write cycles for FLDENV and
FRSTOR, READYO# is always asserted in the second
Trs state, regardless of the number of wait
states. These rules apply to both pipelined and
nonpipelined cycles. Systems designers may use
READYO# in one of the following ways:


1. Connect it (directly or through logic that ORs 
READY # signals from other devices) to the 
READY # inputs of the CPU and NPX. 


2. Use it as one input to a wait-state generator. 


The following sections illustrate different types of 
387 SX NPX bus cycles. Because different instruc- 
tions have different amounts of overhead before, be- 
tween, and after operand transfer cycles, it is not 
possible to represent in a few diagrams all of the 
combinations of successive operand transfer cycles. 
The following bus-cycle diagrams show memory cy- 
cles between NPX operand-transfer cycles. Note 
however that, during FRSTOR, some consecutive 
accesses to the NPX do not have intervening memo- 
ry accesses. For the timing relationship between op- 
erand transfer cycles and opcode write or other 
overhead activities, see the figure “Other Parameters”
in section 6.


4.1 Nonpipelined Bus Cycles 


Figure 4-2 illustrates bus activity for consecutive 
nonpipelined bus cycles. 


At the second clock of the bus cycle, the NPX enters 
the Trs state. During this state, it samples the 
READY # input and stays in this state as long as 
READY # is inactive. 


4.1.1 WRITE CYCLE 


In write cycles, the NPX drives the READYO# signal
for one CLK period during the second CLK period of
the cycle (i.e. the first Trs state); therefore, the fastest
write cycle takes two CLK periods (see cycle 2 of
Figure 4-2). For the instructions FLDENV and
FRSTOR, however, the NPX forces a wait state by
delaying the activation of READYO# to the second
Trs state (not shown in Figure 4-2).


The NPX samples the D15-D0 inputs into data 
latches at the falling edge of CLK as long as it stays 
in Trs state. 


When READY # is asserted, the NPX returns to the 
idle state. Simultaneously with the NPX’s entering 
the idle state, the CPU may assert ADS # again, sig- 
naling the beginning of yet another cycle. 


Cycles 1 & 2 represent part of the operand transfer cycle for instructions involving either 4-byte or 8-byte operand loads.
Cycles 3 & 4 represent part of the operand transfer cycle for a store operation.
*Cycles 1 & 2 could repeat here or Ti states for various non-operand transfer cycles and overhead.


Figure 4-2. Nonpipelined Read and Write Cycles


4.1.2 READ CYCLE


At the rising edge of CLK in the second CLK period
of the cycle (i.e. the first Trs state), the NPX starts
to drive the D15-D0 outputs and continues to drive
them as long as it stays in Trs state.


At least one wait state must be inserted to ensure
that the CPU latches the correct data. Because the
NPX starts driving the data bus only at the rising
edge of CLK in the second clock period of the bus
cycle, not enough time is left for the data signals to
propagate and be latched by the CPU before the
next falling edge of CLK. Therefore, the NPX does
not drive the READYO# signal until the third CLK
period of the cycle. Thus, if the READYO# output
drives the CPU’s READY# input, one wait state is
automatically inserted.


Because one wait state is required for NPX reads,
the minimum length of an NPX read cycle is three
CLK periods, as cycle 3 of Figure 4-2 shows.


When READY# is asserted, the NPX returns to the
idle state. Simultaneously with the NPX’s entering
the idle state, the CPU may assert ADS# again,
signaling the beginning of yet another cycle. The
transition from Trs state to idle state causes the NPX to
put the tristate D15-D0 outputs into the floating
state, allowing another device to drive the data bus.
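

The READYO# timing rules for nonpipelined cycles can be summarized in a small sketch. The Python below is an illustration only, with hypothetical function names; it assumes that READYO# drives READY# directly, so that no wait states beyond those forced by the NPX are added.

```python
# Illustrative sketch; function names are hypothetical. Assumes READYO#
# drives READY# directly (no additional external wait states).

DELAYED_WRITE_INSTRUCTIONS = {"FLDENV", "FRSTOR"}


def readyo_trs_index(is_write: bool, instruction: str = "") -> int:
    """Return which successive Trs state (1 or 2) carries READYO#."""
    if is_write and instruction.upper() not in DELAYED_WRITE_INSTRUCTIONS:
        return 1   # ordinary writes: READYO# in the first Trs state
    return 2       # reads, and FLDENV/FRSTOR writes: second Trs state


def min_cycle_clks(is_write: bool, instruction: str = "") -> int:
    """Minimum nonpipelined cycle length in CLK periods.

    One CLK for the ADS# period plus the Trs states up to READYO#.
    """
    return 1 + readyo_trs_index(is_write, instruction)


if __name__ == "__main__":
    print(min_cycle_clks(is_write=True))             # ordinary write
    print(min_cycle_clks(is_write=False))            # read
    print(min_cycle_clks(True, instruction="FRSTOR"))
```

This reproduces the figures stated in the text: two CLK periods for the fastest write, three for reads and for FLDENV and FRSTOR writes.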


4.2 Pipelined Bus Cycles 


Because all the activities of the NPX bus interface
occur either during the Trs state or during the
transitions to or from that state, the only difference
between a pipelined and a nonpipelined cycle is the
manner of changing from one state to another. The
exact activities during each state are detailed in the
previous section “Nonpipelined Bus Cycles”.


When the CPU asserts ADS# before the end of a
bus cycle, both ADS# and READY# are active during
a Trs state. This condition causes the NPX to
change to a different state named Tp. One clock
period after a Tp state, the NPX always returns to
Trs state. In consecutive pipelined cycles, the NPX
bus logic uses only the Trs and Tp states.


Cycle 1-Cycle 4 represent the operand transfer cycle for an instruction involving a transfer of two 32-bit loads in total.
The opcode write cycles and other overhead are not shown.
Note that the next cycle will be a pipelined cycle if both READY# and ADS# are sampled active at the end of a Trs
state of the current cycle.


Figure 4-3. Fastest Transitions to and from Pipelined Cycles


Figure 4-3 shows the fastest transitions into and out
of the pipelined bus cycles. Cycle 1 in the figure
represents a nonpipelined cycle. (Nonpipelined write
cycles with only one Trs state (i.e. no wait states)
are always followed by another nonpipelined cycle,
because READY# is asserted before the earliest
possible assertion of ADS# for the next cycle.)

Figure 4-4 shows pipelined write and read cycles 
with one additional Trs state beyond the minimum 
required. To delay the assertion of READY# re- 
quires external logic. 


4.3 Bus Cycles of Mixed Type 


When the NPX bus logic is in the Trs state, it
distinguishes between nonpipelined and pipelined cycles
according to the behavior of ADS# and READY#.
In a nonpipelined cycle, only READY# is activated,
and the transition is from Trs state to idle state. In a
pipelined cycle, both READY# and ADS# are active,
and the transition is first from Trs state to Tp
state, then, after one CLK period, back to Trs
state.


1. [...] between operand write to the NPX and storing result.


Figure 4-4. Pipelined Cycles with Wait States
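

The state transitions described in Sections 4.0 through 4.3 can be written down as a small state machine. The Python sketch below is a simplified illustration derived from the text, not a reproduction of Figure 4-1; the names `BusState`, `next_state`, and `trace` are hypothetical, and the boolean arguments are True when the corresponding active-low pin (ADS# or READY#) is asserted.

```python
from enum import Enum


class BusState(Enum):
    TI = "Ti"     # idle
    TRS = "Trs"   # READY#-sensitive
    TP = "Tp"     # first state of a pipelined cycle


def next_state(state: BusState, ads: bool, ready: bool) -> BusState:
    """Return the bus state for the next CLK period.

    ads / ready are True when ADS# / READY# are sampled asserted (LOW).
    """
    if state is BusState.TI:
        # A cycle starts when the CPU activates ADS#.
        return BusState.TRS if ads else BusState.TI
    if state is BusState.TP:
        # One clock period after Tp, the NPX always returns to Trs.
        return BusState.TRS
    # state is Trs: remain here until READY# is sensed.
    if not ready:
        return BusState.TRS
    # READY# alone ends a nonpipelined cycle; READY# with ADS# pipelines.
    return BusState.TP if ads else BusState.TI


def trace(samples, start: BusState = BusState.TI) -> list:
    """Apply a sequence of (ads, ready) samples and list the states visited."""
    states = [start]
    for ads, ready in samples:
        states.append(next_state(states[-1], ads, ready))
    return [s.value for s in states]


if __name__ == "__main__":
    # Nonpipelined cycle with no wait states.
    print(trace([(True, False), (False, True)]))
    # One wait state, then a pipelined follow-on cycle.
    print(trace([(True, False), (False, False), (True, True),
                 (False, False), (False, True)]))
```

As in the text, the sketch tracks bus state without regard to the chip-select pins; selecting the NPX is a separate decision made in the first CLK period of each cycle.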


4.4 BUSY# and PEREQ Timing Relationship


Figure 4-5 shows the activation of BUSY# at the
beginning of instruction execution and its deactivation
upon completion of the instruction. PEREQ is
activated within this interval. If ERROR# (not shown
in the figure) is ever asserted, it would be asserted at
least six CPUCLK2 periods after the deactivation of
PEREQ and would be deasserted at least six
CPUCLK2 periods before the deactivation of
BUSY#. Figure 4-5 also shows that STEN is activated
at the beginning of an NPX bus cycle.


2. PEREQ is an asynchronous input to the 386™ Microprocessor; it may not be asserted (instruction dependent).
3. More operand transfers.
4. Memory read (operand) cycle is not shown.


Figure 4-5. STEN, BUSY#, and PEREQ Timing Relationships


5.0 PACKAGE THERMAL SPECIFICATIONS


The 387 SX Math Coprocessor is specified for operation
when case temperature is within the range of
0°C-100°C. The case temperature may be measured
in any environment, to determine whether the
387 SX Math Coprocessor is within specified operating
range. The case temperature should be measured
at the center of the top surface opposite the
pins.


The ambient temperature is guaranteed as long as
TC is not violated. The ambient temperature can be
calculated from θJC and θJA using the following
equations:


TJ = TC + P × θJC
TA = TJ − P × θJA
TC = TA + P × [θJA − θJC]


Values for θJC and θJA are given in Table 5-1 for the
68-pin PLCC. θJA is given at various airflows. Table
5-2 shows the maximum TA allowable (without
exceeding TC) at various airflows. Note that TA can be
improved further by attaching “fins” or a “heat sink” to
the package. P is calculated by using the maximum
hot Icc.
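

A short worked example may make the thermal relationships easier to apply. The Python sketch below rearranges the third equation to solve for the maximum ambient temperature, TA = TC − P × (θJA − θJC). The function names are hypothetical, and the thermal resistance values used in the example (θJC = 10 °C/W, θJA = 30 °C/W) are placeholders chosen only for illustration; they are not taken from Table 5-1. The 1.5 W figure is the power dissipation listed under Absolute Maximum Ratings.

```python
# Illustrative sketch; function names are hypothetical and the theta values
# in the example are placeholders, not figures from Table 5-1.

def power_watts(vcc_volts: float, icc_amps: float) -> float:
    """P = Vcc x Icc, using the maximum hot Icc as the text specifies."""
    return vcc_volts * icc_amps


def junction_temp(tc: float, p: float, theta_jc: float) -> float:
    """TJ = TC + P x thetaJC."""
    return tc + p * theta_jc


def max_ambient_temp(tc_max: float, p: float,
                     theta_ja: float, theta_jc: float) -> float:
    """TA = TC - P x (thetaJA - thetaJC), from TC = TA + P x [thetaJA - thetaJC]."""
    return tc_max - p * (theta_ja - theta_jc)


if __name__ == "__main__":
    # TC max = 100 C, P = 1.5 W, placeholder thermal resistances.
    ta = max_ambient_temp(tc_max=100.0, p=1.5, theta_ja=30.0, theta_jc=10.0)
    print(f"Maximum ambient temperature: {ta:.1f} C")
```

Because θJA falls as airflow increases, the same calculation with the actual Table 5-1 values yields a higher permissible TA at higher airflow.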


Table 5-1. Thermal Resistances (°C/Watt) θJC and θJA

Package     | θJC | θJA versus Airflow - ft/min (m/sec)
            |     | 400 (2.03) | 600 (3.04) | 800 (4.06) | 1000 (5.07)
68-Pin PLCC |     |            |            |            |


Table 5-2. Maximum TA at Various Airflows

Package | TA (°C) versus Airflow - ft/min (m/sec)
        | 0 (0) | 200 (1.01) | 400 (2.03) | 600 (3.04) | 800 (4.06) | 1000 (5.07)
        |       |            |            |            |            |

Max. TA calculated at Max TC and Max Icc.


Figure 5-1 shows the locations of pins on the chip package. Table 5-3 helps to locate pin identifiers in
Figure 5-1.


The term “top view” means “as viewed when mounted in a printed-circuit board”.


Figure 5-1. PLCC Pin Configuration (387™ SX Math Coprocessor, top view)


Table 5-3. Pin Cross-Reference

Pin | Signal | Pin | Signal  | Pin | Signal
18  | n.c.   | 35  | ERROR#  | 52  | n.c.
19  | D00    | 36  | BUSY#   | 53  | NUMCLK2
20  | D01    | 37  | Vcc     | 54  | CPUCLK2
21  | Vss    | 38  | Vss     | 55  | Vss
22  | Vcc    | 39  | Vcc     | 56  | PEREQ
23  | D02    | 40  | STEN    | 57  | READYO#
24  | D08    | 41  | W/R#    | 58  | Vcc
25  | Vss    | 42  |         | 59  | CKM
26  | Vcc    | 43  | Vcc     | 60  | Vss
27  | Vss    | 44  | NPS1#   | 61  | Vss
28  | D09    | 45  | NPS2    | 62  | Vcc
29  | D10    | 46  | Vcc     | 63  | Vss
30  | D11    | 47  | ADS#    | 64  | Vcc
31  | Vcc    | 48  | CMD0#   | 65  | n.c.
32  | Vss    | 49  | READY#  | 66  | Vss
33  | Vcc    | 50  | Vcc     | 67  | n.c.
34  | Vss    | 51  | RESETIN | 68  | n.c.

n.c. = The corresponding pins of the 387™ SX NPX are left unconnected.


6.0 ELECTRICAL DATA


6.1 Absolute Maximum Ratings


Case temperature Tc under bias ........ 0°C to 100°C
Storage temperature ................... -65°C to +150°C
Voltage on any pin with respect to ground ... -0.5 to Vcc + 0.5V
Power dissipation ..................... 1.5 Watt


NOTE:
Stresses above those listed may cause permanent
damage to the device. This is a stress rating only
and functional operation of the device at these or
any other conditions above those indicated in the
operational sections of this specification is not
implied. Exposure to absolute maximum rating
conditions for extended periods may affect device
reliability.


6.2 D.C. Characteristics


Table 6-1. D.C. Specifications, Tc = 0°C to 100°C, Vcc = 5V ± 10%

Symbol | Parameter                            | Min | Max | Units | Test Conditions
       | Input LO Voltage                     |     |     |       | See note 1
       | Input HI Voltage                     |     |     |       | See note 1
       | CPUCLK2 and NUMCLK2 Input LO Voltage |     |     |       |
       | CPUCLK2 and NUMCLK2 Input HI Voltage |     |     |       |
       | Output LO Voltage                    |     |     |       | See note 2
       | Output HI Voltage                    |     |     |       | See note 3
       | Output HI Voltage                    |     |     |       | See note 4
       | Power Supply Current:                |     |     |       |
       |   NUMCLK2 = 40 MHz (5)               |     |     |       | Icc typ. = 200 mA
       |   NUMCLK2 = 32 MHz (5)               |     |     |       | Icc typ. = 150 mA
       |   NUMCLK2 = 2 MHz (5)                |     |     |       |
       | Input Leakage Current                |     |     |       | 0V < VIN < Vcc
       | I/O Leakage Current                  |     |     |       | 0.45V < VOUT < Vcc
       | Input Capacitance                    |     |     |       | fc = 1 MHz
       | I/O or Output Capacitance            |     |     |       | fc = 1 MHz
       | Clock Capacitance                    |     |     |       | fc = 1 MHz


NOTES:
1. This parameter is for all inputs, excluding the clock inputs.
2. This parameter is measured at IOL as follows:
   data = 4.0 mA
   READYO#, ERROR#, BUSY#, PEREQ = 2.5 mA
3. This parameter is measured at IOH as follows:
   data = 1.0 mA
   READYO#, ERROR#, BUSY#, PEREQ = 0.6 mA
4. This parameter is measured at IOH as follows:
   data = 0.2 mA
   READYO#, ERROR#, BUSY#, PEREQ = 0.12 mA
5. Icc is measured at steady state, maximum capacitive loading on the outputs, and worst-case D.C. level at the inputs;
   CPUCLK2 at the same frequency as NUMCLK2.


6.3 A.C. Characteristics


Table 6-2a. Combinations of Bus Interface and Execution Speeds

Functional Block         | 80387SX-16 | 80387SX-20
Bus Interface Unit (MHz) | 16         | 20
Execution Unit (MHz)     | 16         | 20


Pin      | Parameter | Test Conditions
NUMCLK2  | Period    | 2.0V
NUMCLK2  | High Time | 2.0V
NUMCLK2  | High Time | Vcc - 0.8V
NUMCLK2  | Low Time  | 2.0V
NUMCLK2  | Low Time  | 0.8V
NUMCLK2  | Fall Time | From Vcc - 0.8 to 0.8V
NUMCLK2  | Rise Time | From 0.8 to Vcc - 0.8V


NOTE:
1. If not used (CKM = 1), tie LOW.


Table 6-2c. Timing Requirements of Bus Interface Unit, Tc = 0°C to 100°C, Vcc = 5V ± 10%

Pin              | Parameter | Values (ns) | Test Conditions
CPUCLK2          | Period    | 500 25 500  | 2.0V
CPUCLK2          | High Time | 7 6         | 2.0V
CPUCLK2          | High Time | 3           | Vcc - 0.8V
CPUCLK2          | Low Time  | 6           | 2.0V
CPUCLK2          | Low Time  | 4           | 0.8V
CPUCLK2          | Fall Time | 8 8         | From Vcc - 0.8 to 0.8V
CPUCLK2          | Rise Time | 8 8         | From 0.8 to Vcc - 0.8V

Ratio            | 16 MHz Min | 16 MHz Max | 20 MHz Min | 20 MHz Max
CPUCLK2/NUMCLK2  | 10/16      | 14/10      | 10/16      | 14/10

Pin                           | Parameter
READYO#                       | t7 Out Delay
READYO#, PEREQ, BUSY#, ERROR# | Out Delay
PEREQ, BUSY#, ERROR#, READYO# | Float Time

Pin         | Parameter  | 16 MHz (1.5V) Min (ns) | 20 MHz (1.5V) Min (ns)
READY#      | Setup Time | 19                     | 12
READY#      | Hold Time  | 4                      | 4
CMD0#       | Setup Time | 21                     | 19
CMD0#       | Hold Time  | 2                      | 2
NPS1#, NPS2 | Setup Time | 21                     | 19
NPS1#, NPS2 | Hold Time  | 2                      | 2
STEN        | Setup Time | 21                     | 21
STEN        | Hold Time  |                        |


Table 6-2c. Timing Requirements of Bus Interface Unit, Tc = 0°C to 100°C, Vcc = 5V ± 10% (Continued)

Pin     | Symbol | Parameter  | Values (ns)
RESETIN | t18    | Setup Time | 6.5
RESETIN | t19    | Hold Time  |

NOTES:
*Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not tested.
**Not tested at 25 pF.


Figure 6-1a. Typical Output Valid Delay vs. Load Capacitance at Max Operating Temperature
(READYO#, PEREQ, BUSY#, ERROR#: typical* output delay in ns at 1.5V versus load capacitance CL in pF)

NOTES:
Graphs are not linear outside the CL range shown.
nom = nominal value given in the AC timing table
*Typical part under worst-case conditions


Figure 6-1b. Typical Output Slew Time vs. Load Capacitance at Max Operating Temperature
(typical* output slew time versus load capacitance CL in pF)

NOTES:
Graphs are not linear outside the CL range shown.
*Typical part under worst-case conditions


Graphs are not linear outside the frequency range shown.


Figure 6-2. CPUCLK2/NUMCLK2 Waveform and Measurement Points for Input/Output


Figure 6-4. Input and I/O Signals
(timing of W/R#, STEN, CMD0#, READY#, D15-D0 input, and D15-D0 output relative to CPUCLK2)


NOTE:
The second internal processor phase following RESET high to low transition is PH2.


Figure 6-5. RESET Signal


Figure 6-6. Float from STEN
(D15-D0, PEREQ, BUSY#, ERROR#, and READYO# relative to STEN)


Table 6-3. Other Parameters

Parameter                                        | Min | Max | Units
Duration                                         |     |     |
RESETIN inactive to 1st opcode write             | 50  |     | NUMCLK2
ERROR# (in)active to BUSY# inactive              | 6   |     | CPUCLK2
Minimum time from operand write to operand write |     |     |


* In NUMCLK2’s
** or last operand


NOTE:
1. Memory read (operand) cycle is not shown.


Figure 6-7. Other Parameters
(1st opcode write, 1st operand write, and 2nd operand write relative to CPUCLK2, BUSY#, and ERROR#)


7.0 387™ SX NPX EXTENSIONS TO THE CPU’S INSTRUCTION SET


Instructions for the 387 SX NPX assume one of the 
five forms shown in Table 7-1. In all cases, instruc- 
tions are at least two bytes long and begin with the 
bit pattern 11011B, which identifies the ESCAPE 
class of instruction. Instructions that refer to memory 
operands specify addresses using the CPU’s ad- 
dressing modes. 


MOD (Mode field) and R/M (Register/Memory spec- 
ifier) have the same interpretation as the corre- 
sponding fields of CPU instructions (refer to Pro- 
grammer’s Reference Manual for the CPU). SIB 


(Scale Index Base) byte and DISP (displacement) 
are optionally present in instructions that have MOD 
and R/M fields. Their presence depends on the val- 
ues of MOD and R/M, as for instructions of the CPU. 


The instruction summaries that follow assume that 
the instruction has been prefetched, decoded, and is 
ready for execution; that bus cycles do not require 
wait states; that there are no local bus HOLD re- 
quests delaying processor access to the bus; and 
that no exceptions are detected during instruction 
execution. If the instruction has MOD and R/M fields
that call for both base and index registers, add one
clock.
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= | | Table 7-1. Instruction Formats | 
Optional. 
| torr | MF | opa | MoD | opps | A/M | SiB | DISP | 
OPA 


[or 
15-11 10 9 8 7 6 5 43 2 1 0 
OP = Instruction opcode, possibly split into two fields OPA and OPB
MF = Memory Format
  00 = 32-bit real
  01 = 32-bit integer
  10 = 64-bit real
  11 = 16-bit integer
d = Destination
  0 = Destination is ST(0)
  1 = Destination is ST(i)
R XOR d = 0: Destination (op) Source
R XOR d = 1: Source (op) Destination
* In FSUB and FDIV, the low-order bit of OPB is the R (reversed) bit
P = POP
  0 = Do not pop stack
  1 = Pop stack after operation
ESC = 11011
ST(i) = Register stack element i
  000 = Stack top
  001 = Second stack element
  ...
  111 = Eighth stack element
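
As an illustration of the first-byte layout described above (ESC pattern, then MF, then the low-order OPA bit), the following Python sketch packs and unpacks that byte for the memory-operand format. The helper names `encode_first_byte` and `decode_first_byte` are hypothetical and are not part of any Intel tool; only the bit positions and MF codes come from the legend.

```python
# MF codes taken from the Table 7-1 legend.
MEMORY_FORMATS = {
    0b00: "32-bit real",
    0b01: "32-bit integer",
    0b10: "64-bit real",
    0b11: "16-bit integer",
}

ESC_PATTERN = 0b11011  # upper five bits of the first instruction byte


def encode_first_byte(mf, opa):
    """Build the first byte of an 'ESC MF OPA' instruction (hypothetical helper)."""
    if mf not in MEMORY_FORMATS or opa not in (0, 1):
        raise ValueError("mf must be 0..3 and opa must be 0 or 1")
    return (ESC_PATTERN << 3) | (mf << 1) | opa


def decode_first_byte(byte0):
    """Split a first byte into its MF and OPA fields (hypothetical helper)."""
    if not 0 <= byte0 <= 0xFF:
        raise ValueError("byte0 must fit in 8 bits")
    if byte0 >> 3 != ESC_PATTERN:
        raise ValueError("not an ESCAPE-class opcode")
    mf = (byte0 >> 1) & 0b11
    return {"mf": mf, "format": MEMORY_FORMATS[mf], "opa": byte0 & 0b1}


# FLD of a 32-bit real memory operand is listed as "ESC MF 1" with MF = 00.
fld_m32real = encode_first_byte(0b00, 1)
```

The second byte (MOD, OPB, R/M) and any SIB/DISP bytes follow the CPU's own addressing rules and are not modeled here.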
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387™ SX NPX Extension to the 386™ SX Microprocessor Instruction Set 


Instruction | Encoding: Byte 0, Byte 1, Optional Bytes 2-6 | Clock Count Range: 32-Bit Real, 32-Bit Integer, 64-Bit Real, 16-Bit Integer


DATA TRANSFER 
FLD = Load (a)


Integer/real memory to ST(0) ESC MF 1 MOD 000 R/M SIB/DISP 
Long integer memory to ST(0) ESC 111 MOD 101 R/M SIB/DISP 
Extended real memory to ST(0) ESC 011 MOD 101 R/M SIB/DISP 


52 


BCD memory to ST(0) ESC 111 MOD 100 R/M SIB/DISP 274-283
ST(i) to ST(0) ESC 001 11000 ST(i) 14 


FST = Store 
ST(0) to integer/real memory ESC MF 1 MOD 010 R/M SIB/DISP 
ST(0) to ST(i) ESC 101 11010 ST(i)


FSTP = Store and Pop 


ST(0) to integer/real memory ESC MF 1 MOD 011 R/M SIB/DISP 
ST(0) to long integer memory ESC 111 MOD 111 R/M SIB/DISP
ST(0) to extended real ESC 011 MOD 111 R/M SIB/DISP 63
ST(0) to BCD memory ESC 111 MOD 110 R/M SIB/DISP 522-544


ST(0) to ST(i) 12

FXCH = Exchange
ST(i) and ST(0)
COMPARISON 
FCOM = Compare
Integer/real memory to ST(0)
ST(i) to ST(0)


FCOMP = Compare and pop
Integer/real memory to ST(0) ESC MF 0 MOD 011 R/M SIB/DISP
ST(i) to ST(0) ESC 000 11011 ST(i)


FCOMPP = Compare and pop twice 


ST(1) to ST(0) ESC 110 1101 1001
FTST = Test ST(0) ESC 001 1110 0100
FXAM = Examine ST(0) ESC 001 1110 0101
CONSTANTS 


FLDZ = Load +0.0 into ST(0) 
FLD1 = Load +1.0 into ST(0) 
FLDPI = Load pi into ST(0) 
FLDL2T = Load log2(10) into ST(0)


Shaded areas indicate instructions not available in 8087/80287. 


NOTE: 
a. When loading single- or double-precision zero from memory, add 5 clocks. 
387™ SX NPX Extension to the 386™ SX Microprocessor Instruction Set (Continued)


Instruction | Encoding: Byte 0, Byte 1, Optional Bytes 2-6 | Clock Count Range: 32-Bit Real, 32-Bit Integer, 64-Bit Real, 16-Bit Integer


CONSTANTS (Continued)
FLDL2E = Load log2(e) into ST(0) ESC 001 1110 1010
FLDLG2 = Load log10(2) into ST(0) ESC 001 1110 1100
FLDLN2 = Load loge(2) into ST(0) ESC 001


ARITHMETIC 


FADD = Add
Integer/real memory with ST(0) ESC MF 0 MOD 000 R/M SIB/DISP 61-76 37-45 71-85
ST(i) and ST(0) ESC d P 0 11000 ST(i) 23-31 (b)


FSUB = Subtract 


Integer/real memory with ST(0) ESC MF 0 MOD 10 R R/M SIB/DISP
ST(i) and ST(0) ESC d P 0 1110 R R/M


FMUL = Multiply 


Integer/real memory with ST(0) ESC MF 0 MOD 001 R/M SIB/DISP


28-36 61-76 36-44 71-83 (c)
26-34 (d)


65-86 40-65 76-87


ST(i) and ST(0) 29-57 (e)

FDIV = Divide 
Integer/real memory with ST(0) 93 124-131 (f) 102 136-140 (g)
ST(i) and ST(0)

FSQRT (i) = Square root ESC 001 1111 1010 122-129


FSCALE = Scale ST(0) by ST(1) ESC 001 1111 1101 
FPREM = Partial remainder ESC 001 1111 1000 


ESC 001 1111 1100 
FXTRACT = Extract components 


of ST(0) ESC 001 1111 0100 
FABS = Absolute value of ST(0) ESC 001 1110 0001 
FCHS = Change sign of ST(0) ESC 001 1110 0000 


Shaded areas indicate instructions not available in 8087/80287. 


NOTES: 

b. Add 3 clocks to the range when d = 1. 

c. Add 1 clock to each range when R = 1. 

d. Add 3 clocks to the range when d = 0.

e. typical = 52 (When d = 0, 46-54, typical = 49). 
f. Add 1 clock to the range when R = 1. 

g. 135-141 when R = 1. 

h. Add 3 clocks to the range when d = 1. 

i. −0 < ST(0) < +∞.


67-86 


74-155 


FRNDINT = Round ST(0) 
to integer 
387™ SX NPX Extension to the 386™ SX Microprocessor Instruction Set (Continued)


Instruction | Encoding: Byte 0, Byte 1, Optional Bytes 2-6 | Clock Count Range


TRANSCENDENTAL 


FPTAN (k) = Partial tangent of ST(0) ESC 001 1111 0010 191-497 (j)
FPATAN = Partial arctangent ESC 001 314-487
F2XM1 (l) = 2^ST(0) − 1 ESC 001 1111 0000 211-476
FYL2X (m) = ST(1) * log2(ST(0)) 120-538
FYL2XP1 (n) = ST(1) * log2(ST(0) + 1.0) 257-547

PROCESSOR CONTROL

FINIT = Initialize NPX 33
FSTSW AX = Store status word 13
FLDCW = Load control word 19
FSTCW = Store control word 15
FSTSW = Store status word 15
FCLEX = Clear exceptions 1
FSTENV = Store environment 103-104
FLDENV = Load environment 71
FSAVE = Save state 475-476
FRSTOR = Restore state 388
FINCSTP = Increment stack pointer 21
FDECSTP = Decrement stack pointer 22
FFREE = Free ST(i) 18
FNOP = No operations 12
Shaded areas indicate instructions not available in 8087/80287. 

NOTES: 


j. These timings hold for operands in the range |x| < π/4. For operands not in this range, up to 76 additional clocks may be
needed to reduce the operand.

k. 0 < |ST(0)| < 2^63.

l. −1.0 < ST(0) < 1.0.

m. 0 < ST(0) < ∞, −∞ < ST(1) < +∞.

n. 0 < |ST(0)| < (2 − SQRT(2))/2, −∞ < ST(1) < +∞.
APPENDIX A


COMPATIBILITY BETWEEN 
THE 80287 AND THE 8087 


The 80286/80287 operating in Real-Address mode
will execute 8086/8087 programs without major
modification. However, because of differences in the
handling of numeric exceptions by the 80287 NPX
and the 8087 NPX, exception-handling routines may
need to be changed.


This appendix summarizes the differences between
the 80287 NPX and the 8087 NPX, and provides
details showing how 8086/8087 programs can be
ported to the 80286/80287.


1. The NPX signals exceptions through a dedicated
   ERROR# line to the 80286. The NPX error signal
   does not pass through an interrupt controller (the
   8087 INT signal does). Therefore, any
   interrupt-controller-oriented instructions in numeric
   exception handlers for the 8086/8087 should be deleted.


2. The 8087 instructions FENI/FNENI and FDISI/FNDISI
   perform no useful function in the 80287. If
   the 80287 encounters one of these opcodes in its
   instruction stream, the instruction will effectively
   be ignored; none of the 80287 internal states will
   be updated. While 8086/8087 code containing these
   instructions may be executed on the 80286/80287,
   it is unlikely that the exception-handling routines
   containing these instructions will be completely
   portable to the 80287.


3. Interrupt vector 16 must point to the numeric
   exception handling routine.


4. The ESC instruction address saved in the 80287
   includes any leading prefixes before the ESC opcode.
   The corresponding address saved in the
   8087 does not include leading prefixes.


5. In Protected-Address mode, the format of the
   80287’s saved instruction and address pointers is
   different than for the 8087. The instruction opcode
   is not saved in Protected mode; exception
   handlers will have to retrieve the opcode from
   memory if needed.


6. Interrupt 7 will occur in the 80286 when executing
   ESC instructions with either the TS (task switched) or
   EM (emulation) bit of the 80286 MSW set (TS = 1 or
   EM = 1). If TS is set, then a WAIT instruction will
   also cause interrupt 7. An exception handler
   should be included in 80286/80287 code to handle
   these situations.


7. Interrupt 9 will occur if the second or subsequent
   words of a floating-point operand fall outside a
   segment’s size. Interrupt 13 will occur if the starting
   address of a numeric operand falls outside a
   segment’s size. An exception handler should be
   included in 80286/80287 code to report these
   programming errors.


8. Except for the processor control instructions, all
   of the 80287 numeric instructions are automatically
   synchronized by the 80286 CPU: the 80286
   automatically tests the BUSY# line from the
   80287 to ensure that the 80287 has completed its
   previous instruction before executing the next
   ESC instruction. No explicit WAIT instructions are
   required to assure this synchronization. For the
   8087 used with 8086 and 8088 processors, explicit
   WAITs are required before each numeric instruction
   to ensure synchronization. Although
   8086/8087 programs having explicit WAIT instructions
   will execute perfectly on the
   80286/80287 without reassembly, these WAIT
   instructions are unnecessary.


9. Since the 80287 does not require WAIT instructions
   before each numeric instruction, the
   ASM286 assembler does not automatically generate
   these WAIT instructions. The ASM86 assembler,
   however, automatically precedes every ESC
   instruction with a WAIT instruction. Although
   numeric routines generated using the ASM86 assembler
   will generally execute correctly on the
   80286/80287, reassembly using ASM286 may result
   in a more compact code image.


   The processor control instructions for the 80287
   may be coded using either a WAIT or No-WAIT
   form of mnemonic. The WAIT forms of these
   instructions cause ASM286 to precede the ESC
   instruction with a CPU WAIT instruction, in the
   identical manner as does ASM86.
DATA SHEET REVISION REVIEW


The following list represents the key differences between
this and the -004 version of the 387 SX Math
CoProcessor Data Sheet. Please review this summary
carefully.

1. Added 20 MHz timing specs. Improved HOLD
   times for ADS#, W/R#, RESETIN.
386™ SX SMART CACHE
82395SX


■ Optimized 386 SX Microprocessor Companion
■ Integrated 8KB Data RAM
■ 4 Way SET Associative with Pseudo LRU Algorithm
■ Write Buffer Architecture
■ Integrated 4 Word Write Buffer
■ 8 Byte Line Size
■ Integrated 387™ Math Coprocessor Decode Logic
■ Concurrent Line Buffer Cacheing
■ Supports i486™ Microprocessor-like Burst
■ Dual Bus Architecture
  - Snooping Maintains Cache Coherency
■ 20 MHz Clock
■ 132 Lead PQFP Package
■ 1K Tag Entries
■ Non-Sectored Architecture


The 386 SX Smart Cache (part number 82395SX) is a low cost, single chip, 16-bit peripheral for Intel’s 386 SX 
Microprocessor. By storing frequently accessed code or data from main memory, the 386 SX Smart Cache
enables the 386 SX Microprocessor to run at near zero wait states. The dual bus architecture allows another 
bus master to access the System Bus while the 386 SX Microprocessor operates out of the 386 SX Smart 
Cache on the Local Bus. The 386 SX Smart Cache has a snooping mechanism which maintains cache 
coherency with main memory during these cycles. 


The 386 SX Smart Cache is completely software transparent, protecting the integrity of system software. The
advanced architectural features of the 386 SX Smart Cache offer high performance with a cache data RAM
size that can be integrated on a single chip, offering the board space and cost savings needed in 386 SX
Microprocessor based systems.


[Block diagram of the 82395SX: Local Bus Interface (CLK2, A23-A1, BHE#, BLE#, M/IO#, D/C#, W/R#, LOCK#, ADS#, READYI#, READYO#, NPI#, RESET, D0-D15, A20M#), Internal Control Bus, Write Buffer, Line Buffer, cache Way 0 through Way 3, and System Bus Interface (SA1-SA23, SBHE#, SBLE#, SM/IO#, SD/C#, SW/R#, SLOCK#, SADS#, SRDY#, SNA#, SKEN#, SWP#, SBREQ, SHOLD, SHLDA, SBRDY#, SBLAST#).]

386™ SX Smart Cache 290396-1
387™ SX, 386™ SX, and i486™ are trademarks of Intel Corporation.
82385SX
HIGH PERFORMANCE CACHE CONTROLLER

■ Improves 386™ SX System Performance
  - Reduces Average CPU Wait States to Nearly Zero
  - Zero Wait State Read Hit
  - Zero Wait State Posted Memory Writes
  - Allows Other Masters to Access the System Bus More Readily
■ Hit Rates up to 99%
■ Optimized as 386 SX Companion
  - Simple 386 SX Interface
  - Part of Intel386™-Based Compute Engine Including 387™ SX Math
    Coprocessor and 82370 Integrated System Peripheral
  - 16 MHz and 20 MHz Operation
■ Software Transparent
■ Synchronous Dual Bus Architecture
  - Bus Watching Maintains Cache Coherency
■ Maps Full 386 SX Address Space
■ Flexible Cache Mapping Policies
  - Direct Mapped or 2-Way Set Associative Cache Organization
  - Supports Non-Cacheable Memory Space
  - Unified Cache for Code and Data
■ Integrates Cache Directory and Cache Management Logic
■ High Speed CHMOS Technology
  - 132-Pin PGA and 132-Lead PQFP


The 82385SX Cache Controller is a high performance peripheral for Intel’s 386™ SX Microprocessor. It stores 
a copy of frequently accessed code and data from main memory in a zero wait state local cache memory. The 
82385SX allows the 386 SX Microprocessor to run near its full potential by reducing the average number of 
CPU wait states to nearly zero. The dual bus architecture of the 82385SX allows other masters to access 
system resources while the 386 SX CPU operates locally out of its cache. In this situation, the 82385SX’s “bus
watching” mechanism preserves cache coherency by monitoring the system bus address lines at no cost to 
system or local throughput. 


The 82385SX is completely software transparent, protecting the integrity of system software. High performance
and board space savings are achieved because the 82385SX integrates a cache directory and all cache
management logic on one chip.


[Block diagram of the 82385SX: processor interface (386™ SX local bus control and 386™ SX local bus decodes), cache directory, cache control (cache control bus), 82385SX local bus interface (82385SX local bus control, bus arbitration, address bus), snoop bus, internal control bus, and 82385SX configuration.]

290222-1
82385SX Internal Block Diagram

Intel386™, 386™, 386™ SX, and 387™ SX are trademarks of Intel Corporation. 
1.0 82385SX FUNCTIONAL
OVERVIEW


The 82385SX Cache Controller is a high performance
peripheral for Intel’s 386™ SX microprocessor.
This chapter provides an overview of the
82385SX, and of the basic architecture and operation
of a 386 SX CPU/82385SX system.


1.1 82385SX Overview 


The main function of a cache memory system is to 
provide fast local storage for frequently accessed 
code and data. The cache system intercepts 386 SX 
memory references to see if the required data resides
in the cache. If the data resides in the cache (a
hit), it is returned to the 386 SX without incurring wait
states. If the data is not cached (a miss), the reference
is forwarded to the system and the data is retrieved
from main memory. An efficient cache will
yield a high “hit rate” (the ratio of cache hits to total 
386 SX accesses), such that the majority of access- 
es are serviced with zero wait states. The net effect 
is that the wait states incurred in a relatively infre- 
quent miss are averaged over a large number of ac- 
cesses, resulting in an average of nearly zero wait 
states per access. Since cache hits are serviced lo- 
cally, a processor operating out of its local cache 
has a much lower “bus utilization” which reduces 
system bus bandwidth requirements, making more 
bandwidth available to other bus masters. 
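
The averaging argument above can be written out numerically. The sketch below assumes, as the text does, that every hit costs zero wait states and that every miss costs a fixed number of wait states; the function name and the example figures (a 95% hit rate and 3 wait states per miss) are illustrative values, not data sheet numbers.

```python
def average_wait_states(hit_rate, miss_wait_states):
    """Average wait states per access, assuming hits cost zero wait states.

    hit_rate is the ratio of cache hits to total accesses (0.0 to 1.0);
    miss_wait_states is the assumed cost of one miss.
    """
    if not 0.0 <= hit_rate <= 1.0:
        raise ValueError("hit_rate must be between 0 and 1")
    if miss_wait_states < 0:
        raise ValueError("miss_wait_states must not be negative")
    return (1.0 - hit_rate) * miss_wait_states


# Illustrative only: a 95% hit rate with 3 wait states per miss
# averages out to roughly 0.15 wait states per access.
example = average_wait_states(0.95, 3)
```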


The 82385SX Cache Controller integrates a cache
directory and all cache management logic required
to support an external 16 kbyte cache. The cache
directory structure is such that the entire physical
address range of the 386 SX is mapped into the
cache. Provision is made to allow areas of memory
to be set aside as non-cacheable. The user has two
cache organization options: direct mapped and 2-way
set associative. Both provide the high hit rates
necessary to make a large, relatively slow main
memory array look like fast, zero wait state memory
to the 386 SX.


A good hit rate is an essential ingredient of a suc- 
cessful cache implementation. Hit rate is the mea- 
sure of how efficient a cache is in maintaining a copy 
of the most frequently requested code and data. 
However, efficiency is not the only factor for per- 
formance consideration. Just as essential are sound 
cache management policies. These policies refer to 
the handling of 386 SX writes, preservation of cache 
coherency, and ease of system design. The 
82385SX’s “posted write” capability allows the majority
of 386 SX writes, including most non-cacheable
cycles, to run with zero wait states, and the
82385SX’s “bus watching” mechanism preserves
cache coherency with no impact on system performance.
Physically, the 82385SX ties directly to the
386 SX with virtually no external logic.


1.2 System Overview I: Bus Structure 


A good grasp of the bus structure of a 386 SX CPU/
82385SX system is essential in understanding both
the 82385SX and its role in a 386 SX system. The
following is a description of this structure.


1.2.1 386™ SX LOCAL BUS/82385SX LOCAL
BUS/SYSTEM BUS


Figure 1-1 depicts the bus structure of a typical 
386 SX system. The ‘‘386 SX Local Bus” consists of 
the physical 386 SX address, data, and control bus- 
ses. The local address and data busses are buffered 
and/or latched to become the “system” address 
and data busses. The local control bus is decoded 
by bus control logic to generate the various system 
bus read and write commands. 


The addition of an 82385SX Cache Controller caus- 
es a separation of the 386 SX bus into two distinct 
busses: the actual 386 SX local bus and the 
“82385SX Local Bus” (Figure 1-2). The 82385SX lo- 
cal bus is designed to look like the front end of a 
386 SX by providing 82385SX local bus equivalents 
to all appropriate 386 SX signals. The system ties to 
this “386 SX-like” front end just as it would to an
actual 386 SX. The 386 SX simply sees a fast system
bus, and the system sees a 386 SX front end
with low bus bandwidth requirements. The cache
subsystem is transparent to both. Note that the
82385SX local bus is not simply a buffered version
of the 386 SX bus, but rather is distinct from, and
able to operate in parallel with, the 386 SX bus. Other
masters residing on either the 82385SX local bus
or system bus are free to manage system resources
while the 386 SX operates out of its cache.


1.2.2 BUS ARBITRATION 


The 82385SX presents the “386 SX-like” interface
which is called the 82385SX local bus. Whereas the
386 SX provides a Hold Request/Hold Acknowledge
bus arbitration mechanism via its HOLD and
HLDA pins, the 82385SX provides an equivalent
mechanism via its BHOLD and BHLDA pins. (These
signals are described in Section 3.7.) When another
master requests the 82385SX local bus, it issues the
request to the 82385SX via BHOLD. Typically, at the
end of the current 82385SX local bus cycle, the
82385SX will release the 82385SX local bus and
acknowledge the request via BHLDA. The 386 SX is of
course free to continue operating on the 386 SX local
bus while another master owns the 82385SX local
bus.
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Figure 1-1. 386™ SX System Bus Structure 
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Figure 1-2. 386™ SX and 82385SX System Bus Structure 
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1.2.3 MASTER/SLAVE OPERATION 


The above 82385SX local bus arbitration discussion
is true when the 82385SX is programmed for “Master”
mode operation. The user can, however, configure
the 82385SX for “Slave” mode operation.
(Programming is done via a hardware strap option.) The
roles of BHOLD and BHLDA are reversed for an
82385SX in slave mode; BHOLD becomes an output
indicating a request to control the bus, and BHLDA
becomes an input indicating that a request has been
granted. An 82385SX programmed in slave mode
drives the 82385SX local bus only when it has requested
and subsequently been granted bus control.
This allows multiple 386 SX CPU/82385SX subsystems
to reside on the same 82385SX local bus
(Figure 1-3).


1.2.4 CACHE COHERENCY 


Ideally, a cache contains a copy of the most heavily 
used portions of main memory. To maintain cache 
“coherency” is to make sure that this local copy is 
identical to main memory. In a system where multi- 
ple masters can access the same memory, there is 
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always a risk that one master will alter the contents 
of a memory location that is duplicated in the local 
cache of another master. (The cache is then said to contain
“stale” data.) One rather restrictive solution is to
not allow cache subsystems to cache shared memo- 
ry. Another simple solution is to flush the cache any- 
time another master writes to system memory. How- 
ever, this can seriously degrade system perform- 
ance as excessive cache flushing will reduce the hit 
rate of what may otherwise be a highly efficient 
cache. 


The 82385SX preserves cache coherency via “bus 
watching” (also called snooping), a technique that 
neither impacts performance nor restricts memory 
mapping. An 82385SX that is not currently bus mas- 
ter monitors system bus cycles, and when a write 
cycle by another master is detected (a snoop), the 
system address is sampled and used to see if the 
referenced location is duplicated in the cache. If so 
(a snoop hit), the corresponding cache entry is inval- 
idated, which will force the 386 SX to fetch the up- 
to-date data from main memory the next time it ac- 
cesses this modified location. Figure 1-4 depicts the 
general form of bus watching. 
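
The snoop sequence described above can be sketched as a small model. This is a deliberately simplified illustration: the class `SnoopedDirectory` and its methods are hypothetical, the directory is a plain mapping from address to a valid flag, and none of the 82385SX's actual tag layout or timing is represented.

```python
class SnoopedDirectory:
    """Toy cache directory: address -> True while the cached copy is valid."""

    def __init__(self):
        self.valid = {}

    def fill(self, address):
        """Record that `address` has been copied into the cache."""
        self.valid[address] = True

    def is_hit(self, address):
        """True if the CPU would find a valid copy of `address`."""
        return self.valid.get(address, False)

    def snoop_write(self, address):
        """Another master wrote `address` on the system bus.

        On a snoop hit the entry is invalidated, so the next CPU access
        to that location misses and fetches fresh data from main memory.
        Returns True on a snoop hit, False otherwise.
        """
        if self.valid.get(address, False):
            self.valid[address] = False
            return True
        return False


directory = SnoopedDirectory()
directory.fill(0x1000)
snoop_hit = directory.snoop_write(0x1000)    # location is cached: invalidated
snoop_miss = directory.snoop_write(0x2000)   # location not cached: no effect
```

Invalidating the single affected entry, rather than flushing the whole cache, is what lets the hit rate for all other locations survive writes by other masters.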
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Figure 1-4. 82385SX Bus Watching—Monitor System Bus Write Cycles 
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1.3 System Overview II: 
Basic Operation 


This discussion is an overview of the basic operation
of a 386 SX CPU/82385SX system. Items discussed
include the 82385SX’s response to all 386 SX cycles,
including interrupt acknowledges, halts, and
shutdowns. Also discussed are non-cacheable and
local accesses.


1.3.1 386™ SX MEMORY CODE AND DATA 
READ CYCLES 


1.3.1.1 Read Hits 


When the 386 SX initiates a memory code or data 
read cycle, the 82385SX compares the high order 
bits of the 386 SX address bus with the appropriate 
addresses (tags) stored in its on-chip directory. (The
directory structure is described in Section 2.1.1.) If
the 82385SX determines that the requested data is
in the cache, it issues the appropriate control signals
that direct the cache to drive the requested data
onto the 386 SX data bus, where it is read by the
386 SX. The 82385SX terminates the 386 SX cycle
without inserting any wait states.


1.3.1.2 Read Misses 


If the 82385SX determines that the requested data 
is not in the cache, the request is forwarded to the 
82385SX local bus and the data retrieved from main 
memory. As the data returns from main memory, it is
directed to the 386 SX and also written into the
cache. Concurrently, the 82385SX updates the
cache directory such that the next time this particular
piece of information is requested by the 386 SX,
the 82385SX will find it in the cache and return it
with zero wait states.


The basic unit of transfer between main memory and 
cache memory in a cache subsystem is called the 
line size. In an 82385SX system, the line size is one 
16-bit word. During a read miss, both 82385SX local
bus byte enables are active. This ensures that the full
16-bit entry is written into the cache. (The 386 SX
simply ignores what it did not request.) In any other
type of 386 SX cycle that is forwarded to the
82385SX local bus, the logic levels of the 386 SX
byte enables are duplicated on the 82385SX local
bus.


The 82385SX does not actively fetch main memory 
data independently of the 386 SX. The 82385SX is
essentially a passive device which only monitors the 
address bus and activates control signals. The read 
miss is the only mechanism by which main memory 
data is copied into the cache and validated in the 
cache directory. 
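
The read hit and read miss behavior can be sketched as a minimal direct-mapped model. It assumes a 16 Kbyte cache of 8K one-word (16-bit) lines and, as a simplification, one tag per line; the real 82385SX directory structure (Section 2.1.1) is not reproduced. The class name, the dictionary used for main memory, and the address split are all assumptions made for illustration.

```python
CACHE_WORDS = 8 * 1024  # 8K lines of one 16-bit word each (16 Kbytes)


class DirectMappedCache:
    """Simplified direct-mapped cache filled only by read misses."""

    def __init__(self, memory):
        self.memory = memory                # word address -> 16-bit value
        self.tags = [None] * CACHE_WORDS    # None means "entry not valid"
        self.data = [0] * CACHE_WORDS

    def read(self, byte_address):
        """Return (value, 'hit' or 'miss') for a word-aligned read."""
        word = byte_address >> 1            # 16-bit word address
        index = word % CACHE_WORDS          # which cache line
        tag = word // CACHE_WORDS           # which cache-sized page
        if self.tags[index] == tag:
            return self.data[index], "hit"
        # Read miss: fetch the whole 16-bit word from main memory, hand it
        # to the CPU, write it into the cache, and validate the directory.
        value = self.memory.get(word, 0)
        self.data[index] = value
        self.tags[index] = tag
        return value, "miss"


main_memory = {0x10: 0xBEEF}
cache = DirectMappedCache(main_memory)
first = cache.read(0x20)     # first access to the word misses
second = cache.read(0x20)    # repeated access hits
```

Because two addresses exactly one cache size apart share an index, reading the second one evicts the first; that conflict behavior is what the two-way set associative option is meant to soften.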
In an isolated read miss, the number of wait states
seen by the 386 SX is that required by the system
memory to respond with data plus the cache comparison
cycle (hit/miss decision). The cache system
must determine that the cycle is a miss before it can
begin the system memory access. However, since
misses most often occur consecutively, the
82385SX will begin 386 SX address pipelined cycles
to effectively “hide” the comparison cycle beyond
the first miss (refer to Section 4.1.3).


The 82385SX can execute a memory access on the
82385SX local bus only if it currently owns the bus. If
not, an 82385SX in master mode will run the cycle
after the current master releases the bus. An
82385SX in slave mode will issue a hold request,
and will run the cycle as soon as the request is
acknowledged. (This is true for any read or write cycle
that needs to run on the 82385SX local bus.)


1.3.2 386™ SX MEMORY WRITE CYCLES 


The 82385SX’s “posted write” capability allows the
majority of 386 SX memory write cycles to run with
zero wait states. The primary memory update policy
implemented in a posted write is the traditional
cache “write through” technique, which implies that
main memory is always updated in any memory write
cycle. If the referenced location also happens to reside
in the cache (a write hit), the cache is updated
as well.


Beyond this, a posted write latches the 386 SX address,
data, and cycle definition signals, and the 386
SX local bus cycle is terminated without any wait states,
even though the corresponding 82385SX local bus
cycle is not yet completed, or perhaps not even
started. A posted write is possible because the
82385SX’s bus state machine, which is almost identical
to the 386 SX bus state machine, is able to run
82385SX local bus cycles independently of the
386 SX. The only time the 386 SX sees write cycle
wait states is when a previously latched (posted)
write has not yet been completed on the 82385SX
local bus or during an I/O write (which is not posted).
A 386 SX write can be posted even if the
82385SX does not currently own the 82385SX local
bus. In this case, an 82385SX in master mode will
run the cycle as soon as the current master releases
the bus, and an 82385SX in slave mode will request
the bus and run the cycle when the request is
acknowledged. The 386 SX is free to continue operating
out of its cache (on the 386 SX local bus) during
this time.
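
The effect of posting can be illustrated with a one-entry write latch. This is a sketch under stated assumptions: time is counted in abstract units rather than CLK2 periods, every 82385SX local bus write is assumed to take the same fixed time, and the class `PostedWriteLatch` is hypothetical.

```python
class PostedWriteLatch:
    """One-deep posted-write latch measured in abstract time units."""

    def __init__(self, bus_cycle_time):
        self.bus_cycle_time = bus_cycle_time  # assumed cost of one bus write
        self.busy_until = 0                   # when the pending write finishes

    def write(self, now):
        """Post a memory write at time `now`; return the wait seen by the CPU.

        The CPU waits only if the previously posted write has not yet
        completed on the 82385SX local bus.
        """
        wait = max(0, self.busy_until - now)
        start = now + wait
        self.busy_until = start + self.bus_cycle_time
        return wait


latch = PostedWriteLatch(bus_cycle_time=4)
w1 = latch.write(now=0)    # latch empty: posted with no wait
w2 = latch.write(now=2)    # previous write still running: CPU waits
w3 = latch.write(now=20)   # bus long since idle: no wait again
```

I/O writes are not posted, so they would bypass this latch and stall the CPU until the system returns ready.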


1.3.3 NON-CACHEABLE CYCLES


Non-cacheable cycles fall into one of two categories:
cycles decoded as non-cacheable, and cycles
that are by default non-cacheable according to the
82385SX’s design. All non-cacheable cycles are forwarded
to the 82385SX local bus. Non-cacheable
cycles have no effect on the cache or cache directory.


The 82385SX allows the system designer to define 
areas of main memory as non-cacheable. The 
386 SX address bus is decoded and the decode out- 
put is connected to the 82385SX’s non-cacheable 
access (NCA#) input. This decoding is done in the 
first 386 SX bus state in which the non-cacheable 
cycle address becomes available. Non-cacheable 
read cycles resemble cacheable read miss cycles, 
except that the cache and cache directory are unaf- 
fected. NCA# defined non-cacheable writes, like 
most writes, are posted. 


The 82385SX defines certain cycles as non-cache- 
able without using its non-cacheable access input. 
These include I/O cycles, interrupt acknowledge cy- 
cles, and halt/shutdown cycles. I/O reads and inter- 
rupt acknowledge cycles execute as any other non-
cacheable read. I/O write cycles are not posted. The
386 SX is not allowed to continue until a ready signal 
is returned from the system. Halt/Shutdown cycles 
are posted. During a halt/shutdown condition, the 
82385SX local bus duplicates the behavior of the 
386 SX, including the ability to recognize and re- 
spond to a BHOLD request. (The 82385SX’s bus 
watching mechanism is functional in this condition.) 


1.3.4 386™ SX LOCAL BUS CYCLES 


386 SX Local Bus Cycles are accesses to resources 
on the 386 SX local bus other than to the 82385SX 
itself. The 82385SX simply ignores these accesses: 
they are neither forwarded to the system nor do they 
affect the cache. The designer sets aside memory 
and/or I/O space for local resources by decoding 
the 386 SX address bus and feeding the decode to 
the 82385SX’s local bus access (LBA#) input. The 
designer can also decode the 386 SX cycle defini- 
tion signals to keep specific 386 SX cycles from be- 
ing forwarded to the system. For example, a multi- 
processor design may wish to capture and remedy a 
386 SX shutdown locally without having it detected 
by the rest of the system. Note that in such a design, 
the local shutdown cycle must be terminated by local
bus control logic. The 387 SX Math Coprocessor
is considered a 386 SX local bus resource, but it
need not be decoded as such by the user since the
82385SX is able to internally recognize 387 SX accesses
via the M/IO# and A23 pins.




1.3.5 SUMMARY OF 82385SX RESPONSE TO 
ALL 386™ SX CYCLES 


Table 1-1 summarizes the 82385SX response to all 
386 SX bus cycles, as conditioned by whether or not 
the cycle is decoded as local or non-cacheable. The 
table describes the impact of each cycle on the 
cache and on the cache directory, and whether or 
not the cycle is forwarded to the 82385SX local bus. 
Whenever the 82385SX local bus is marked “IDLE”, 
it implies that this bus is available to other masters. 
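
The classification rules from Sections 1.3.2 through 1.3.4 can be summarized in a short function. It is a sketch only: the cycle names, the function `classify_cycle`, and the choice to let a local bus decode take precedence over a non-cacheable decode are assumptions made for illustration and are not taken from Table 1-1.

```python
# Cycles that the text says are non-cacheable by design.
DEFAULT_NON_CACHEABLE = {"io_read", "io_write", "int_ack", "halt_shutdown"}
KNOWN_CYCLES = DEFAULT_NON_CACHEABLE | {"mem_read", "mem_write"}


def classify_cycle(cycle, lba=False, nca=False, is_387_access=False):
    """Return (kind, posted) for one 386 SX bus cycle.

    kind is 'local', 'non-cacheable', or 'cacheable'; posted tells whether
    the cycle is latched and terminated early on the 386 SX local bus.
    lba / nca model the LBA# and NCA# decode inputs being active.
    """
    if cycle not in KNOWN_CYCLES:
        raise ValueError("unknown cycle type: " + cycle)
    if lba or is_387_access:
        # Local accesses are ignored: not forwarded, cache unaffected.
        return ("local", False)
    # Memory writes and halt/shutdown cycles are posted; I/O writes are not.
    posted = cycle in ("mem_write", "halt_shutdown")
    if nca or cycle in DEFAULT_NON_CACHEABLE:
        return ("non-cacheable", posted)
    return ("cacheable", posted)


io_write = classify_cycle("io_write")
nca_write = classify_cycle("mem_write", nca=True)
local_read = classify_cycle("mem_read", lba=True)
```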


1.3.6 BUS WATCHING 


As previously discussed, the 82385SX “qualifies” a 
386 SX bus cycle in the first bus state in which the 
address and cycle definition signals of the cycle be- 
come available. The cycle is qualified as read or 
write, cacheable or non-cacheable, etc. Cacheable 
cycles are further classified as hit or miss according 
to the results of the cache comparison, which ac- 
cesses the 82385SX directory and compares the ap- 
propriate directory location (tag) to the current 
386 SX address. If the cycle turns out to be non- 
cacheable or a 386 SX local bus access, the hit/ 
miss decision is ignored. The cycle qualification re- 
quires one 386 SX state. Since the fastest 386 SX 
access is two states, the second state can be used 
for bus watching. 


When the 82385SX does not own the system bus, it 
monitors system bus cycles. If another master writes 
into main memory, the 82385SX latches the system 
address and executes a cache look-up to see if the 
altered main memory location resides in the cache. 
If so (a snoop hit), the cache entry is marked invalid 
in the cache directory. Since the directory is at most 
only being used every other state to qualify 386 SX 
accesses, snoop look-ups are interleaved between 
386 SX local bus look-ups. The cache directory is 
time multiplexed between the 386 SX address and 
the latched system address. The result is that all 
snoops are caught and serviced without slowing 
down the 386 SX, even when running zero wait state
hits on the 386 SX local bus.


1.3.7 CACHE FLUSH 


The 82385SX offers a cache flush input. When activated,
this signal causes the 82385SX to invalidate
all data which had previously been cached. Specifically,
all tag valid bits are cleared. (Refer to the
82385SX directory structure in Section 2.1.1.) There-
Table 1-1. 82385SX Response to 386™ SX Cycles

[Table: for each 386 SX bus cycle (I/O READ, I/O WRITE, MEM CODE READ, MEM DATA READ, memory write, HALT/SHUTDOWN, UNDEFINED) and for cache hit or miss, the effect on the Cache, the Cache Directory, and the 82385SX Local Bus when the cycle is decoded as Cacheable, as Non-Cacheable, or as a 386 SX Local Bus Access. Legible cacheable entries: a read HIT is a CACHE READ; a read MISS is a CACHE WRITE with DATA VALIDATION in the directory; a write HIT is a CACHE WRITE.]


NOTES: 

e A dash (—) indicates that the cache and cache directory are unaffected. This table does not reflect how an access affects the LRU bit. 
e An “IDLE” 82385SX Local Bus implies that this bus is available to other masters. 

e The 82385SX’s response to 387™ SX accesses is the same as when decoded as a 386 SX Local Bus Access. 

® The only other operations that affect the cache directory are: | 

1. RESET or Cache Flush—all tag valid bits cleared. 

2. Snoop Hit—corresponding line. valid bit cleared. 
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fore, the cache is empty and subsequent cycles are 
misses until the 386 SX begins repeating the new 
accesses (hits). The primary use of the FLUSH input 
is for diagnostics and multi-processor support. 


NOTE: 
The use of this pin as a coherency mechanism may 
impact software transparency. 


2.0 82385SX CACHE ORGANIZATION 


The 82385SX supports two cache organizations: a 
simple direct mapped organization and a slightly 
more complex, higher performance two way set as- 
sociative organization. The choice is made by strap- 
ping an 82385SX input (2W/D#) either high or low. 
This chapter describes the structure and operation 
of both organizations. 


2.1 Direct Mapped Cache 


2.1.1 DIRECT MAPPED CACHE STRUCTURE 
AND TERMINOLOGY 


Figure 2-1 depicts the relationship between the 
82385SX’s internal cache directory, the external 
cache memory, and the 386 SX’s physical address 
space. The 386 SX address space can conceptually 
be thought of as cache “pages”, each being 8K
words (16 Kbytes) deep. The page size matches the
cache size. The cache can be further divided into
1024 (0 through 1023) sets of eight words (8 x 16 bits).
Each 16-bit word is called a “line”. The unit of trans-
fer between the main memory and cache is one line.


Each set (block of eight lines) in the external cache
has an associated 19-bit entry in the 82385SX’s internal
cache directory. This entry has three components: a
10-bit “tag”, a “tag valid” bit, and eight “line valid”
bits. The tag acts as a main memory page number (10 tag
bits support 2^10 = 1024 pages). For example, if line 9 of page 2
currently resides in the cache, then a binary 2 is 
stored in the Set 1 tag field. (For any 82385SX direct 
mapped cache page in main memory, Set 0 consists 
of lines 0-7, Set 1 consists of lines 8-15, etc. Line 9 
is shaded in Figure 2-1.) An important characteristic 
of a direct mapped cache is that line 9 of any page 
can only reside in line 9 of the cache. All identical 
page offsets map to a single cache location. 


The data in a cache set is considered valid or invalid 
depending on the status of its tag valid bit. If the
bit is clear, the entire set is considered invalid. If it
is set, an individual line within the set is considered
valid or invalid depending on the status of its line
valid bit.


Figure 2-1. Direct Mapped Cache Organization

Figure 2-2. 386™ SX Address Bus Bit Fields - Direct Mapped Organization


The 82385SX sees the 386 SX address bus (A1-A23)
as partitioned into three fields: a 10-bit “tag”
field (A14-A23), a 10-bit “set address” field
(A4-A13), and a 3-bit “line select” field (A1-A3).
(See Figure 2-2.) The lower 13 address bits (A1-A13)
also serve as the “cache address” which directly
selects one of 8K words in the external cache.
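
The field partitioning described above can be illustrated with a short sketch. It assumes a 24-bit byte address in which bit 0 is left to the byte enables; the function name and dictionary keys are illustrative only and do not correspond to any 82385SX signal names.

```python
def split_direct_mapped(addr):
    """Split a 24-bit 386 SX byte address into the direct mapped fields.

    Bit 0 is ignored because the 386 SX drives A1-A23 and selects
    bytes with BHE#/BLE# (an assumption of this sketch).
    """
    return {
        "tag": (addr >> 14) & 0x3FF,            # A14-A23: page number
        "set": (addr >> 4) & 0x3FF,             # A4-A13: one of 1024 sets
        "line": (addr >> 1) & 0x7,              # A1-A3: one of 8 lines in the set
        "cache_address": (addr >> 1) & 0x1FFF,  # A1-A13: one of 8K cache words
    }


# Line 9 of page 2: pages are 16 Kbytes deep and each line is one 16-bit word.
fields = split_direct_mapped(2 * 16 * 1024 + 9 * 2)
print(fields)
```

For the "line 9 of page 2" example used in the text, the tag field holds the page number and the set address selects Set 1, which is the set containing lines 8-15.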


2.1.2 DIRECT MAPPED CACHE OPERATION 


The following is a description of the interaction be- 
tween the 386 SX, cache, and cache directory. 


2.1.2.1 Read Hits


When the 386 SX initiates a memory read cycle, the 
82385SX uses the 10-bit set address to select one 
of 1024 directory entries, and the 3-bit line select 
field to select one of eight line valid bits within the 
entry. The 13-bit cache address selects the corre- 
sponding word in the cache. The 82385SX com- 
pares the 10-bit tag field (A14-A23 of the 386 SX 
access) with the tag stored in the selected directory 
entry. If the tag and upper address bits match, and if 
both the tag and appropriate line valid bits are set, 
the result is a hit, and the 82385SX directs the 
cache to drive the selected word onto the 386 SX
data bus. A read hit does not alter the contents of
the cache or directory. 


2.1.2.2 Read Misses 


A read miss can occur in two ways. The first is 
known as a “line” miss, and occurs when the tag 
and upper address bits match and the tag valid bit is
set, but the line valid bit is clear. The second is 
called a “tag” miss, and occurs when either the tag 
and upper address bits do not match, or the tag valid 
bit is clear. (The line valid bit is a “don’t care” in a
tag miss.) In both cases, the 82385SX forwards the 
386 SX reference to the system, and as the return- 
ing data is fed to the 386 SX, it is written into the 
cache and validated in the cache directory. 


In a line miss, the incoming data is validated simply
by setting the previously clear line valid bit. In a tag
miss, the upper address bits overwrite the previously
stored tag, the tag valid bit is set, the appropriate
line valid bit is set, and the other seven line valid bits
are cleared. Subsequent tag hits with line misses will
only set the appropriate line valid bit. (Any data as-
sociated with the previous tag is no longer consid-
ered resident in the cache.)


2.1.2.3 Other Operations That Affect the Cache
and Cache Directory


The other operations that affect the cache and/or
directory are write hits, snoop hits, cache flushes,
and 82385SX resets. In a write hit, the cache is up-
dated along with main memory, but the directory is
unaffected. In a snoop hit, the cache is unaffected,
but the affected line is invalidated by clearing its line
valid bit in the directory. Both an 82385SX reset and
cache flush clear all tag valid bits.


When a 386 SX CPU/82385SX system “wakes up”
upon reset, all tag valid bits are clear. At this point, a
read miss is the only mechanism by which main
memory data is copied into the cache and validated
in the cache directory. Assume an early 386 SX
code access seeks (for the first time) line 9 of page
2. Since the tag valid bit is clear, the access is a tag
miss, and the data is fetched from main memory.
Upon return, the data is fed to the 386 SX and simul-
taneously written into line 9 of the cache. The set
directory entry is updated to show this line as valid.
Specifically, the tag and appropriate line valid bits
are set, the remaining seven line valid bits cleared,
and binary 2 written into the tag. Since code is se-
quential in nature, the 386 SX will likely next want
line 10 of page 2, then line 11, and so on. If the
386 SX sequentially fetches the next six lines, these
fetches will be line misses, and as each is fetched
from main memory and written into the cache, its
corresponding line valid bit is set. This is the basic
flow of events that fills the cache with valid data.
Only after a piece of data has been copied into the
cache and validated can it be accessed in a zero
wait state read hit. Also, a cache entry must have
been validated before it can be subsequently altered
by a write hit, or invalidated by a snoop hit.
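
The fill sequence just described can be traced with a small model of the directory. This is a minimal sketch, not a description of the 82385SX internals: the class and method names are hypothetical, addresses are given as (page, line) pairs as in the text, and only the directory state (tag, tag valid bit, line valid bits) is modeled.

```python
class DirectMappedDirectory:
    """Toy model of the 82385SX direct mapped cache directory (1024 entries)."""

    def __init__(self):
        self.tag = [0] * 1024
        self.tag_valid = [False] * 1024
        self.line_valid = [[False] * 8 for _ in range(1024)]

    def flush(self):
        # RESET or cache flush: every tag valid bit is cleared.
        self.tag_valid = [False] * 1024

    @staticmethod
    def fields(page, line):
        # Eight consecutive lines (words) of a page form one set.
        return page, line // 8, line % 8

    def read(self, page, line):
        tag, s, ls = self.fields(page, line)
        if self.tag_valid[s] and self.tag[s] == tag:
            if self.line_valid[s][ls]:
                return "hit"                      # directory unchanged
            self.line_valid[s][ls] = True         # line miss: set one bit
            return "line miss"
        # Tag miss: new tag, tag valid set, only the fetched line valid.
        self.tag[s] = tag
        self.tag_valid[s] = True
        self.line_valid[s] = [i == ls for i in range(8)]
        return "tag miss"

    def snoop(self, page, line):
        tag, s, ls = self.fields(page, line)
        if self.tag_valid[s] and self.tag[s] == tag and self.line_valid[s][ls]:
            self.line_valid[s][ls] = False        # snoop hit: invalidate line
            return True
        return False


d = DirectMappedDirectory()
results = [d.read(2, n) for n in range(9, 16)]    # lines 9..15 of page 2
print(results)
print(d.read(2, 9))
```

The first access is a tag miss, the next six are line misses, and only a repeated access to an already validated line is a hit.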
An extreme example of “thrashing” occurs when line 9
of page two is an instruction to jump to line 9 of page
one, which is in turn an instruction to jump back to
line 9 of page two. Thrashing results from the direct
mapped cache characteristic that all identical page
offsets map to a single cache location. In this example,
the page one access overwrites the cached page two
data, and the page two access overwrites the cached
page one data. As long as the code jumps back and
forth, the hit rate is zero. This is of course an extreme
case. The effect of thrashing is that a direct mapped
cache exhibits a slightly reduced overall hit rate as
compared to a set associative cache of the same
size.
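
The thrashing case can be reproduced with a deliberately simplified sketch. It tracks only which pages are resident at a given page offset, ignores line valid bits, and keeps the page numbering fixed when the number of banks changes (in the real two way organization the page size is halved). The function name and the `ways` parameter are assumptions made for illustration.

```python
def hit_rate(accesses, ways):
    """Hit rate for (page, line) reads in a toy cache with `ways` banks.

    Each line offset may hold up to `ways` pages; replacement is least
    recently used. Line valid bits are ignored for brevity.
    """
    resident = {}   # line offset -> list of pages, least recently used first
    hits = 0
    for page, line in accesses:
        pages = resident.setdefault(line, [])
        if page in pages:
            hits += 1
            pages.remove(page)
        elif len(pages) == ways:
            pages.pop(0)          # evict the least recently used page
        pages.append(page)        # now the most recently used
    return hits / len(accesses)


jumps = [(2, 9), (1, 9)] * 50     # code jumping between two pages
print(hit_rate(jumps, ways=1))    # direct mapped
print(hit_rate(jumps, ways=2))    # two banks
```

With one bank every access evicts the other page, so no access hits; with two banks only the first access to each page misses.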


2.2 Two Way Set Associative Cache 


2.2.1 TWO WAY SET ASSOCIATIVE CACHE 
STRUCTURE AND TERMINOLOGY 


Figure 2-3 illustrates the relationship between the 
directory, cache, and 386 SX address space. Where- 
as the direct mapped cache is organized as one 
bank of 8K words, the two way set associative 
cache is organized as two banks (A and B) of 4K
words each. The page size is halved, and the num- 
ber of pages doubled. (Note the extra tag bit.) The 
cache now has 512 sets in each bank. (Two banks 
times 512 sets gives a total of 1024. The structure 
can be thought of as two half-sized direct mapped 
caches in parallel.) The performance advantage 
over a direct mapped cache is that all identical page 
offsets map to two cache locations instead of one, 
reducing the potential for thrashing. The 82385SX’s 
partitioning of the 386 SX address bus is depicted in 
Figure 2-4.


2.2.2 LRU REPLACEMENT ALGORITHM 


The two way set associative directory has an addi-
tional feature: the “least recently used” or LRU bit.
In the event of a read miss, either bank A or bank B
will be updated with new data. The LRU bit flags the 
candidate for replacement. Statistically, of two 
blocks of data, the block most recently used is the 
block most likely to be needed again in the near 
future. By flagging the least recently used block, the 
82385SX ensures that the cache block replaced is 
the least likely to have data needed by the CPU. 


2.2.3 TWO WAY SET ASSOCIATIVE CACHE 
OPERATION 


2.2.3.1 Read Hits 


When the 386 SX initiates a memory read cycle, the
82385SX uses the 9-bit set address to select one of
512 sets. The two tags of this set are simultaneously
compared with A13-A23, both tag valid bits
checked, and both appropriate line valid bits 
checked. If either comparison produces a hit, the 
corresponding cache bank is directed to drive the 
selected word onto the 386 SX data bus. (Note that 
both banks will never concurrently cache the same 
main memory location.) If the requested data resides 
in bank A, the LRU bit is pointed toward B. If B pro- 
duces the hit, the LRU bit is pointed toward A. 


2.2.3.2 Read Misses 


As in direct mapped operation, a read miss can be
either a line or tag miss. Let’s start with a tag miss
example. Assume the 386 SX seeks line 9 of page 2,
and that neither the A nor the B directory produces a
tag match. Assume also, as indicated in Figure 2-3, that
the LRU bit points to A. As the data returns from
main memory, it is loaded into offset 9 of bank A.
Concurrently, this data is validated by updating the
set 1 directory entry for bank A. Specifically, the upper
address bits overwrite the previous tag, the tag
valid bit is set, the appropriate line valid bit is set,
and the other seven line valid bits are cleared. Since this
data is the most recently used, the LRU bit is turned
toward B. No change to bank B occurs.


If the next 386 SX request is line 10 of page two, the 
result will be a line miss. As the data returns from 
main memory, it will be written into offset 10 of bank
A (tag hit/line miss in bank A), and the appropriate 
line valid bit will be set. A line miss in one bank will 
cause the LRU bit to point to the other bank. In this 
example, however, the LRU bit has already been 
turned toward B. 
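
The read hit and read miss behavior of the two way organization, including the LRU bit, can be sketched as follows. This is an illustrative model only: the class name and data layout are hypothetical, accesses are given as (page, line) pairs with 4K-line pages, and the initial LRU value (pointing to bank A, as in the example above) is an assumption, since the text does not state the LRU state after reset.

```python
class TwoWayDirectory:
    """Toy model of the 82385SX two way set associative directory (512 sets)."""

    def __init__(self):
        # Per bank: tag, tag valid bit and eight line valid bits for each set.
        self.banks = {
            b: [{"tag": 0, "tag_valid": False, "line_valid": [False] * 8}
                for _ in range(512)]
            for b in "AB"
        }
        self.lru = ["A"] * 512    # bank flagged for replacement (assumed initial value)

    def read(self, page, line):
        s, ls = line // 8, line % 8           # eight lines per set
        for b, other in (("A", "B"), ("B", "A")):
            e = self.banks[b][s]
            if e["tag_valid"] and e["tag"] == page:
                result = "hit" if e["line_valid"][ls] else "line miss"
                e["line_valid"][ls] = True
                self.lru[s] = other           # this bank was just used
                return result, b
        b = self.lru[s]                       # tag miss: replace the LRU bank
        self.banks[b][s] = {"tag": page, "tag_valid": True,
                            "line_valid": [i == ls for i in range(8)]}
        self.lru[s] = "B" if b == "A" else "A"
        return "tag miss", b


w = TwoWayDirectory()
print(w.read(2, 9))    # tag miss, loaded into bank A
print(w.read(2, 10))   # line miss in bank A
print(w.read(5, 9))    # a second page with the same offset goes to bank B
print(w.read(2, 9))    # page 2 is still resident in bank A
```

Because identical page offsets map to two locations, the access to page 5 does not displace the page 2 data, which is the case that thrashes in the direct mapped organization.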


2.2.3.3 Other Operations That Affect the Cache 
and Cache Directory 


Other operations that affect the cache and cache 
directory are write hits, snoop hits, cache flushes, 
and 82385SX resets. A write hit updates the cache 
along with main memory. If directory A detects the 
hit, bank A is updated. If directory B detects the hit, 
bank B is updated. If one bank is updated, the LRU
bit is pointed towards the other. 


If a snoop hit invalidates an entry, for example, in 
cache bank A, the corresponding LRU bit is pointed 
toward A. This ensures that invalid data is the prime
candidate for replacement in a read miss. Finally, 
resets and flushes behave just as they do in a direct 
mapped cache, clearing all tag valid bits. 


Figure 2-4. 386™ SX Address Bus Bit Fields - Two-Way Set Associative Organization


3.0 82385SX PIN DESCRIPTION


The 82385SX creates the 82385SX local bus, which
is a functional 386 SX interface. To facilitate under-
standing, 82385SX local bus signals go by the same
name as their 386 SX equivalents, except that they
are preceded by the letter “B”. The 82385SX local
bus equivalent to ADS# is BADS#, the equivalent
to NA# is BNA#, etc. This convention applies to
bus states as well. For example, BT1P is the
82385SX local bus state analogous to the 386 SX
T1P state.


3.1 386™ SX CPU/82385SX Interface 
Signals 


These signals form the direct interface between the 
386 SX and the 82385SX. 

3.1.1 386™ SX CPU/82385SX CLOCK (CLK2)


CLK2 provides the fundamental timing for a 386 SX
CPU/82385SX system, and is driven by the same
source that drives the 386 SX CLK2 input. The
82385SX, like the 386 SX, divides CLK2 by two to
generate an internal “phase indication” clock. (See
Figure 3-1.) The CLK2 period whose rising edge
drives the internal clock low is called PHI1, and the
CLK2 period that drives the internal clock high is
called PHI2. A PHI1-PHI2 combination (in that or-
der) is known as a “T” state, and is the basis for
386 SX bus cycles.


3.1.2 386™ SX CPU/82385SX RESET (RESET) 


This input resets the 82385SX, bringing it to an initial
known state, and is driven by the same source that
drives the 386 SX RESET input. A reset effectively
flushes the cache by clearing all cache directory tag
valid bits. The falling edge of RESET is synchronized
to CLK2, and used by the 82385SX to properly es-
tablish the phase of its internal clock. (See Figure
3-2.) Specifically, the second internal phase follow-
ing the falling edge of RESET is PHI2.


Figure 3-1. CLK2 and Internal Clock

Figure 3-2. RESET/Internal Phase Relationship


3.1.3 386™ SX CPU/82385SX ADDRESS BUS
(A1-A23), BYTE ENABLES (BHE#, BLE#),
AND CYCLE DEFINITION SIGNALS
(M/IO#, D/C#, W/R#, LOCK#)


The 82385SX directly connects to these 386 SX out-
puts. The 386 SX address bus is used in the cache
directory comparison to see if data referenced by
the 386 SX resides in the cache, and the byte enables
inform the 82385SX as to which portions of the data
bus are involved in a 386 SX cycle. The cycle defini-
tion signals are decoded by the 82385SX to deter-
mine the type of cycle the 386 SX is executing.


3.1.4 386™ SX CPU/82385SX ADDRESS
STATUS (ADS#) AND READY INPUT
(READYI#)


ADS#, a 386 SX output, tells the 82385SX that new
address and cycle definition information is available.
READYI#, an input to both the 386 SX (via the
386 SX READY# input pin) and 82385SX, indicates
the completion of a 386 SX bus cycle. ADS# and
READYI# are used to track the 386 SX bus state.


3.1.5 386™ SX NEXT ADDRESS REQUEST
(NA#)


This 82385SX output controls 386 SX pipelining. It
can be tied directly to the 386 SX NA# input, or it
can be logically “AND”ed with other 386 SX local
bus next address requests.


3.1.6 READY OUTPUT (READYO#) AND BUS 
READY ENABLE (BRDYEN#) 


The 82385SX directly terminates all but two types of
386 SX bus cycles with its READYO# output.
386 SX local bus cycles must be terminated by the
local device being accessed. This includes devices
decoded using the 82385SX LBA# signal and 387 SX
accesses. The other cycles not directly terminated
by the 82385SX are 82385SX local bus reads, spe-
cifically cache read misses and non-cacheable
reads. (Recall that the 82385SX forwards and runs
such cycles on the 82385SX bus.) In these cycles
the signal that terminates the 82385SX local bus ac-
cess is BREADY#, which is gated through to the
386 SX local bus such that the 386 SX and 82385SX
local bus cycles are concurrently terminated.
BRDYEN# is used to gate the BREADY# signal to
the 386 SX.


3.2 Cache Control Signals


These 82385SX outputs control the external 16 KB 
cache data memory. 


3.2.1 CACHE ADDRESS LATCH ENABLE 
(CALEN) 


This signal controls the latch (typically an F or AS
series 74373) that resides between the low order
386 SX address bits and the cache SRAM address
inputs. (The outputs of this latch are the “cache ad-
dress” described in the previous chapter.) When
CALEN is high the latch is transparent. The falling
edge of CALEN latches the current inputs which re-
main applied to the cache data memory until CALEN
returns to an active high state.


3.2.2 CACHE TRANSMIT/RECEIVE (CT/R#) 


This signal defines the direction of an optional data 
transceiver (typically an F or AS series 74245) be- 
tween the cache and 386 SX data bus. When high, 
the transceiver is pointed towards the 386 SX local 
data bus (the SRAMs are output enabled). When 
low, the transceiver points towards the cache data 
memory. A transceiver is required if the cache is de- 
signed with SRAMs that lack an output enable con- 
trol. A transceiver may also be desirable in a system 
that has a heavily loaded 386 SX local data bus. 
These devices are not necessary when using 
SRAMs which incorporate an output enable. 
3.2.3 CACHE CHIP SELECTS (CS0#, CS1#)


These active low signals tie to the cache SRAM chip
selects, and individually enable both bytes of the 16-
bit wide cache. CS0# enables D0-D7 and CS1#
enables D8-D15. During read hits, both bytes are
enabled regardless of whether or not the 386 SX
byte enables are active. (The 386 SX ignores what it
did not request.) Also, both cache bytes are enabled
in a read miss so as to update the cache with a
complete line (word). In a write hit, only the cache
bytes that correspond to active byte enables are se-
lected. This prevents cache data from being corrupt-
ed in a partial word write.
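
The chip select rules above reduce to a small truth function. The sketch below is only an illustration of those rules: the function name and cycle labels are hypothetical, signal levels are written as 0 (asserted, active low) or 1, and the deasserted result for any other cycle type is an assumption rather than something stated in the text.

```python
def cache_chip_selects(cycle, bhe_n, ble_n):
    """Return (CS0#, CS1#) as logic levels; 0 means asserted (active low).

    cycle is "read hit", "read miss" or "write hit"; bhe_n and ble_n are
    the 386 SX byte enable levels (0 = byte enabled).
    """
    if cycle in ("read hit", "read miss"):
        return 0, 0               # both bytes enabled regardless of BHE#/BLE#
    if cycle == "write hit":
        return ble_n, bhe_n       # CS0# follows the low byte, CS1# the high byte
    return 1, 1                   # cache not selected (assumed for other cycles)


print(cache_chip_selects("read hit", bhe_n=1, ble_n=0))
print(cache_chip_selects("write hit", bhe_n=1, ble_n=0))
```

In the write hit case only the low byte (D0-D7) is selected, so the high byte already in the cache is left untouched.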


3.2.4 CACHE OUTPUT ENABLES 
(COEA#, COEB#) 
AND WRITE ENABLES 
(CWEA#, CWEB#) 


COEA# and COEB# are active low signals which
tie to the cache SRAM or transceiver output en-
ables and respectively enable cache bank A or B.
The state of DEFOE# (define cache output enable),
an 82385SX configuration input, determines the
functional definition of COEA# and COEB#.


If DEFOE# = VIL, in a two-way set associative
cache, either COEA# or COEB# is active during
read hit cycles only, depending on which bank is
selected. In a direct mapped cache, both are activat-
ed during read hits, so the designer is free to use
either one. This COEx# definition best suits cache
SRAMs with output enables.


If DEFOE# = VIH, COEx# is active during read
hit, read miss (cache update) and write hit cycles
only. This COEx# definition best suits cache
SRAMs without output enables. In such systems,
transceivers are needed and their output enables
must be active for writing, as well as reading, the
cache SRAMs.


CWEA# and CWEB# are active low signals which
tie to the cache SRAM write enables, and respec- 
tively enable cache bank A or B to receive data from 
the 386 SX data bus (386 SX write hit or read miss 
update). In a two-way set associative cache, one or 
the other is enabled in a read miss or write hit. In a 
direct mapped cache, both are activated, so the de- 
signer is free to use either one. 
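
The output enable and write enable definitions can be summarized in one function. This is a hedged sketch: the function name, arguments, and cycle labels are hypothetical, and it assumes that with DEFOE# high the bank selection rules are the same as with DEFOE# low, which the text does not spell out.

```python
def cache_enables(cycle, bank, two_way, defoe_n):
    """Return which of COEA#, COEB#, CWEA#, CWEB# are asserted.

    True means the (active low) signal is asserted. bank is "A" or "B",
    the bank selected by the directory; it is ignored when two_way is False.
    """
    if defoe_n == 0:
        oe_cycle = cycle == "read hit"
    else:
        oe_cycle = cycle in ("read hit", "read miss", "write hit")
    we_cycle = cycle in ("read miss", "write hit")
    sel_a = (not two_way) or bank == "A"   # direct mapped: both banks' signals
    sel_b = (not two_way) or bank == "B"
    return {
        "COEA#": oe_cycle and sel_a, "COEB#": oe_cycle and sel_b,
        "CWEA#": we_cycle and sel_a, "CWEB#": we_cycle and sel_b,
    }


print(cache_enables("read hit", "B", two_way=True, defoe_n=0))
print(cache_enables("read miss", "A", two_way=False, defoe_n=1))
```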


The various cache configurations supported by the 
82385SX are described in Section 4.2.1.


3.3 386™ SX Local Bus Decode Inputs


These 82385SX inputs are generated by decoding 
the 386 SX address and cycle definition lines. These 
active low inputs are sampled at the end of the first
state in which the address of a new 386 SX cycle 
becomes available. (T1 or first T2P.) 


3.3.1 386™ SX LOCAL BUS ACCESS (LBA#) 


This input identifies a 386 SX access as directed to
a resource (other than the cache) on the 386 SX
local bus. (The 387 SX Math Coprocessor is consid-
ered a 386 SX local bus resource, but LBA# need
not be generated as the 82385SX internally decodes
387 SX accesses.) The 82385SX simply ignores
these cycles. They are neither forwarded to the sys-
tem nor do they affect the cache or cache directory.
Note that LBA# has priority over all other types of
cycles. If LBA# is asserted, the cycle is interpreted
as a 386 SX local bus access, regardless of the cy-
cle type or status of NCA#. This allows any 386 SX
cycle (memory, I/O, interrupt acknowledge, etc.) to
be kept on the 386 SX local bus if desired.


3.3.2 NON-CACHEABLE ACCESS (NCA#)


This active low input identifies a 386 SX cycle as 
non-cacheable. The 82385SX forwards non-cache- 
able cycles to the 82385SX local bus and runs them. 
The cache and cache directory are unaffected. 


NCA# allows a designer to set aside a portion of
main memory as non-cacheable. Potential applica-
tions include memory-mapped I/O and systems
where multiple masters access dual ported memory
via different busses. Another possibility makes use
of the 386 SX D/C# output. The 82385SX by de-
fault implements a unified code and data cache, but
driving NCA# directly by D/C# creates a data only
cache. If D/C# is inverted first, the result is a code
only cache.
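
The priority of LBA# over NCA#, and the effect of driving NCA# from D/C#, can be shown in a few lines. The function names below are hypothetical, levels are 0 or 1 with 0 meaning asserted for the active low inputs, and the sketch considers memory cycles only.

```python
def classify(lba_n, nca_n):
    """Classify a 386 SX cycle from the active low decode inputs."""
    if lba_n == 0:
        return "386 SX local bus access"   # LBA# overrides NCA# and cycle type
    if nca_n == 0:
        return "non-cacheable"
    return "cacheable"


def nca_from_dc(dc_n, invert=False):
    """Drive NCA# from D/C# (1 = data, 0 = code), optionally inverted."""
    return 1 - dc_n if invert else dc_n


# NCA# tied directly to D/C#: code fetches are non-cacheable (data only cache).
print(classify(lba_n=1, nca_n=nca_from_dc(0)))
# NCA# driven by inverted D/C#: data accesses are non-cacheable (code only cache).
print(classify(lba_n=1, nca_n=nca_from_dc(1, invert=True)))
```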


3.4 82385SX Local Bus Interface 
Signals 


The 82385SX presents a “386 SX-like” front end to
the system, and the signals discussed in this section
are 82385SX local bus equivalents to actual 386 SX
signals. These signals are named with respect to
their 386 SX counterparts, but with the letter “B”
prefixed to the name.


Note that the 82385SX itself does not have equiva-
lent output signals to the 386 SX data bus (D0-D15),
address bus (A1-A23), and cycle definition signals
(M/IO#, D/C#, W/R#). The 82385SX data bus
(BD0-BD15) is actually the system side of a latching
transceiver, and the 82385SX address bus and cycle
definition signals (BA1-BA23, BM/IO#, BD/C#,
BW/R#) are the outputs of an edge-triggered latch.
The signals that control this data transceiver and ad-
dress latch are discussed in Section 3.5.


3.4.1 82385SX BUS BYTE ENABLES
(BBHE#, BBLE#)


BBHE# and BBLE# are the 82385SX local bus
equivalents to the 386 SX byte enables. In a cache
read miss, the 82385SX drives both signals low, re-
gardless of whether or not the 386 SX byte enables
are active. This ensures that a complete line (word) is
fetched from main memory for the cache update. In
all other 82385SX local bus cycles, the 82385SX du-
plicates the logic levels of the 386 SX byte enables.
The 82385SX tri-states these outputs when it is not
the current bus master.


3.4.2 82385SX BUS LOCK (BLOCK#)


BLOCK# is the 82385SX local bus equivalent to the
386 SX LOCK# output, and distinguishes between
locked and unlocked cycles. When the 386 SX runs
a locked sequence of cycles (and LBA# is negated),
the 82385SX forwards and runs the sequence on
the 82385SX local bus, regardless of whether any
locations referenced in the sequence reside in the
cache. A read hit will be run as if it is a read miss, but
a write hit will update the cache as well as being
completed to system memory. In keeping with
386 SX behavior, the 82385SX does not allow an-
other master to interrupt the sequence. BLOCK# is
tri-stated when the 82385SX is not the current bus
master.


3.4.3 82385SX BUS ADDRESS STATUS
(BADS#)


BADS# is the 82385SX local bus equivalent of
ADS#, and indicates that a valid address (BA1-
BA23, BBHE#, BBLE#) and cycle definition
(BM/IO#, BW/R#, BD/C#) are available. It is asserted
in BT1 and BT2P states, and is tri-stated when the
82385SX does not own the bus.


3.4.4 82385SX BUS READY INPUT (BREADY#)


82385SX local bus cycles are terminated by
BREADY#, just as 386 SX cycles are terminated by
the 386 SX READY# input. In 82385SX local bus
read cycles, BREADY# is gated by BRDYEN# onto
the 386 SX local bus, such that it terminates both
the 386 SX and 82385SX local bus cycles.


3.4.5 82385SX BUS NEXT ADDRESS REQUEST
(BNA#)


BNA# is the 82385SX local bus equivalent to the
386 SX NA# input, and indicates that the system is
prepared to accept a pipelined address and cycle
definition. If BNA# is asserted and the new cycle
information is available, the 82385SX begins a pipe-
lined cycle on the 82385SX local bus.


3.5 82385SX Bus Data Transceiver and
Address Latch Control Signals


The 82385SX data bus is the system side of a latch-
ing transceiver (typically an F or AS series 74646),
and the 82385SX address bus and cycle definition
signals are the outputs of an edge-triggered latch (F
or AS series 74374). The following is a discussion of
the 82385SX outputs that control these devices. An
important characteristic of these signals and the de-
vices they control is that they ensure that BD0-
BD15, BA1-BA23, BM/IO#, BD/C# and BW/R#
reproduce the functionality and timing behavior of
their 386 SX equivalents.


3.5.1 LOCAL DATA STROBE (LDSTB), DATA
OUTPUT ENABLE (DOE#), AND BUS
TRANSMIT/RECEIVE (BT/R#)


These signals control the latching data transceiver.
BT/R# defines the transceiver direction. When
high, the transceiver drives the 82385SX data bus in
write cycles. When low, the transceiver drives the
386 SX data bus in 82385SX local bus read cycles.
DOE# enables the transceiver outputs.


The rising edge of LDSTB latches the 386 SX data
bus in all write cycles. The interaction of this signal
and the latching transceiver is used to implement the
82385SX’s posted write capability.


3.5.2 BUS ADDRESS CLOCK PULSE (BACP)
AND BUS ADDRESS OUTPUT ENABLE
(BAOE#)


These signals control the latch that drives BA1-
BA23, BM/IO#, BW/R#, and BD/C#. In any
386 SX cycle that is forwarded to the 82385SX local
bus, the rising edge of BACP latches the 386 SX
address and cycle definition signals. BAOE# en-
ables the latch outputs when the 82385SX is the
current bus master and disables them otherwise.


3.6 Status and Control Signals


3.6.1 CACHE MISS INDICATION (MISS#)


This output accompanies cacheable read and write
miss cycles. This signal transitions to its active low
state when the 82385SX determines that a cache-
able 386 SX access is a miss. Its timing behavior
follows that of the 82385SX local bus cycle defini-
tion signals (BM/IO#, BD/C#, BW/R#) so that it
becomes available with BADS# in BT1 or the first
BT2P. MISS# is floated when the 82385SX does
not own the bus, such that multiple 82385SX’s can
share the same node in multi-cache systems. (As
discussed in Chapter 7, this signal also serves a re-
served function in testing the 82385SX.)


3.6.2 WRITE BUFFER STATUS (WBS) 


The latching data transceiver is also known as the
“posted write buffer”. WBS indicates that this buffer
contains data that has not yet been written to the
system even though the 386 SX may have begun its
next cycle. It is activated when 386 SX data is
latched, and deactivated when the corresponding
82385SX local bus write cycle is completed
(BREADY#). (As discussed in Chapter 7, this signal
also serves a reserved function in testing the
82385SX.)


WBS can serve several functions. In multi-processor 
applications, it can act as a coherency mechanism 
by informing a bus arbiter that it should let a write 
cycle run on the system bus so that main memory
has the latest data. If any other 82385SX cache sub- 
systems are on the bus, they will monitor the cycle 
via their bus watching mechanisms. Any 82385SX 
that detects a snoop hit will invalidate the corre- 
sponding entry in its local cache. 


3.6.3 CACHE FLUSH (FLUSH) 


When activated, this signal causes the 82385SX to 
clear all of its directory tag valid bits, effectively 
flushing the cache. (As discussed in Chapter 7, this 
signal also serves a reserved function in testing the 
82385SX.) The primary use of the FLUSH input is for 
diagnostics and multi-processor support. The use of 
this pin as a coherency mechanism may impact soft- 
ware transparency. 


The FLUSH input must be held active for at least 4 
CLK (8 CLK2) cycles to complete the flush se- 
quence. If FLUSH is still active after 4 CLK cycles, 
any accesses to the cache will be misses and the 
cache will not be updated (since FLUSH is active). 


3.7 Bus Arbitration Signals 
(BHOLD and BHLDA) 


In master mode, BHOLD is an input that indicates a 
request by a slave device for bus ownership. The 
82385SX acknowledges this request via its BHLDA
output. (These signals function identically to the 
386 SX HOLD and HLDA signals.) 


The roles of BHOLD and BHLDA are reversed for an 
82385SX in slave mode. BHOLD is now an output 
indicating a request for bus ownership, and BHLDA 
an input indicating that the request has been grant- 
ed. 


3.8 Coherency (Bus Watching)
Support Signals
(SA1-SA23, SSTB#, SEN)


These signals form the 82385SX’s bus watching in-
terface. The Snoop Address Bus (SA1-SA23) con-
nects to the system address lines if masters reside
at both the system and 82385SX local bus levels, or
the 82385SX local bus address lines if masters re-
side only at the 82385SX local bus level. Snoop
Strobe (SSTB#) indicates that a valid address is on
the snoop address inputs. Snoop Enable (SEN) indi-
cates that the cycle is a write. In a system with mas-
ters only at the 82385SX local bus level, SA1-SA23,
SSTB#, and SEN can be driven respectively by
BA1-BA23, BADS#, and BW/R# without any sup-
port circuitry.


3.9 Configuration Inputs
(2W/D#, M/S#, DEFOE#)


These signals select the configurations supported
by the 82385SX. They are hardware strap options
and must not be changed dynamically. 2W/D# (2-
Way/Direct Mapped Select) selects a two-way set
associative cache when tied high, or a direct
mapped cache when tied low. M/S# (Master/Slave
Select) chooses between master mode (M/S# high)
and slave mode (M/S# low). DEFOE# defines the
functionality of the 82385SX cache output enables
(COEA# and COEB#). DEFOE# allows the
82385SX to interface to SRAMs with output enables
(DEFOE# low) or to SRAMs requiring transceivers
(DEFOE# high).


3.10 Reserved Pins (RES) 


Some pins on the 82385SX are reserved for internal 
testing and future cache features. To assure com- 
patibility and functionality, these reserved pins must 
be configured as shown in Table 3.10.1.


Table 3.10.1. Reserved Pin Connections

PGA Pin Location | PQFP Pin Location | Logic Level

High
High
High
High
High
High
High
High
High
High
High
High
High
High
High
High
High
No Connect
High
No Connect
No Connect
No Connect


4.0 386 SX LOCAL BUS INTERFACE 


The following is a detailed description of how the 
82385SX interfaces to the 386 SX and to 386 SX 
local bus resources. Items specifically addressed 
are the interfaces to the 386 SX, the cache SRAMs, 
and the 387 SX Math Coprocessor. 


The many timing diagrams in this and the next chap-
ter provide insight into the dual pipelined bus struc-
ture of a 386 SX CPU/82385SX system. It’s impor-
tant to realize, however, that one need not know
every possible cycle combination to use the
82385SX. The interface is simple, and the dual bus
operation invisible to the 386 SX and system. To
facilitate discussion of the timing diagrams, several
conventions have been adopted. Refer to Figure
4-2A, and note that 386 SX bus cycles, 386 SX bus
states, and 82385SX bus states are identified along
the top. All states can be identified by the “frame
numbers” along the bottom. The cycles in Figure
4-2A include a cache read hit (CRDH), a cache read
miss (CRDM), and a write (WT). WT represents any
write, cacheable or not. When necessary to distin-
guish cacheable writes, a write hit goes by CWTH
and a write miss by CWTM. Non-cacheable system
reads go by SBRD. Also, it is assumed that system
bus pipelining occurs even though the BNA# signal
is not shown. When the system pipeline begins is a
function of the system bus controller.
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386 SX bus cycles can be tracked by ADS# and 
READYI#, and 82385SX cycles by BADS# and 
BREADY #. These four signals are thus a natural 
choice to help track parallel bus activity. Note in the 
timing diagrams that 386 SX cycles are numbered 
using ADS# and READYI#, and 82385SX cycles 
using BADS# and BREADY#. For example, when 
the address of the first 386 SX cycle becomes avail- 


able, the corresponding assertion of ADS# is 


marked “1”, and the READYI# pulse that termi- 
nates the cycle is marked “1” as well. Whenever a 
386 SX cycle is forwarded to the system, its number 
is forwarded as well so that the corresponding 
82385SX bus ore can be tracked by BADS# and 
BREADY #. 


The “N” value in the timing diagrams is the assumed number
of main memory wait states inserted in a non-pipelined
82385SX bus cycle. For example, a non-pipelined access to
N=2 memory requires a total of four bus states, while a
pipelined access requires three. (The pipeline advantage
effectively hides one main memory wait state.)
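The relationship between N and the number of bus states can be written as a small Python sketch. The function name `bus_states` is hypothetical, and the two-state floor for pipelined cycles is an assumption added here (the text only gives the N=2 example).

```python
def bus_states(n_wait: int, pipelined: bool) -> int:
    """Return the number of 82385SX bus states used by one access.

    n_wait is "N", the wait states a non-pipelined cycle would need.
    """
    if n_wait < 0:
        raise ValueError("wait state count cannot be negative")
    non_pipelined = 2 + n_wait  # two base states plus N wait states
    if not pipelined:
        return non_pipelined
    # Pipelining hides one wait state. Assumption: a bus cycle is never
    # shorter than two states, so N=0 gains nothing from pipelining.
    return max(2, non_pipelined - 1)
```

With N=2 this gives four states non-pipelined and three pipelined, matching the example in the text.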


4.1 Processor Interface 


This section presents the 386 SX CPU/82385SX hardware
interface and discusses the interaction and timing of this
interface. Also addressed is how to decode the 386 SX
address bus to generate the 82385SX inputs LBA# and NCA#.
(Recall that LBA# allows memory and/or I/O space to be set
aside for 386 SX local bus resources, and NCA# allows
system memory to be set aside as non-cacheable.)


4.1.1 HARDWARE INTERFACE 


Figure 4-1 is a diagram of a 386 SX CPU/82385SX system,
which can be thought of as three distinct interfaces. The
first is the 386 SX CPU/82385SX interface (including the
Ready Logic). The second is the cache interface, as
depicted by the cache control bus in the upper left corner
of Figure 4-1. The third is the 82385SX bus interface,
which includes both direct connects and signals that
control the 74374 address/cycle definition latch and 74646
latching data transceiver. (The 82385SX bus interface is
the subject of the next chapter.)


As seen in Figure 4-1, the 386 SX CPU/82385SX 
interface is a straightforward connection. The only 
necessary support logic is that required to sum all 
ready sources.


4.1.2 READY GENERATION


Note in Figure 4-1 that the ready logic consists of 
two gates. The upper three-input AND gate (shown 
as a negative logic OR) sums all 386 SX local bus 
ready sources. One such source is the 82385SX 
READYO# output, which terminates read hits and 
posted writes. The output of this gate drives the 
386 SX READY# input and is monitored by the 
82385SX (via READYI#) to track the 386 SX bus 
state.


When the 82385SX forwards a 386 SX read cycle to the
82385SX bus (cache read miss or non-cacheable read), it
does not directly terminate the cycle via READYO#. Instead,
the 386 SX and 82385SX bus cycles are concurrently
terminated by a system ready source. This is the purpose of
the additional two-input OR gate (negative logic AND) in
Figure 4-1. When the 82385SX forwards a read to the 82385SX
bus, it asserts BRDYEN#, which enables the system ready
signal (BREADY#) to directly terminate the 386 SX bus cycle.


Figures 4-2A and 4-2B illustrate the behavior of the
signals involved in ready generation. Note in cycle 1 of
Figure 4-2A that the 82385SX READYO# directly terminates the
hit cycle. In cycle 2, READYO# is not activated. Instead the
82385SX BRDYEN# is activated in BT2, BT2P, or BT2I states
such that BREADY# can concurrently terminate the 386 SX and
82385SX bus cycles (frame 6). Cycle 3 is a posted write. The
write data becomes available in T1P (frame 7), and the
address, data, and cycle definition of the write are latched
in T2 (frame 8). The 386 SX cycle is terminated by READYO#
in frame 8 with no wait states. The 82385SX, however, sees
the write cycle through to completion on the 82385SX bus,
where it is terminated in frame 10 by BREADY#. In this
case, the BREADY# signal is not gated through to the
386 SX. Refer to Figures 4-2A and 4-2B for clarification.


4.1.3 NA# AND 386 SX LOCAL BUS 
PIPELINING 


Cycle 1 of Figure 4-2A is a typical cache read hit. 
The 386 SX address becomes available in T1, and 
the 82385SX uses this address to determine if the 
referenced data resides in the cache. The cache 
look-up is completed and the cycle qualified as a hit 
or miss in T1. If the data resides in the cache, the 
cache is directed to drive the 386 SX data bus, and 
the 82385SX drives its READYO# output so the cy- 
cle can be terminated at the end of the first T2 with 
no wait states. 


Although cycle 2 starts out like cycle 1, at the end of T1
(frame 3), it is qualified as a miss and forwarded to the
82385SX bus. The 82385SX bus cycle begins one state after
the 386 SX bus cycle, implying a one wait state overhead
associated with cycle 2 due to the look-up. When the
82385SX encounters the miss, it immediately asserts NA#,
which puts the 386 SX into pipelined mode. Once in
pipelined mode, the 82385SX is able to qualify a 386 SX
cycle using the 386 SX pipelined address and control
signals. The result is that the cache look-up state is
hidden in all but the first of a contiguous sequence of
read misses. This is shown in the first two cycles, both
read misses, of Figure 4-2B. The CPU sees the look-up state
in the first cycle, but not in the second. In fact, the
second miss requires a total of only two states, as not
only does 386 SX pipelining hide the look-up state, but
system pipelining hides one of the main memory wait states.
(System level pipelining via BNA# is discussed in the next
chapter.) Several characteristics of the 82385SX’s
pipelining of the 386 SX are as follows:


— The above discussion applies to all system 
reads, not just cache read misses. 


— The 82385SX provides the fastest possible 
switch to pipelining, T1-T2-T2P. The exception to 
this is when a system read follows a posted 
write. In this case, the sequence is T1-T2-T2- 
T2P. (Refer to cycle 4 of Figure 4-2A.) The num- 
ber of T2 states is dependent on the number of 
main memory wait states. 


— Refer to the read hit in Figure 4-2A (cycle 1), and
note that NA# is actually asserted before the
end of T1, before the hit/miss decision is made.
This is of no consequence since even though
NA# is sampled active in T2, the activation of
READYO# in the same T2 renders NA# a
“don’t care”. NA# is asserted in this manner to
meet 386 SX timing requirements and to ensure
the fastest possible switch to pipelined mode.


— All read hits and the majority of writes can be
serviced by the 82385SX with zero wait states in
non-pipelined mode, and the 82385SX accordingly
attempts to run all such cycles in non-pipelined
mode. An exception is seen in the hit cycles
(cycles 3 and 4) of Figure 4-2B. The 82385SX
does not know soon enough that cycle 3 is a hit,
and thus sustains the pipeline. The result is that
three sequential hits are required before the
386 SX is totally out of pipelined mode. (The
three hits look like T1P-T2P, T1P-T2, T1-T2.)
Note that this does not occur if the number of
main memory wait states is equal to or greater
than two.
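The effect of pipelining on a run of back-to-back read misses can be sketched in Python. This is a simplified model under stated assumptions, not a cycle-accurate description: it assumes the first miss pays the look-up state and runs non-pipelined on the 82385SX bus, that every following miss is fully pipelined on both buses, and that N is at least 1 (cacheable memory). The name `read_miss_states` is hypothetical.

```python
from typing import List


def read_miss_states(n_wait: int, count: int) -> List[int]:
    """386 SX bus states used by each of `count` contiguous read misses."""
    if n_wait < 1:
        raise ValueError("cacheable memory is assumed to have N >= 1")
    if count < 0:
        raise ValueError("count cannot be negative")
    # First miss: one look-up state, then a non-pipelined 82385SX cycle.
    first = 1 + (2 + n_wait)
    # Later misses: look-up hidden by 386 SX pipelining, and one wait
    # state hidden by system pipelining.
    later = 1 + n_wait
    return [first if i == 0 else later for i in range(count)]
```

For N=1, the second miss takes two states, which agrees with the statement above that the second miss requires a total of only two states.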


As far as the design is concerned, NA# is generally tied
directly to the 386 SX NA# input. However, other local NA#
sources may be logically “AND”ed with the 82385SX NA#
output if desired. It is essential, however, that no device
other than the 82385SX drive the 386 SX NA# input unless
that device resides on the 386 SX local bus in space
decoded via LBA#. If desired, the 82385SX NA# output can be
ignored and the 386 SX NA# input tied high. The 386 SX NA#
input should never be tied low, which would always keep it
active.


4.1.4 LBA# AND NCA# GENERATION 


The 82385SX input signals LBA# and NCA# are generated by
decoding the 386 SX address (A1-A23) and cycle definition
(W/R#, D/C#, M/IO#) lines. The 82385SX samples them at the
end of the first state in which they become available,
which is either T1 or the first T2P state. The decode
configuration and timings are illustrated respectively in
Figures 4-3A and 4-3B.
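As an illustration of the kind of decode described above, the following Python sketch produces LBA# and NCA# levels from an address and the M/IO# line. The memory map is entirely hypothetical (the ranges `LOCAL_IO_PORTS` and `NON_CACHEABLE_MEM` are invented for the example), and a real design would implement the decode in hardware, typically a PAL, within the timings of Figure 4-3B.

```python
# Hypothetical address map, chosen only for illustration.
LOCAL_IO_PORTS = range(0x0100, 0x0110)         # assumed local bus I/O device
NON_CACHEABLE_MEM = range(0x0A0000, 0x0C0000)  # assumed memory-mapped region


def decode(address: int, m_io: int) -> tuple:
    """Return (LBA#, NCA#) as active-low levels (0 = asserted).

    address is a 24-bit 386 SX address; m_io follows M/IO#
    (1 = memory cycle, 0 = I/O cycle).
    """
    if not 0 <= address < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    if m_io not in (0, 1):
        raise ValueError("m_io must be 0 or 1")
    lba_n = 0 if (m_io == 0 and address in LOCAL_IO_PORTS) else 1
    nca_n = 0 if (m_io == 1 and address in NON_CACHEABLE_MEM) else 1
    return lba_n, nca_n
```

With this map, `decode(0x0104, 0)` asserts LBA# only, and `decode(0x0B8000, 1)` asserts NCA# only.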


4.2 Cache Interface 


The following is a description of the external data
cache and 82385SX cache interface.


4.2.1 CACHE CONFIGURATIONS


The 82385SX controls the cache memory via the control
signals shown in Figure 4-1. These signals drive one of
four possible cache configurations, as depicted in Figures
4-4A through 4-4D. Figure 4-4A shows a direct mapped cache
organized as 8K words. The likely design choice is two
8K x 8 SRAMs. Figure 4-4B depicts the same cache memory but
with a data transceiver between the cache and 386 SX data
bus. In this configuration, CT/R# controls the transceiver
direction, COEA# drives the transceiver output enable
(COEB# could also be used), and DEFOE# is strapped high. A
data buffer is required if the chosen SRAM does not have a
separate output enable. Additionally, buffers may be used
to ease SRAM timing requirements or in a system with a
heavily loaded data bus. (Guidelines for SRAM selection are
included in Chapter 6.)


Figure 4-4C depicts a two-way set associative cache
organized as two banks (A and B) of 4K words each. The
likely design choice is eight 4K x 4 SRAMs. Finally, Figure
4-4D depicts the two-way organization with data buffers
between the cache memory and data bus.
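The SRAM counts quoted for the two organizations follow from simple arithmetic, sketched here in Python. The helper names `sram_count` and `cache_bytes` are hypothetical; the 16-bit word width comes from the 386 SX data bus.

```python
WORD_BITS = 16  # each cache word matches the 16-bit 386 SX data bus


def sram_count(words_per_bank: int, banks: int, depth: int, width: int) -> int:
    """Number of depth x width SRAM devices needed for the data array."""
    if words_per_bank % depth or WORD_BITS % width:
        raise ValueError("SRAM organization does not tile the bank evenly")
    return banks * (words_per_bank // depth) * (WORD_BITS // width)


def cache_bytes(words_per_bank: int, banks: int) -> int:
    """Total cache data capacity in bytes."""
    return words_per_bank * banks * WORD_BITS // 8
```

Both organizations hold the same 16K bytes of data: one bank of 8K words needs two 8K x 8 devices, and two banks of 4K words need eight 4K x 4 devices.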

Figure 4-3. NCA#, LBA# Generation

Figure 4-4B. Direct Mapped Cache with Data Buffers

Figure 4-4C. Two-Way Set Associative Cache without Data Buffers

Figure 4-4D. Two-Way Set Associative Cache with Data Buffers


4.2.2 CACHE CONTROL ... DIRECT MAPPED 


Figure 4-5A illustrates the timing of cache read and 
write hits, while Figure 4-5B illustrates cache up- 
dates. In a read hit, the cache output enables are 
driven from the beginning of T2 (cycle 1 of Figure 
4-5A). If at the end of T1 the cycle is qualified as a 
cacheable read, the output enables are asserted on 
the assumption that the cycle will be a hit. (Driving 
the output enables before the actual hit/miss deci- 
sion is made eases SRAM timing requirements.) 


Cycle 1 of Figure 4-5B illustrates what happens when the
assumption of a hit turns out to be wrong. Note that the
output enables are asserted at the beginning of T2, but
then disabled at the end of T2. Once the output enables are
inactive, the 82385SX turns the transceiver around (via
CT/R#) and drives the write enables to begin the cache
update cycle. Note in Figure 4-5B that once the 386 SX is
in pipelined mode, the output enables need not be driven
prior to a hit/miss decision, since the decision is made
earlier via the pipelined address information.


One consequence of driving the output enables low
in a miss before the hit/miss decision is made is that 
since the cache starts driving the 386 SX data bus, 
the 82385SX cannot enable the 74646 transceiver 
(Figure 4-1) until after the cache outputs are dis- 
abled. (The timing of the 74646 control signals is 
described in the next chapter.) The result is that the 
74646 cannot be enabled soon enough to support 
N=0 main memory (“N” was defined in Section 4.0 
as the number of non-pipelined main memory wait 
states). This means that memory which can run with 
zero wait states in a non-pipelined cycle should not 
be mapped into cacheable memory. This should not 
present a problem, however, as a main memory sys- 
tem built with N=0 memory has no need of a cache. 
(The main memory is as fast as the cache.) Zero 
wait state memory can be supported if it is decoded 
as non-cacheable. The 82385SX knows that a cycle 
is non-cacheable in time not to drive the cache out- 
put enables, and can thus enable the 74646 sooner. 


In a write hit, the 82385SX only updates the cache bytes
that are meant to be updated as directed by the 386 SX byte
enables. This prevents corrupting cache data in partial
doubleword writes. Note in Figure 4-5A that the appropriate
bytes are selected via the cache byte select lines CS0# and
CS1#. In a read hit, both select lines are driven as the
386 SX will simply ignore data it does not need. Also, in a
cache update (read miss), both selects are active in order
to update the cache with a complete line (word).
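The byte-selective update in a write hit can be illustrated with a short Python sketch. It models only the data merge, not the CS0#/CS1# timing, and assumes the usual 386 SX convention that BLE# enables the low byte (D0-D7) and BHE# the high byte (D8-D15). The function name `write_hit` is hypothetical.

```python
def write_hit(cached_word: int, write_data: int, bhe_n: int, ble_n: int) -> int:
    """Merge a 16-bit write into a cached word.

    bhe_n and ble_n are the active-low byte enables (0 = byte enabled).
    """
    result = cached_word & 0xFFFF
    if ble_n == 0:  # low byte is written
        result = (result & 0xFF00) | (write_data & 0x00FF)
    if bhe_n == 0:  # high byte is written
        result = (result & 0x00FF) | (write_data & 0xFF00)
    return result
```

A low-byte write of 0xABCD over a cached 0x1234 leaves 0x12CD, so the untouched high byte keeps its valid cached value.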


4.2.3 CACHE CONTROL ... 
TWO-WAY SET ASSOCIATIVE 


Figures 4-6A and 4-6B illustrate the timing of cache read
hits, write hits, and updates for a two-way set associative
cache. (Note that the cycle sequences are the same as those
in Figures 4-5A and 4-5B.) In a cache read hit, only one
bank or the other is enabled to drive the 386 SX data bus,
so unlike the control of a direct mapped cache, the
appropriate cache output enable cannot be driven until the
outcome of the hit/miss decision is known. (This implies
stricter SRAM timing requirements for a two-way set
associative cache.) In write hits and read misses, only one
bank or the other is updated.


4.3 387 SX Interface 


The 387 SX Math Coprocessor interfaces to the 386 SX just
as it would in a system without an 82385SX. The 387 SX
READYO# output is logically “AND”ed along with all other
386 SX local bus ready sources (Figure 4-1), and the output
is fed to the 387 SX READY#, 82385SX READYI#, and 386 SX
READY# inputs.




The 386 SX uniquely addresses the 387 SX by driving M/IO#
low and A23 high. The 82385SX decodes this internally and
treats 387 SX accesses in the same way it treats 386 SX
cycles in which LBA# is asserted: it ignores them.
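The conditions under which the 82385SX ignores a 386 SX cycle can be expressed as a small Python sketch. It is only a restatement of the rule in the text; the function names are hypothetical.

```python
def is_387sx_access(m_io: int, a23: int) -> bool:
    """True when the cycle addresses the 387 SX (M/IO# low, A23 high)."""
    return m_io == 0 and a23 == 1


def cache_ignores_cycle(m_io: int, a23: int, lba_n: int) -> bool:
    """True when the 82385SX leaves the cycle to the 386 SX local bus.

    lba_n is the active-low LBA# input (0 = asserted).
    """
    return lba_n == 0 or is_387sx_access(m_io, a23)
```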


5.0 82385SX LOCAL BUS AND 
SYSTEM INTERFACE 


The 82385SX system interface is the 82385SX Local Bus,
which presents a “386 SX-like” front end to the system. The
system ties to it just as it would to a 386 SX. Although
this 386 SX-like front end is functionally equivalent to a
386 SX, there are timing differences which can easily be
accounted for in a system design.


The following is a description of the 82385SX sys- 
tem interface. After presenting the 82385SX bus 
state machine, the 82385SX bus signals are de- 
scribed, as are techniques for accommodating any 
differences between the 82385SX bus and 386 SX 
bus. Following this is a discussion of the 82385SX’s 
condition upon reset. 


5.1 The 82385SX Bus State Machine 


5.1.1 MASTER MODE 


Figure 5-1A illustrates the 82385SX bus state machine when
the 82385SX is programmed in master mode. Note that it is
almost identical to the 386 SX bus state machine, only the
bus states are 82385SX bus states (BT1P, BTH, etc.) and the
state transitions are conditioned by 82385SX bus inputs
(BNA#, BHOLD, etc.). Whereas a “pending request” to the
386 SX state machine indicates that the 386 SX execution or
prefetch unit needs bus access, a pending request to the
82385SX state machine indicates that a 386 SX bus cycle
needs to be forwarded to the system (read miss,
non-cacheable read, write, etc.). The only difference
between the state machines is that the 82385SX does not
implement a direct BT1P-BT2P transition. If BNA# is
asserted in BT1P, the resulting state sequence is
BT1P-BT2I-BT2P. The 82385SX’s ability to sustain a pipeline
is not affected by the lack of this transition.


5.1.2 SLAVE MODE 


The 82385SX’s slave mode state machine (Figure 5-1B) is
similar to the master mode machine except that now
transitions are conditioned by BHLDA rather than BHOLD.
(Recall that in slave mode, the roles of BHOLD and BHLDA
are reversed from their master mode roles.) Figure 5-2
clarifies slave mode state machine operation. Upon reset, a
slave mode 82385SX enters the BTH state. When the 386 SX of
the slave 82385SX subsystem has a cycle that needs to be
forwarded to the system, the 82385SX moves to BTI and
issues a hold request via BHOLD. It is important to note
that a slave mode 82385SX does not drive the bus in a BTI
state. When the master or bus arbiter returns BHLDA, the
slave 82385SX enters BT1 and runs the cycle. When the cycle
is completed, and if no additional requests are pending,
the 82385SX moves back to BTH and disables BHOLD.


If, while a slave 82385SX is running a cycle, the master
or arbiter drops BHLDA (Figure 5-2B), the 82385SX will
complete the current cycle, move to BTH and remove the
BHOLD request. If the 82385SX still had cycles to run when
it was kicked off the bus, it will immediately assert a new
BHOLD and move to BTI to await bus acknowledgement. Note,
however, that it will only move to BTI if BHLDA is negated,
ensuring that the handshake sequence is completed.


Figure 5-1A. 82385SX Local Bus State Machine—Master Mode

Figure 5-1B. 82385SX Local Bus State Machine—Slave Mode

Figure 5-2. BHOLD/BHLDA—Slave Mode


There are several cases in which a slave 82385SX will not
immediately release the bus if BHLDA is dropped. For
example, if BHLDA is dropped during a BT2P state, the
82385SX has already committed to the next system bus
pipelined cycle and will execute it before releasing the
bus. Also, the 82385SX will complete a sequence of locked
cycles before releasing the bus. This should not present
any problems, as a properly designed arbiter will not
assume that the 82385SX has released the bus until it sees
BHOLD become inactive.
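The slave mode arbitration handshake described in this section can be sketched as a small state function in Python. This is a deliberately simplified model, not the state machine of Figure 5-1B: a single hypothetical `RUN` state stands in for all cycle states (BT1, BT2, BT2P, and so on), and the exceptions for pipelined commitments and locked sequences are not modelled. The names `next_state` and `bhold` are assumptions for the example.

```python
def next_state(state: str, pending: bool, bhlda: bool, cycle_done: bool) -> str:
    """Advance the simplified slave-mode model by one step.

    pending    -- a 386 SX cycle is waiting to be forwarded
    bhlda      -- the master or arbiter is asserting BHLDA
    cycle_done -- the cycle in progress completes in this step
    """
    if state == "BTH":
        # Request the bus only once BHLDA is negated, so the previous
        # handshake is known to be complete.
        return "BTI" if (pending and not bhlda) else "BTH"
    if state == "BTI":
        # Wait (without driving the bus) until the bus is granted.
        return "RUN" if bhlda else "BTI"
    if state == "RUN":
        if not cycle_done:
            return "RUN"
        # Keep the bus only if it is still granted and work remains.
        return "RUN" if (pending and bhlda) else "BTH"
    raise ValueError("unknown state: " + state)


def bhold(state: str) -> bool:
    """BHOLD is asserted while requesting or owning the bus."""
    return state in ("BTI", "RUN")
```

Starting from `BTH`, a pending cycle moves the model to `BTI`, a grant moves it to `RUN`, and completion with nothing pending (or with BHLDA dropped) returns it to `BTH` with BHOLD removed.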


5.2 The 82385SX Local Bus 


The 82385SX bus can be broken up into two groups 
of signals: those which have direct 386 SX counter- 
parts, and additional status and control signals pro- 
vided by the 82385SX. The operation and interaction 
of all 82385SX bus signals are depicted in Figures 
5-3A through 5-3L for a wide variety of cycle se- 
quences. These diagrams serve as a reference for 
the 82385SX bus discussion and provide insight into 
the dual bus operation of the 82385SX.


Figure 5-3A. Consecutive SBRD Cycles—(N = 0)

Figure 5-3B. Consecutive CRDM Cycles—(N = 1)

Figure 5-3C. SBRD, CRDM, SBRD—(N = 2)

Figure 5-3D. SBRD Cycles Interleaved with BTH States—(N = 1)

Figure 5-3E. Interleaved SBRD/CRDH Cycles—(N = 1)

Figure 5-3F. SBRD, WT, SBRD, CRDH—(N = 1)

Figure 5-3G. Interleaved WT/CRDH Cycles—(N = 1)

Figure 5-3H. WT, WT, CRDH—(N = 1)

Figure 5-3I. WT, WT, SBRD—(N = 1)

Figure 5-3K. LOCK#/BLOCK# in Non-Cacheable or Miss Cycles—(N = 1)

Figure 5-3L. LOCK#/BLOCK# in Cache Read Hit Cycle—(N = 1)


5.2.1 82385SX BUS COUNTERPARTS TO 
386™ SX SIGNALS 


The following sections discuss the signals presented on
the 82385SX local bus which are functional equivalents to
the signals present at the 386 SX local bus.


5.2.1.1 Address Bus (BA1-BA23) 
and Cycle Definition Signals 
(BM/IO#, BD/C#, BW/R#) 


These signals are not driven directly by the 82385SX, but
rather are the outputs of the 74374 address/cycle
definition latch. (Refer to Figure 4-1 for the hardware
interface.) This latch is controlled by the 82385SX BACP
and BAOE# outputs. The behavior and timing of these outputs
and the latch they control (typically F or AS series TTL)
ensure that BA1-BA23, BM/IO#, BW/R#, and BD/C# are
compatible in timing and function to their 386 SX
counterparts.


The behavior of BACP can be seen in Figure 5-3B, where the
rising edge of BACP latches and forwards the 386 SX address
and cycle definition signals in a BT1 or first BT2P state.
However, the 82385SX need not be the current bus master to
latch the 386 SX address, as evidenced by cycle 4 of Figure
5-3A. In this case, the address is latched in frame 8, but
not forwarded to the system (via BAOE#) until frame 10.
(The latch and output enable functions of the 74374 are
independent and invisible to one another.)


Note that in frames 2 and 6 the BACP pulses are marked
“False”. The reason is that BACP is issued and the address
latched before the hit/miss determination is made. This
ensures that should the cycle be a miss, the 82385SX bus
can move directly into BT1 without delay. In the case of a
hit, the latched address is simply never qualified by the
assertion of BADS#. The 82385SX bus stays in BTI if there
is no access pending (new cycle is a hit) and no bus
activity. It will move to and stay in BT2I if the system
has requested a pipelined cycle and the 82385SX does not
have a pending bus access (new cycle is a hit).


5.2.1.2 Data Bus (BD0-BD15)


The 82385SX data bus is the system side of the 74646
latching transceiver. (See Figure 4-1.) This device is
controlled by the 82385SX outputs LDSTB, DOE#, and BT/R#.
LDSTB latches data in write cycles, DOE# enables the
transceiver outputs, and BT/R# controls the transceiver
direction. The interaction of these signals and the
transceiver is such that BD0-BD15 behave just like their
386 SX counterparts. The transceiver is configured such
that data flow in write cycles (A to B) is latched, and
data flow in read cycles (B to A) is flow-through.


Although BD0-BD15 function just like their 386 SX
counterparts, there is a timing difference that must be
accounted for in a system design. As mentioned above, the
transceiver is transparent during read cycles, so the
transceiver propagation delay must be added to the 386 SX
data setup. In addition, the cache SRAM setup must be
accounted for in cache read miss cycles.


For non-cacheable reads the data setup is given by:


Min BD0-BD15 Read Data Setup =
      386 SX Min Data Setup
    + 74646 B-to-A Max Propagation Delay


The required BD0-BD15 setup in a cache read miss is given
by:


Min BD0-BD15 Read Data Setup =
      74646 B-to-A Max Propagation Delay
    + Cache SRAM Min Write Setup
    + One CLK2 Period
    - 82385SX CWEA# or CWEB# Min Delay

If a data buffer is located between the 386 SX data 
bus and the cache SRAMs, then its maximum propa- 
gation delay must be added to the above formula as 
well. A design analysis should be completed for ev- 
ery new design to determine actual margins. 
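The two setup requirements can be evaluated with a short Python sketch. It assumes the read miss formula combines its terms as transceiver delay plus SRAM write setup plus one CLK2 period minus the CWEA#/CWEB# minimum delay, with any data buffer delay added on top. The function names are hypothetical, and the timing values in the usage note are made-up placeholders, not data sheet figures.

```python
def noncacheable_read_setup(cpu_setup_ns: float, xcvr_delay_ns: float) -> float:
    """Min BD0-BD15 read data setup for a non-cacheable read, in ns."""
    return cpu_setup_ns + xcvr_delay_ns


def read_miss_setup(xcvr_delay_ns: float, sram_write_setup_ns: float,
                    clk2_period_ns: float, cwe_min_delay_ns: float,
                    buffer_delay_ns: float = 0.0) -> float:
    """Min BD0-BD15 read data setup for a cache read miss, in ns.

    buffer_delay_ns is the max propagation delay of an optional data
    buffer between the 386 SX data bus and the cache SRAMs.
    """
    return (xcvr_delay_ns + sram_write_setup_ns + clk2_period_ns
            - cwe_min_delay_ns + buffer_delay_ns)
```

With placeholder values of 11 ns transceiver delay, 25 ns SRAM write setup, a 31.25 ns CLK2 period, and 4 ns write enable delay, `read_miss_setup(11, 25, 31.25, 4)` gives 63.25 ns, noticeably longer than the `noncacheable_read_setup(9, 11)` result of 20 ns. Actual margins depend on the components chosen.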


A design can accommodate the increased data setup by
choosing appropriately fast main memory DRAMs and data
buffers. Alternatively, a designer may deal with the longer
setup by inserting an extra wait state into cache read miss
cycles. If an additional state is to be inserted, the
system bus controller should sample the 82385SX MISS#
output to distinguish read misses from cycles that do not
require the longer setup. Tips on using the 82385SX MISS#
signal are presented later in this chapter.
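A bus controller's decision to add a wait state on cache read misses can be sketched in Python as follows. The sketch assumes MISS# is active low and that BW/R# follows the 386 SX convention (low for reads). The function names and the default of one extra wait state are illustrative assumptions.

```python
def is_cache_read_miss(miss_n: int, bw_r: int) -> bool:
    """True when MISS# is asserted (low) during a read (BW/R# low)."""
    return miss_n == 0 and bw_r == 0


def wait_states(base: int, miss_n: int, bw_r: int, extra: int = 1) -> int:
    """Wait states to insert for this 82385SX bus cycle."""
    return base + (extra if is_cache_read_miss(miss_n, bw_r) else 0)
```

Write misses also assert MISS#, which is why the read/write line is included in the test.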


The behavior of LDSTB, DOE#, and BT/R# can be understood
via Figures 5-3A through 5-3L. Note that in cycle 1 of
Figure 5-3A (a non-cacheable system read), DOE# is
activated midway through BT1, but in cycle 1 of Figure 5-3B
(a cache read miss), DOE# is not activated until midway
through BT2. The reason is that in a cacheable read cycle,
the cache SRAMs are enabled to drive the 386 SX data bus
before the outcome of the hit/miss decision (in
anticipation of a hit). In cycle 1 of Figure 5-3B, the
assertion of DOE# must be delayed until after the 82385SX
has disabled the cache output buffers. The result is that
N=0 main memory should not be mapped into the cache.


5.2.1.3 Byte Enables (BBHE#, BBLE#)


These outputs are driven directly by the 82385SX, and are
completely compatible in timing and function with their
386 SX counterparts. When a 386 SX cycle is forwarded to
the 82385SX bus, the 386 SX byte enables are duplicated on
BBHE# and BBLE#. The one exception is a cache read miss,
during which BBHE# and BBLE# are both active regardless of
the status of the 386 SX byte enables. This ensures that
the cache is updated with a valid 16-bit entry.
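The byte enable rule above reduces to a one-line selection, shown here as a Python sketch with active-low levels (0 = enabled). The function name `forwarded_byte_enables` is hypothetical.

```python
def forwarded_byte_enables(bhe_n: int, ble_n: int, cache_read_miss: bool) -> tuple:
    """Return (BBHE#, BBLE#) for a cycle forwarded to the 82385SX bus."""
    if cache_read_miss:
        # Fetch the full 16-bit word so the cache entry is entirely valid.
        return (0, 0)
    # Otherwise the 386 SX byte enables are duplicated unchanged.
    return (bhe_n, ble_n)
```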


5.2.1.4 Address Status (BADS#) 


BADS# is identical in function and timing to its 386 SX
counterpart. It is asserted in BT1 and BT2P states, and
indicates that valid address and cycle definition
(BA1-BA23, BBHE#, BBLE#, BM/IO#, BW/R#, BD/C#) information
is available on the 82385SX bus.


5.2.1.5 Ready (BREADY#)


The 82385SX BREADY# input terminates 82385SX bus cycles
just as the 386 SX READY# input terminates 386 SX bus
cycles. The behavior of BREADY# is the same as that of
READY#, but note in the A.C. timing specifications that a
cache read miss requires a longer BREADY# setup than do
other cycles. This must be accounted for in ready logic
design.


5.2.1.6 Next Address (BNA#)


BNA# is identical in function and timing to its 
386 SX counterpart. Note that in Figures 5-3A 
through 5-3L, BNA# is assumed asserted in every 
BT1P or first BT2 state. Along with the 82385SX’s 
pipelining of the 386 SX, this ensures that the timing 
diagrams accurately reflect the full pipelined nature 
of the dual bus structure. 


5.2.1.7 Bus Lock (BLOCK#)


The 386 SX flags a locked sequence of cycles by asserting
LOCK#. During a locked sequence, the 386 SX does not
acknowledge hold requests, so the sequence executes without
interruption by another master. The 82385SX forces all
locked 386 SX cycles to run on the 82385SX bus (unless LBA#
is active), regardless of whether or not the referenced
location resides in the cache. In addition, a locked
sequence of 386 SX cycles is run as a locked sequence on
the 82385SX bus; BLOCK# is asserted and the 82385SX does
not allow the sequence to be interrupted. Locked writes
(hit or miss) and locked read misses affect the cache and
cache directory just as their unlocked counterparts do. A
locked read hit, however, is handled differently. The read
is necessarily forced to run on the 82385SX local bus, and
as the data returns from main memory, it is “re-copied”
into the cache. (See Figure 5-3L.) The directory is not
changed as it already indicates that this location exists
in the cache. This activity is invisible to the system and
ensures that semaphores are properly handled.


BLOCK# is asserted during locked 82385SX bus cycles just
as LOCK# is asserted during locked 386 SX cycles. The
BLOCK# maximum valid delay, however, differs from that of
LOCK#, and this must be accounted for in any circuitry that
makes use of BLOCK#. The difference is due to the fact that
LOCK#, unlike the other 386 SX cycle definition signals, is
not pipelined. The situation is clarified in Figure 5-3K.
In cycle 2 the state of LOCK# is not known before the
corresponding system read starts (Frames 4 and 5). In this
case, LOCK# is asserted at the beginning of T1P, and the
delay for BLOCK# to become active is the delay of LOCK#
from the 386 SX plus the propagation delay through the
82385SX. This occurs because T1P and the corresponding BT1P
are concurrent (Frame 5). The result is that BLOCK# should
not be sampled at the end of BT1P. The first appropriate
sampling point is midway through the next state, as shown
in Frame 6. In Figure 5-3L, the maximum delay for BLOCK# to
become valid in Frame 4 is the same as the maximum delay
for LOCK# to become valid from the 386 SX. This is true
since the pipelining issue discussed above does not occur.


The 82385 should negate BLOCK# after BREADY# of the last
82385 Locked Cycle has been asserted AND LOCK# has turned
inactive.


This means that in a sequence of cycles which begins with
an 82385 Locked Cycle and continues with any of the
possible Locked Cycles (other 82385 cycles, idles, and
local cycles), while LOCK# is continuously active, the
82385 will maintain BLOCK# active continuously. Another
implication is that in a Locked Posted Write Cycle followed
by a non-locked sequence, BLOCK# is negated one CLK after
BREADY# of the write cycle. In other 82385 Locked Cycles
followed by non-locked sequences, BLOCK# is negated one CLK
after LOCK# is negated, which occurs two CLKs after BREADY#
is asserted. In the latter case, BLOCK# remains active for
one CLK into the non-locked sequence.


The arbitration rules of Locked Cycles are: 


MASTER MODE: 


The BHOLD input signal is ignored when BLOCK# or internal
lock (16-bit non-aligned cycle) is active. The BHLDA output
signal remains inactive, and the BAOE# output signal
remains active, during that time interval.


SLAVE MODE: 


The 82385 does not relinquish the system bus if BLOCK# or
internal lock is active. The BHOLD output signal remains
active while BLOCK# or internal lock is active, plus one
CLK. The BHLDA input signal is ignored while BLOCK# or the
internal lock is active, plus one CLK. This means the 82385
slave does not respond to BHLDA inactivation. The BAOE#
output signal remains active during the same time interval.


5.2.2 ADDITIONAL 82385SX BUS SIGNALS 


The 82385SX bus provides two status outputs and 
one control input that are unique to cache operation 
and thus have no 386 SX counterparts. The outputs 
are MISS# and WBS, and the input is FLUSH. 


5.2.2.1 Cache Read/Write Miss Indication
(MISS#)


MISS# can be thought of as an extra 82385SX bus cycle
definition signal similar to BM/IO#, BW/R#, and BD/C#, that
distinguishes cacheable read and write misses from other
cycles. MISS#, like the other definition signals, becomes
valid with BADS# (BT1 or first BT2P). The behavior of MISS#
is illustrated in Figures 5-3B, 5-3C, and 5-3J. The 82385SX
floats MISS# when another master owns the bus, allowing
multiple 82385SXs to share the same node in multi-cache
systems. MISS# should thus be lightly pulled up (~20K) to
keep it negated during hold (BTH) states.


MISS# can serve several purposes. As discussed previously, the
BD0-BD15 and BREADY# setup times in a cache read miss are longer than
in other cycles. A bus controller can distinguish these cycles by
gating MISS# with BW/R#. MISS# may also prove useful in gathering
82385SX system performance data.
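

As a rough illustration of the gating described above, the following Python sketch decodes a cache read miss from the two pin levels. It assumes MISS# is active low and that BW/R# is low for reads, as the signal names imply; the function name is hypothetical.

```python
def is_cache_read_miss(miss_n: int, bw_r: int) -> bool:
    """Decode a cacheable read miss from sampled pin levels (0 or 1).

    miss_n: level on MISS# (0 = miss asserted, active low)
    bw_r:   level on BW/R# (1 = write, 0 = read)
    """
    return miss_n == 0 and bw_r == 0


# Only the MISS#-asserted read cycle needs the longer setup times.
for miss_n in (0, 1):
    for bw_r in (0, 1):
        print(miss_n, bw_r, is_cache_read_miss(miss_n, bw_r))
```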


5.2.2.2 WRITE BUFFER STATUS (WBS)


WBS is activated when 386 SX write cycle data is latched into the
74646 latching transceiver (via LDSTB). It is deactivated upon
completion of the write cycle on the 82385SX bus when the 82385SX
sees the BREADY# signal. WBS behavior is illustrated in Figures 5-3F
through 5-3J, and potential applications are discussed in Chapter 3.


5.2.2.3 Cache Flush (FLUSH) 


FLUSH is an 82385SX input which is used to reset all tag valid bits
within the cache directory. The FLUSH input must be kept active for
at least 4 CLK (8 CLK2) periods to complete the directory flush.
FLUSH is generally used in diagnostics but can also be used in
applications where snooping cannot guarantee coherency.
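

To put the 4 CLK (8 CLK2) minimum into absolute time, the short Python sketch below converts it to nanoseconds for the two speed grades. It assumes CLK runs at the rated frequency and CLK2 at twice that rate; the function name is hypothetical.

```python
def min_flush_width_ns(clk_mhz: float, clk_periods: int = 4) -> float:
    """Minimum FLUSH active time in nanoseconds for a given CLK frequency."""
    clk_period_ns = 1000.0 / clk_mhz  # one CLK period; equals two CLK2 periods
    return clk_periods * clk_period_ns


print(min_flush_width_ns(16))  # 16 MHz part
print(min_flush_width_ns(20))  # 20 MHz part
```

Under these assumptions the minimum FLUSH pulse is 250 ns at 16 MHz and 200 ns at 20 MHz.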


5.3 Bus Watching (Snoop) Interface


The 82385SX's bus watching interface consists of the snoop address
(SA1-SA23), snoop strobe (SSTB#), and snoop enable (SEN) inputs. If
masters reside at the system bus level, then the SA1-SA23 inputs are
connected to the system address lines and SEN to the system bus
memory write command. SSTB# indicates that a valid address is present
on the system bus. Note that the snoop bus inputs are synchronous, so
care must be taken to ensure that they are stable during their sample
windows. If no master resides beyond the 82385SX bus level, then the
82385SX inputs SA1-SA23, SEN, and SSTB# can respectively tie directly
to BA1-BA23, BW/R#, and BADS# of the other system bus master (see
Figure 5.5). However, it is recommended that SEN be driven by the
logical "AND" of BW/R# and BM/IO# so as to prevent I/O writes from
unnecessarily invalidating cache data.
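

The recommended SEN gating can be written out as a truth table. The Python sketch below assumes the usual polarity of the cycle definition signals (BW/R# high for a write, BM/IO# high for a memory cycle); the function name is hypothetical.

```python
def snoop_enable(bw_r: int, bm_io: int) -> int:
    """SEN as the logical AND of BW/R# and BM/IO# (levels are 0 or 1)."""
    return bw_r & bm_io


# SEN is high only for memory writes; I/O writes no longer trigger snoops.
print("BW/R#  BM/IO#  SEN")
for bw_r in (0, 1):
    for bm_io in (0, 1):
        print(f"{bw_r:5d}  {bm_io:6d}  {snoop_enable(bw_r, bm_io):3d}")
```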


*2. SSTB# on the 82385SX is tied directly to BADS# of the System Bus master.
*3. SEN on the 82385SX is tied directly to BW/R# of the System Bus master.


Figure 5.4. Interleaved Snoop and 386 SX Accesses to the Cache Directory


Figure 5.5. Snooping Connections in a Multi-Master Environment


When the 82385SX detects a system write by another master and the
conditions in Figure 5.4 are met (CLK2 PHI1 rising, which is CLK
falling; BHLDA asserted; SEN asserted; SSTB# asserted), it internally
latches SA1-SA23 and runs a cache look-up to see if the altered main
memory location is duplicated in the cache. If yes (a snoop hit), the
line valid bit associated with that cache entry is cleared. An
important feature of the 82385SX is that even if the 386 SX is
running zero wait state hits out of the cache, all snoops are
serviced. This is accomplished by time multiplexing the cache
directory between the 386 SX address and the latched system address.
If the SSTB# signal occurs during an 82385SX comparison cycle (for
the 386 SX), the 386 SX cycle has the highest priority in accessing
the cache directory. This takes the first of the two 386 SX states.
The other state is then used for the snoop comparison. This worst
case example, depicted in Figure 5.4, shows the 386 SX running zero
wait state hits on the 386 SX local bus, and another master running
zero wait state writes on the 82385SX bus. No snoops are missed, and
no performance penalty is incurred.


5.4 Reset Definition 


Table 5-1 summarizes the states of all 82385SX outputs during reset
and initialization. A slave mode 82385SX tri-states its "386 SX-like"
front end. A master mode 82385SX emits a pulse stream on its BACP
output. As the 386 SX address and cycle definition lines reach their
reset values, this stream will latch the reset values through to the
82385SX bus.

Table 5-1. Pin State during RESET and Initialization

                     Signal Level during RESET and Initialization
Output Name          Master Mode      Slave Mode

NA#                  High
READYO#              High
BRDYEN#              High             High
CALEN                High
CWEA#-CWEB#          High
CS0#, CS1#           Low
CT/R#                High
COEA#-COEB#          High
BADS#                High
BBHE#, BBLE#         386 BE#
BLOCK#               High
MISS#                High
BACP                 Pulse (1)        Pulse
BAOE#                Low
BT/R#                Low
DOE#
LDSTB
BHOLD
BHLDA
WBS

NOTE:
1. In Master Mode, BAOE# is low and BACP emits a pulse stream during
reset. As the 386 SX address and cycle definition signals reach their
reset values, the pulse stream on BACP will latch these values
through to the 82385SX local bus.


6.0 82385SX SYSTEM DESIGN CONSIDERATIONS


6.1 Introduction 


This chapter discusses techniques which should be 
implemented in an 82385SX system. Because of the 
high frequencies and high performance nature of the 
386 SX CPU/82385SX system, good design and lay- 
out techniques are necessary. It is always recom- 
mended to perform a complete design analysis of 
new system designs. 


6.2 Power and Grounding 


6.2.1 POWER CONNECTIONS 


The 82385SX utilizes 8 power (Vcc) and 10 ground (Vss) pins. All Vcc
and Vss pins must be connected to their appropriate plane. On a
printed circuit board, all Vcc pins must be connected to the power
plane and all Vss pins must be connected to the ground plane.


6.2.2 POWER DECOUPLING 


Although the 82385SX itself is generally a "passive" device in that
it has few output signals, the cache subsystem as a whole is quite
active. Therefore, many decoupling capacitors should be placed around
the 82385SX cache subsystem.


Low inductance capacitors and interconnects are recommended for best
high frequency electrical performance. Inductance can be reduced by
shortening circuit board traces between the decoupling capacitors and
their respective devices as much as possible. Capacitors specifically
for PGA packages are also commercially available, for the lowest
possible inductance.


6.2.3 RESISTOR RECOMMENDATIONS 


Because of the dual structure of the 82385SX sub- 
system (386 SX Local Bus and 82385SX Local Bus), 
any signals which are recommended to be pulled up 
will be respective to one of the busses. The follow- 
ing sections will discuss signals for both busses. 


6.2.3.1 386 SX LOCAL BUS 


For typical designs, the pullup resistors shown in Ta- 
ble 6-1 are recommended. This table correlates to 
Chapter 7 of the 386 SX Data Sheet. However, par- 
ticular designs may have a need to differ from the 
listed values. Design analysis is recommended to 
determine specific requirements. 


6.2.3.2 82385SX Local Bus


Pullup resistor recommendations for the 82385SX 
Local Bus signals are shown in Table 6-2. Design 
analysis is necessary to determine if deviations to 
the typical values given are needed. 
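

As a starting point for that design analysis, the Python sketch below estimates the current a driver must sink when it pulls one of these lines low through its pullup. It uses the 20 KΩ ±10% value from Tables 6-1 and 6-2 and the 5V ±5% supply range given in the electrical data; the function name is hypothetical, and the driver's low-level output voltage is ignored (assumed to be 0V).

```python
def pullup_current_ua(vcc_volts: float, r_ohms: float) -> float:
    """Current through a pullup, in microamps, when the line is held at 0V."""
    return vcc_volts * 1e6 / r_ohms


nominal = pullup_current_ua(5.0, 20_000)            # nominal Vcc, nominal resistor
worst = pullup_current_ua(5.0 * 1.05, 20_000 * 0.9)  # high Vcc, low resistor

print(f"nominal: {nominal:.1f} uA, worst case: {worst:.1f} uA")
```

Under these assumptions each pullup adds roughly 250 µA of load (about 292 µA worst case) to whichever device asserts the signal.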


Table 6-1. Recommended Resistor Pullups to Vcc (386 SX Local Bus)

Pin and Signal       Pullup Value     Purpose

ADS#                 20 KΩ ±10%       Lightly Pull ADS# Negated for
  PGA E13                             386 SX Hold States
  PQFP 123

LOCK#                20 KΩ ±10%       Lightly Pull LOCK# Negated for
  PGA F13                             386 SX Hold States
  PQFP 118


Table 6-2. Recommended Resistor Pullups to Vcc (82385SX Local Bus)

Pin and Signal       Pullup Value     Purpose

BADS#                20 KΩ ±10%       Lightly Pull BADS# Negated for
  PGA N9                              82385SX Hold States
  PQFP 89

BLOCK#               20 KΩ ±10%       Lightly Pull BLOCK# Negated for
  PGA P9                              82385SX Hold States
  PQFP 86

MISS#                20 KΩ ±10%       Lightly Pull MISS# Negated for
  PGA N8                              82385SX Hold States
  PQFP 85


6.3 82385SX Signal Connections


6.3.1 CONFIGURATION INPUTS 


The 82385SX configuration signals (M/S#, 2W/D#, DEFOE#) must be
connected (pulled up) to the appropriate logic level for the system
design. There is also a reserved 82385SX input which must be tied to
the appropriate level. Refer to Table 6-3 for the signals and their
required logic level.


Table 6-3. 82385SX Configuration Inputs Logic Levels

Pin and Signal       Logic Level      Purpose

M/S#
  PGA B13
  PQFP 124

2W/D#
  PGA D12
  PQFP 127

Reserved                              Must be tied to Vcc via a pull-up
  PGA L14                             for proper functionality
  PQFP 102

DEFOE#                                Define Cache Output Enable.
  PGA A14                             Allows use of any SRAM.
  PQFP 128

NOTE:
The listed 82385SX pins which need to be tied high should use a
pull-up resistor in the range of 5 KΩ to 20 KΩ.


6.3.2 CLK2 and RESET 


The 82385SX has two inputs to which the 386 SX CLK2 signal must be
connected. One is labeled CLK2 (82385SX pin C13) and the other is
labeled BCLK2 (82385SX pin L13). These two inputs must be tied
together on the printed circuit board.


The 82385SX also has two reset inputs. RESET (82385SX pin D13) and
BRESET (82385SX pin K12) must be connected together on the printed
circuit board.


Table 6-4. SRAM Specs for Non-Buffered Cache Memory

                                       SRAM Spec Requirements
                                       Direct Mapped      2-Way Set Associative
                                       16 MHz   20 MHz    16 MHz   20 MHz

Read Cycle Requirements
  Address Access (MAX)
  Chip Select Access (MAX)
  OE# to Data Valid (MAX)
  OE# to Data Float (MAX)

Write Cycle Requirements
  Chip Select to End of Write (MIN)
  Address Valid to End of Write (MIN)
  Write Pulse Width (MIN)
  Data Setup (MAX)
  Data Hold (MIN)


6.4 Unused Pin Requirements


For reliable operation, ALWAYS connect unused in- 
puts to a valid logic level. As is the case with most 
other CMOS processes, a floating input will increase 
the current consumption of the component and give 
an indeterminate state to the component. 


6.5 Cache SRAM Requirements 


The 82385SX offers the option of using SRAMs with or without an
output enable pin. This is possible by inserting a transceiver
between the SRAMs and the 386 SX local data bus and strapping DEFOE#
to the appropriate logic level for a given system configuration. This
transceiver may also be desirable in a system which has a very
heavily loaded 386 SX local data bus. The following sections discuss
the SRAM requirements for all cache configurations.


6.5.1 CACHE MEMORY WITHOUT TRANSCEIVERS


As discussed in Section 3.2, the 82385SX presents all of the control
signals necessary to access the cache memory. The SRAM chip selects,
write enables, and output enables are driven directly by the 82385SX.
Table 6-4 lists the required SRAM specifications. These
specifications allow for zero margins. They should be used as guides
for the actual system design.


6.5.2 CACHE MEMORY WITH TRANSCEIVERS


To implement an 82385SX subsystem using cache 
memory transceivers, COEA# or COEB# must be 
used as output enable signals for the transceivers 
and DEFOE# must be appropriately strapped for 
proper COEx# functionality (since the cache SRAM 
transceivers must be enabled for writes as well as 
reads). DEFOE# must be tied high when using 
cache SRAM transceivers. In a 2-way set associa- 
tive organization, COEA# enables the transceiver 
for bank A and COEB# enables the bank B trans- 
ceiver. A direct mapped cache may use either 
COEA# or COEB# to enable the transceiver. Table 
6-5 lists the required SRAM specifications. These 
specifications allow for zero margin. They should be 
used as guides for the actual system design. 


7.0 SYSTEM TEST CONSIDERATIONS 


7.1 Introduction


Power On Self Testing (POST) is performed by most 
systems after a reset. This chapter discusses the 
requirements for properly testing an 82385SX based 
system after power up. 


7.2 Main Memory (DRAM) Testing 


Most systems perform a memory test by writing a data pattern and then
reading and comparing the data. This test may also be used to
determine the total available memory within the system. Without
properly taking into account the 82385SX cache memory, the memory
test can give erroneous results. This will occur if the cache
responds with read hits during the memory test routine.


7.2.1 MEMORY TESTING ROUTINE 


In order to properly test main memory, the test routine must not read
from the same block consecutively. For instance, if the test routine
writes a data pattern to the first 16 Kbytes of memory (0000-3FFFH),
reads from the same block, writes a new pattern to the same locations
(0000-3FFFH), and reads the new pattern, the second pattern tested
would have had data returned from the 82385SX cache memory.
Therefore, it is recommended that the test routine work with a memory
block of at least 32 Kbytes. This will guarantee that no 16 Kbyte
block will be read twice consecutively.
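

The effect can be demonstrated with a toy model. The Python sketch below is not a model of the real 82385SX directory: it assumes a direct mapped, write-through cache of 16 Kbytes with one tag per 16-bit word, where read misses fill the cache, write hits update it, and write misses do not allocate. All class and function names are hypothetical.

```python
CACHE_WORDS = 8 * 1024  # 16 Kbytes expressed as 16-bit words


class ToySystem:
    """Main memory with one faulty word, fronted by a toy direct mapped cache."""

    def __init__(self, mem_words, stuck_word):
        self.mem = [0] * mem_words
        self.tags = [None] * CACHE_WORDS
        self.data = [0] * CACHE_WORDS
        self.stuck_word = stuck_word  # address whose bit 0 is stuck at 0

    def write(self, addr, value):
        index, tag = addr % CACHE_WORDS, addr // CACHE_WORDS
        if self.tags[index] == tag:   # write hit: cache copy is updated
            self.data[index] = value
        if addr == self.stuck_word:   # the DRAM fault corrupts the stored value
            value &= 0xFFFE
        self.mem[addr] = value        # write-through to main memory

    def read(self, addr):
        index, tag = addr % CACHE_WORDS, addr // CACHE_WORDS
        if self.tags[index] == tag:   # read hit: main memory is never seen
            return self.data[index]
        value = self.mem[addr]        # read miss: fetch and fill the cache
        self.tags[index], self.data[index] = tag, value
        return value


def memory_test(system, block_words, patterns=(0x0000, 0xFFFF)):
    """Write each pattern over the block, then read it back and compare."""
    for pattern in patterns:
        for addr in range(block_words):
            system.write(addr, pattern)
        for addr in range(block_words):
            if system.read(addr) != pattern:
                return False
    return True


# 16 Kbyte block: the second pattern is served by read hits, fault is masked.
print(memory_test(ToySystem(8 * 1024, stuck_word=100), 8 * 1024))
# 32 Kbyte block: every read misses, so the faulty word is detected.
print(memory_test(ToySystem(16 * 1024, stuck_word=100), 16 * 1024))
```

In this model the 16 Kbyte test reports a pass even though a memory cell is faulty, while the 32 Kbyte test fails as it should.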


7.3 82385SX Cache Memory Testing 


With the addition of SRAMs for the cache memory, it may be desirable
for the system to be able to test the cache SRAMs during system
diagnostics. This requires the test routine to access only the cache
memory. The requirements for this routine are based on where it
resides within the memory map. This can be broken into two areas: the
routine residing in cacheable memory space or the routine residing in
either non-cacheable memory or on the 386 SX local bus (using the
LBA# input).


Table 6-5. SRAM Specs for Buffered Cache Memory

                                       SRAM Spec Requirements
                                       Direct Mapped      2-Way Set Associative
                                       16 MHz   20 MHz    16 MHz   20 MHz

Read Cycle Requirements
  Address Access (MAX)
  Chip Select Access (MAX)
  OE# to Data Valid (MAX)
  OE# to Data Float (MAX)

Write Cycle Requirements
  Chip Select to End of Write (MIN)
  Address Valid to End of Write (MIN)
  Write Pulse Width (MIN)
  Data Setup (MAX)
  Data Hold (MIN)


7.3.1 TEST ROUTINE IN THE NCA# OR LBA# MEMORY MAP


In this configuration, the test routine will never be cached. The
recommended method is code which will access a single 16 Kbyte block
during the test. Initially, a 16 Kbyte read (assume 0000-3FFFH) must
be executed. This will fill the cache directory with the address
information which will be used in the diagnostic procedure. Then, a
16 Kbyte write to the same address locations (0000-3FFFH) will load
the cache with the desired test pattern (due to write hits). The
comparison can be made by completing another 16 Kbyte read (same
locations, 0000-3FFFH), which will be cache read hits. Subsequent
writes and reads to the same addresses will enable various patterns
to be tested.
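

The read, write, read and compare sequence can be sketched as follows. This is an outline only: `read_word` and `write_word` are hypothetical callbacks standing in for 16-bit accesses to the cached block, the test patterns are arbitrary, and a Python list plays the part of the hardware so the example can run on its own.

```python
CACHE_WORDS = 8 * 1024  # 16 Kbytes (0000-3FFFH) as 16-bit words


def cache_sram_test(read_word, write_word, patterns=(0x5555, 0xAAAA)):
    """Return the first failing word index, or None if every pattern passes."""
    # Step 1: read the whole block once so the directory holds its addresses.
    for index in range(CACHE_WORDS):
        read_word(index)
    for pattern in patterns:
        # Step 2: write hits load the pattern into the cache SRAM.
        for index in range(CACHE_WORDS):
            write_word(index, pattern)
        # Step 3: read hits return the pattern from the cache SRAM.
        for index in range(CACHE_WORDS):
            if read_word(index) != pattern:
                return index
    return None


# Stand-in for the hardware: one cell whose bit 3 can never be set.
cells = [0] * CACHE_WORDS
BAD_CELL = 5


def fake_write(index, value):
    cells[index] = value & ~0x0008 if index == BAD_CELL else value


def fake_read(index):
    return cells[index]


print(cache_sram_test(fake_read, fake_write))  # prints 5
```

The first pattern leaves bit 3 clear and passes; the second sets it and exposes the faulty cell, which is why more than one pattern is worth running.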


7.3.2 TEST ROUTINE IN CACHEABLE MEMORY 


In this case, it must be understood that the diagnostic routine must
reside in the cache memory before the actual data testing can begin.
Otherwise, when the 386 SX performs a code fetch, a location within
the cache memory which is to be tested will be altered due to the
read miss (code fetch) update.


The first task is to load the diagnostic routine into the top of the
cache memory. It must be known how much memory is required for the
code, as the rest of the cache memory will be tested as in the
earlier method. Once the diagnostics have been cached (via read
updates), the code will perform the same type of
read/write/read/compare as in the routine explained in the above
section. The difference is that now the amount of cache memory to be
tested is 16 Kbytes minus the length of the test routine.


7.4 82385SX Cache Directory Testing 


Since the 82385SX does not directly access the data bus, it is not
possible to easily complete a comparison of the cache directory. (The
82385SX can serially transmit its directory contents. See Section
7.5.) However, the cache memory tests described in Section 7.3 will
indicate if the directory is working properly. Otherwise, the data
comparison within the diagnostics will show locations which fail.


There is a slight possibility that the cache memory comparison could
pass even if locations within the directory gave false hit/miss
results. This could cause the comparison to always be performed to
main memory instead of the cache and give a proper comparison to the
386 SX. The solution here is to use the MISS# output of the 82385SX
as an indicator to a diagnostic port which can be read by the 386 SX.
It could also be used to flag an interrupt if a failure occurs.


The implementation of these techniques in the diagnostics will assure
proper functionality of the 82385SX subsystem.


7.5 Special Function Pins


As mentioned in Chapter 3, there are three 82385SX pins which have
reserved functions in addition to their normal operational functions.
These pins are MISS#, WBS, and FLUSH.


As discussed previously, the 82385SX performs a directory flush when
the FLUSH input is held active for at least 4 CLK (8 CLK2) cycles.
However, the FLUSH pin also serves as a diagnostic input to the
82385SX. The 82385SX will enter a reserved mode if the FLUSH pin is
active at the falling edge of RESET.


If, during normal operation, the FLUSH input is active for only one
CLK (2 CLK2) cycle, the 82385SX will enter another reserved mode.
Therefore it must be guaranteed that FLUSH is active for at least the
4 CLK (8 CLK2) cycle specification.


WBS and MISS# serve as outputs in the 82385SX 
reserved modes. 


8.0 MECHANICAL DATA 


8.1 Introduction 


This chapter discusses the physical package and its 
connections in detail. 


8.2 Pin Assignment


The 82385SX PGA pinout as viewed from the top side of the component
is shown by Figure 8-1. Its pinout as viewed from the pin side of the
component is shown in Figure 8-2.


The 82385SX Plastic Quad Flat Pack (PQFP) pinout from the top side of
the component is shown by Figure 8-3.


Vcc and Vss connections must be made to multiple Vcc and Vss (GND)
pins. Each Vcc and Vss must be connected to the appropriate voltage
level. The circuit board should include Vcc and GND planes for power
distribution and all Vcc and Vss pins must be connected to the
appropriate plane.


Figure 8-1. 82385SX PGA Pinout: View from TOP Side


Figure 8-2. 82385SX PGA Pinout: View from PIN Side


Figure 8-3. 82385SX PQFP Pinout: View from TOP Side


Table 8-1. 82385SX Pinout: Functional Grouping


114   NCA#
113   LBA#
122   READYI#
66    READYO#
70    CT/R#
83    BRDYEN#
105   BREADY#
91    BACP
90    BAOE#

          SA17
          SA16
          SA15
          SA14
          SA13
          SA12
          SA11
          SA10
          SA9
          SA8
          SA7
          SA6
BLE#      SA5
BHE#      SA4
          SA3
CLK2      SA2
RESET     SA1
BRESET    SEN#
BCLK2     SSTB#

Vcc (*)
Vcc (*)
Vcc (*)
Vcc (*)
Vcc (*)
Vcc (*)
Vcc (*)


8.3 Package Dimensions and Mounting


The 82385SX PGA package is a 132-pin ceramic Pin Grid Array. The pins
are arranged 0.100 inch (2.54 mm) center-to-center, in a 14 x 14
matrix, three rows around.


A wide variety of available PGA sockets allow low insertion force or
zero insertion force mounting. These come in a choice of terminals
such as soldertail, surface mount, or wire wrap.


The 82385SX PQFP is a 132-lead Plastic Quad Flat Pack. The pins are
"fine pitch", 0.025 inches (0.635 mm) center to center.


The PQFP device is intended to be surface mounted directly to the
printed board, although sockets are available for this device.


Figure 8-3.1. 132-Pin PGA Package Dimensions


Figure 8-3.2. Principal Dimensions and Datums


Figure 8-3.3. Molded Details


Figure 8-3.4. Terminal Details


Figure 8-3.5. Typical Lead


Figure 8-3.6. Detail M


PLASTIC QUAD FLAT PACK


Table 8-3.1. Symbol List for Plastic Quad Flat Pack

Letter or Symbol     Description of Dimensions

A                    Package height: distance from seating plane to
                     highest point of body
A1                   Standoff: distance from seating plane to base plane
D/E                  Overall package dimension: lead tip to lead tip
D1/E1                Plastic body dimension
D3/E3                Footprint
N                    Total number of leads

NOTES:
1. All dimensions and tolerances conform to ANSI Y14.5M-1982.
2. Datum plane -H- located at the mold parting line and coincident
   with the bottom of the lead where lead exits plastic body.
3. Datums A-B and -D- to be determined where center leads exit
   plastic body at datum plane -H-.
4. Controlling Dimension, Inch.
5. Dimensions D1, D2, E1 and E2 are measured at the mold parting line
   and do not include mold protrusion. Allowable mold protrusion of
   0.18 mm (0.007 in) per side.
6. Pin 1 identifier is located within one of the two zones indicated.
7. Measured at datum plane -H-.
8. Measured at seating plane datum -C-.


Table 8-3.2. PQFP Dimensions and Tolerances

Intel Case Outline Drawings, Plastic Quad Flat Pack, 0.025 Inch Pitch

Symbol    Description            Min      Max
N         Leadcount              132
A         Package Height         0.160    0.170
A1        Standoff               0.020    0.030
D, E      Terminal Dimension     1.075    1.085
D1, E1    Package Body           0.947    0.953
D2, E2    Bumper Distance        1.097    1.103
D3, E3    Lead Dimension         0.800 REF
          Foot Length                     0.030
Issue: IWS Preliminary 1/15/87

Intel Case Outline Drawings, Plastic Quad Flat Pack, 0.64 mm Pitch

Symbol    Description            Min      Max
N         Leadcount              132
A         Package Height         4.06     4.32
A1        Standoff               0.51     0.76
D, E      Terminal Dimension     27.31    27.56
D1, E1    Package Body           24.05    24.21
D2, E2    Bumper Distance        27.86    28.02
D3, E3    Lead Dimension         20.32 REF
          Foot Length            0.51     0.76
Issue: IWS Preliminary 1/15/87


Figure 8-3.7. Measuring 82385SX PGA Case Temperature
(132-pin PGA: measure PGA case temperature at center of top surface)


Table 8-3.3. 82385SX PGA Package Typical Thermal Characteristics

Thermal Resistance (°C/Watt)
Airflow, ft/min (m/sec)

Parameter                            50      100     200     400     600     800
                                     (0.25)  (0.50)  (1.01)  (2.03)  (3.04)  (4.06)

θ Junction-to-Case
(Case Measured as Figure 8-3.7)

θ Case-to-Ambient                    18  17
(No Heatsink)

θ Case-to-Ambient
(with Omnidirectional Heatsink)

θ Case-to-Ambient
(with Unidirectional Heatsink)

NOTES:
1. Table 8-3.4 applies to 82385SX PGA plugged into socket or soldered directly onto board.
2. θJA = θJC + θCA.
3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)


Thermal Resistance (°C/Watt)
Airflow, LFM (m/sec)

Parameter                            400     600     800
                                     (2.03)  (3.04)  (4.06)

θ Junction-to-Case                   5  5  5  5
(Case Measured as Figure 8-3.7)
                                     0.5  11.5
θ Case-to-Ambient                    TO BE DEFINED
(with Omnidirectional Heatsink)

θ Case-to-Ambient
(with Unidirectional Heatsink)

NOTES:
1. Table 8-3.3 applies to 82385SX PQFP plugged into socket or soldered directly onto board.
2. θJA = θJC + θCA.
3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)
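

Note 2 above (junction-to-ambient resistance is the sum of the junction-to-case and case-to-ambient resistances) can be turned into a quick estimate of junction temperature. The Python sketch below uses the standard steady-state relation, temperature rise = power x thermal resistance. The numeric inputs are illustrative placeholders only, not data sheet values, and the function names are hypothetical.

```python
def theta_ja(theta_jc: float, theta_ca: float) -> float:
    """Junction-to-ambient thermal resistance in deg C per watt."""
    return theta_jc + theta_ca


def junction_temp_c(ambient_c: float, power_w: float, theta_ja_c_per_w: float) -> float:
    """Steady-state junction temperature for a given ambient and dissipation."""
    return ambient_c + power_w * theta_ja_c_per_w


# Placeholder inputs: 2 C/W junction-to-case, 18 C/W case-to-ambient,
# 1.5 W dissipated, 25 C ambient.
total = theta_ja(2.0, 18.0)
print(total, junction_temp_c(25.0, 1.5, total))
```

The same relation with the junction-to-case term alone gives the junction temperature from a measured case temperature.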


8.4 Package Thermal Specification


The case temperature should be measured at the center of the top
surface as in Figure 8-3.7 for PGA or Table 8-3.3 for PQFP. The case
temperature may be measured in any environment to determine whether
or not the 82385SX is within the specified operating range.


9.0 ELECTRICAL DATA


9.1 Introduction


This chapter presents the A.C. and D.C. specifications for the
82385SX.


9.2 Maximum Ratings


Storage Temperature ................... -65°C to +150°C
Case Temperature under Bias ........... -65°C to +110°C
Supply Voltage with Respect to Vss .... -0.5V to +6.5V
Voltage on Any Other Pin .............. -0.5V to Vcc + 0.5V


NOTE:
Stress above those listed may cause permanent damage to the device.
This is a stress rating only and functional operation at these or any
other conditions above those listed in the operational sections of
this specification is not implied.


Exposure to absolute maximum rating conditions for extended periods
may affect device reliability. Although the 82385SX contains
protective circuitry to resist damage from static electric
discharges, always take precautions against high static voltages or
electric fields.


9.3 D.C. Specifications

Tcase = 0°C to +85°C; Vcc = 5V ±5%; Vss = 0V

Table 9-1. D.C. Specifications (16 MHz and 20 MHz)


NOTES:
1. Minimum value is not 100% tested.
2. Icc is specified with inputs driven to CMOS levels. Icc may be higher if driven to TTL levels.
3. Sampled only.


9.4 A.C. Specifications


The A.C. specifications given in the following tables consist of
output delays and input setup requirements. The A.C. diagram's
purpose is to illustrate the clock edges from which the timing
parameters are measured. The reader should not infer any other timing
relationships from them. For specific information on timing
relationships between signals, refer to the appropriate functional
section.


A.C. spec measurement is defined in Figure 9-1. Inputs must be driven
to the levels shown when A.C. specifications are measured. 82385SX
output delays are specified with minimum and maximum limits, which
are measured as shown. 82385SX input setup and hold times are
specified as minimums and define the smallest acceptable sampling
window. Within the sampling window, a synchronous input signal must
be stable for correct 82385SX operation.


9.4.1 FREQUENCY DEPENDENT SIGNALS


The 82385SX has signals whose output valid delays are dependent on
the clock frequency. These signals are marked in the A.C.
Specification Tables with a Note 1.


Figure 9-1. Drive Levels and Measurement Points for A.C. Specification

LEGEND:
A: Maximum output delay specification
B: Minimum output delay specification
C: Minimum input setup specification
D: Minimum input hold specification

NOTES:
1. Under rated loading 82385SX output (tr and tf) is typically < 4.0 ns from 0.8V to 2.0V.
2. Input waveforms have tr < 2.0 ns from 0.8V to 2.0V.


A.C. SPECIFICATION TABLES
Functional operating range: Vcc = 5V ±5%; Tcase = 0°C to +85°C


Table 4.1. A.C. Specifications at 16 MHz


CLK2, BCLK2 Period 
Coe Ty eae 
cue. gaukatign mee ft 

| 10 
atl 
Lae 
bic 


16 MH 


N 


CLK2, BCLK2 Rise Time 
121a2 
itd 
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3,9 
3,9 


wal, 


1 (25 pF Load) 
1 (25 pF Load) 
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30 
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30 


A.C. SPECIFICATION TABLES (Continued)
Functional operating range: Vcc = 5V ±5%; Tcase = 0°C to +85°C


Table 4.1. A.C. Specifications at 16 MHz (Continued)


[symbor | —Parameter | Min | Max 
[22 | CwexePusewah | a 
2aat | 081, C820 Rising PHI (CRM) | 6 | 41 
ese | _CS1#, 0824 Rising PHE(CWTH) | 6 | 41 
eae | _CS1¥, C82 Faling PHI (CwTH) | 6 | aT 
eae | _CS1#, C826 Faling, PHI2(CROM) | 6 | a 
Te4at | CT/Re Rising, PH2(CRDH) |e 
Tieaae | CT/Re Faling, PHI (CROH)———S«dY=C a 
[eas | CT/Re Faling, PHIZ(CROH) Sid | a 
[125c1 | COEKe Fising Delay @ Tease = 0 | 4 | 20 
[2502 | COEKe Rising Delay @ Toase = Twax | 4 | 20 
29 | COEx# FalingtoCSxe Rising | 0 | 
aa ee 
CWEx# Rising to COEx# Rising | | 


t28a 
t28b 
31 


CWEx# Rising to CSO#, CS1# Falling 


SA(1-23) Setup Time ae 
32 SA(1-23) Hold Time 
33 BADS # Valid Delay 
134 BADS # Float Delay 
135 BNA # Setup Time ee a 
136 BNA # Hold Time 15 fo 
7 BREADY # Setup Time ia. a 
138 BREADY # Hold Time 2 ae 
40a BACP Rising Delay / 0 | 26 
‘40b —'|_- BACP Fallling Delay Foo | 28 
41 BAOE # Valid Delay 


bee 

ae 

CWEx# Rising to CALEN Rising zs ae 
oe ae 

oe 


~s 
Go 


A.C. SPECIFICATION TABLES (Continued)
Functional operating range: Vcc = 5V ±5%; Tcase = 0°C to +85°C


Table 4.1. A.C. Specifications at 16 MHz (Continued)


2 


NO 


| w | oo 


ié) 


2 
[we Hub Vaid ay 


33 


| t43b3 ~ DOE # Rising Delay @ Tcase = Tmax 3 


a | oo 
a}; o 


| 65a | _ BLOCK # Valid Delay 3 
| t55b1 -BBxE # Valid Delay 
t65b2 BBxE # Valid Delay 


—_— 
O1 
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ié%) 


— —_ — | ow 
NTN 


— t65b3° | #£BBxE# Valid Delay | 
| t65¢ | LOCK # Falling to BLOCK# Falling 


G 
o>) 


1,5 
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ioe) 


[se | wasvaidoeey 
Teo | FLUSH Hold Time 
Pee 


A.C. SPECIFICATION TABLES
Functional operating range: Vcc = 5V ±5%; Tcase = 0°C to +85°C
A.C. Specifications at 20 MHz

Symbol    Parameter                                  Min     Max     Units

          Operating Frequency                        15.4    20      MHz
          CLK2, BCLK2 Period
t3a       CLK2, BCLK2 High Time @ 2V
t3b       CLK2, BCLK2 High Time @ 3.7V
t4a       CLK2, BCLK2 Low Time @ 2V
t4b       CLK2, BCLK2 Low Time @ 0.8V
          CLK2, BCLK2 Fall Time
          CLK2, BCLK2 Rise Time
t7a1      A4-A12 Setup Time
t7a2      A1-A3, A13-A19, A21-A23 Setup Time         18              ns
t7a3      A20 Setup Time
t7b       LOCK# Setup Time
t7c       BLE#, BHE# Setup Time
t8        A1-A23, BLE#, BHE#, LOCK# Hold
          M/IO#, D/C# Setup Time
          W/R# Setup Time
          ADS# Setup Time
t10       M/IO#, D/C#, W/R#, ADS# Hold Time
t11       READYI# Setup Time
t12       READYI# Hold Time
t13a1     NCA# Setup Time (See t55b2)
t13a2     NCA# Setup Time (See t55b3)
t13b      LBA# Setup Time
t14a      NCA# Hold Time
t14b      LBA# Hold Time
t15       RESET, BRESET Setup Time
t16       RESET, BRESET Hold Time
t17       NA# Valid Delay
t18       READYO# Valid Delay
t19       BRDYEN# Valid Delay
t21a1     CALEN Rising, PHI1
t21a2     CALEN Falling, PHI1
t21a3     CALEN Falling in T1P, PHI2
t21b      CALEN Rising Following CWTH
          CALEN Pulse Width


A.C. SPECIFICATION TABLES (Continued)
Functional operating range: Vcc = 5V ±5%; Tcase = 0°C to +85°C
A.C. Specifications at 20 MHz (Continued)


Parameter Notes © 


t21d CALEN Rising to CS # Falling 
—t22a1. =| ~s CWEx¢# Falling, PHI1 (CWTH) 
t22a2. «| -s CWEx¢ Falling, PHI2 (CRDM) 


t22b CWEx# Pulse Width 
t22c1 CWEx# Rising, PHI1 (CWTH) 


a | 
” 


t22c2 CWEx# Rising, PHI2 (CRDM) 
t23a1 | CS1#,CS2# Rising, PHI1 (CRDM) 


” 


t23a2 _ CS1#, CS2# Rising, PHI2 (CWTH) 
t23a3 _CS1#, CS2# Falling, PHI1 (CWTH) 
t23a4 CS1#, CS2# Falling, PHI2 (CRDM) 


t24a1 CT/R# Rising, PHI2 (CRDH) _ 


t24a2 _CT/R# Falling, PHI1 (CRDH) 


13 
4 — 27 
4. |, 27 
30 
4. 27 
4 = OT 
37 
| 37 
6 37 
6 | 3 


= | | 


38 
38 


t24a3 CT/R# Falling, PHI2 (CRD) 
t25a COEA#, COEB # Falling (Direct) | 


~ (25 pF Load) 
1 (25. pF Load) 
(25 pF Load) 


24.5 - 
17 


. : 


_COEA#, COEB # Falling (2-Way) 
t25c COEx# Rising Delay 


CACHE SRAM WRITE CYCLES _ a 


t23b COEx# Falling to CSx# Rising _ 


t25d CWEx# Falling to COEx# Falling or 
CWEx# Rising to COEx# Rising | 


_CS0#, CS1 # Falling to CWEx# Rising | 30 — 
CWEx# Falling to CSO#, CS1# Falling | O. | 
CWEx# Rising to CALEN Rising _ 

— ) 3 ; 


4 
S 


— 8 (25 pF Load) 


oh, 


. 
co) 


26 
27. 
31 


t28b     CWEx# Rising to CS0#, CS1# Falling
         SA(1-23) Setup Time                         1
         SA(1-23) Hold Time


t 


         BADS# Valid Delay
         BADS# Float Delay


a” 


30 


         BREADY# Setup Time
all en 0. |. 22 


3 
4 
4 
15 


132 

133 
840 

135 

136 

137 


=] 
4) 


BACP Falling Delay 


Tr | BAER VelDomy 
         SSTB# Setup Time
         SSTB# Hold Time
         BHOLD Setup Time
         BHOLD Hold Time
         BHLDA Valid Delay
t57      MISS#, BBxE#, BLOCK# Float Delay
         FLUSH Setup Time
         FLUSH Hold Time
         FLUSH Setup to RESET Low
         FLUSH Hold from RESET Low


82385SX A.C. Specification Notes:
1. Frequency dependent specifications.
2. Used for cache data memory (SRAM) specifications.
3. This parameter is sampled, not 100% tested. Guaranteed by design.
5. BLOCK# delay is either from BPHI1 or from 386 LOCK#. Refer to Figures 5-3K and 5-3L in the 82385SX data sheet.
6. NCA# setup time is now specified to the rising edge of BPHI2 in the state after 386 SX addresses become valid (either
the state after the first T2 or after the first T2P).
7. BBxE# Valid Delay is a function of NCA# setup.
BBxE# valid delay:
t55b1 For cacheable system bus accesses
t55b2 For NCA# setup < t13a1
t55b3 For t13a2 < NCA# setup < t13a1
8. t23b and t25d are only valid specifications when DEFOE# = Vcc. Otherwise, if DEFOE# = Vss, COEx# is never
asserted during cache SRAM write cycles. If DEFOE# = Vss, t23b and t25d are Not Applicable.
9. t5 is measured from 0.8V to 3.7V. t6 is measured from 3.7V to 0.8V.
Figure 9-3. A.C. Test Load


Figure 9-2. CLK2, BCLK2 Timing


[Timing diagram: 386 SX Interface Parameters (input setup/hold and output delays relative to PHI1 and PHI2)]
[Timing diagram: Cache Write Hit Cycle (CALEN, CS#, CWE#, CT/R#)]


[Timing diagram: Cache Read Miss (Cache Update Cycle) (CALEN, CS#, CWE#, CT/R#)]

*This would be 21B if previous bus cycle was Cache Write Hit cycle.
[Timing diagram: Cache Read Cycle (CLK2, CALEN, CS#, CT/R#, COE# direct mapped and 2-way)]

*This would be 21B if previous bus cycle was Cache Write Hit cycle.


[Timing diagram: System Bus Interface Parameters (snoop address, BNA#, BREADY#, SSTB#; master and slave configurations)]
[Timing diagram: System Bus Interface Parameters (Continued), output delays: BADS#, BBE#, BLOCK#, MISS# valid and float delays; BHOLD (slave config.); BHLDA, WBS (master config.); BACP, BAOE#; BT/R#, DOE#; LDSTB]
82385SX Signal Summary

Signal                                               Active   Input/   Tri-State
Group/Name      Signal Function                      State    Output   Output?

386 SX INTERFACE
CLK2            386 SX Clock
READYO#         Ready Output
BRDYEN#         Bus Ready Enable
READYI#         386 SX Ready Input
ADS#            386 SX Address Status
M/IO#           386 SX Memory / I/O Indication

CACHE CONTROL
CALEN           Cache Address Latch Enable           High     O        No
CT/R#           Cache Transmit/Receive                        O        No
CS0#, CS1#      Cache Chip Selects                   Low      O        No
COEA#, COEB#    Cache Output Enables                 Low      O        No
LBA#            386 SX Local Bus Access              Low
NCA#            Non-Cacheable Access                 Low      I
MISS#           Cache Miss Indication                Low
82385SX Signal Summary (Continued)

Signal                                               Active   Input/   Tri-State
Group/Name      Signal Function                      State    Output   Output?

82385SX INTERFACE
BREADY#         82385SX Ready Input                  Low
BNA#            82385SX Next Address Request         Low
BLOCK#          82385SX Lock Indication
BADS#           82385SX Address Status               Low
BBHE#, BBLE#    82385SX Byte Enables

DATA/ADDR CONTROL
LDSTB           Local Data Strobe
DOE#            Data Output Enable                   Low
BT/R#           Bus Transmit/Receive
BACP            Bus Address Clock Pulse
BAOE#           Bus Address Output Enable

CONFIGURATION
2W/D#           2-Way/Direct Map Select
M/S#            Master/Slave Select
DEFOE#          Define Cache Output Enable

COHERENCY
SA1-SA23        Snoop Address Bus
SSTB#           Snoop Strobe                         Low

ARBITRATION
BHOLD           Hold
BHLDA           Hold Acknowledge
HIGH PERFORMANCE 32-BIT DMA CONTROLLER WITH
INTEGRATED SYSTEM SUPPORT PERIPHERALS


■ High Performance 32-Bit DMA Controller
  - 50 MBytes/sec Maximum Data Transfer Rate at 25 MHz
  - 8 Independently Programmable Channels
■ 20-Source Interrupt Controller
  - Individually Programmable Interrupt Vectors
  - 15 External, 5 Internal Interrupts
  - 82C59A Superset
■ Four 16-Bit Programmable Interval Timers
  - 82C54 Compatible
■ Programmable Wait State Generator
  - 0 to 15 Wait States Pipelined
  - 1 to 16 Wait States Non-Pipelined
■ DRAM Refresh Controller
■ 80386 Shutdown Detect and Reset
  - Software/Hardware Reset
■ High Speed CHMOS III Technology
■ 132-Pin PGA Package
■ Optimized for use with the 80386 Microprocessor
  - Resides on Local Bus for Maximum Bus Bandwidth


The 82380 is a multi-function support peripheral that integrates system functions necessary in an 80386 
environment. It has eight channels of high performance 32-bit DMA with the most efficient transfer rates 
possible on the 80386 bus. System support peripherals integrated into the 82380 provide Interrupt Control,
Timers, Wait State generation, DRAM Refresh Control, and System Reset logic. 


The 82380’s DMA Controller can transfer data between devices of different data path widths using a single 
channel. Each DMA channel operates independently in any of several modes. Each channel has a temporary 
data storage register for handling non-aligned data without the need for external alignment logic. 


[Block diagram: 80386 local bus; 8-channel 32-bit DMA controller; 20-level interrupt controller; Timers 0-3; CPU reset]

82380 Internal Block Diagram
1.0 FUNCTIONAL OVERVIEW


The 82380 contains several independent functional 
modules. The following is a brief discussion of the 
components and features of the 82380. Each mod- 
ule has a corresponding detailed section later in this 
data sheet. Those sections should be referred to for 
design and programming information. 


1.1 82380 Architecture 


The 82380 is comprised of several computer system 
functions that are normally found in separate LSI 
and VLSI components. These include: a high-per- 
formance, eight-channel, 32-bit Direct Memory Ac- 
cess Controller; a 20-level Programmable Interrupt 
Controller which is a superset of the 82C59A; four 
16-bit Programmable Interval Timers which are func- 
tionally equivalent to the 82C54 timers; a DRAM Re- 
fresh Controller; a Programmable Wait State Gener- 
ator; and system reset logic. The interface to the 
82380 is optimized for high-performance operation 
with the 80386 microprocessor. 


The 82380 operates directly on the 80386 bus. In
the Slave mode, it monitors the state of the processor
at all times and acts or idles according to the
commands of the host. It monitors the address pipeline
status and generates the programmed number
of wait states for the device being accessed. The
82380 also has logic to reset the 80386 via hardware
or software reset requests and processor shutdown
status.


After a system reset, the 82380 is in the Slave 
mode. It appears to the system as an I/O device. It 
becomes a bus master when it is performing DMA 
transfers. 


To maintain compatibility with existing software, the 
registers within the 82380 are accessed as bytes. If 
the internal logic of the 82380 requires a delay be- 
fore another access by the processor, wait states 
are automatically inserted into the access cycle. 
This allows the programmer to write initialization rou- 
tines, etc. without regard to hardware recovery 
times. 


Figure 1-1 shows the basic architectural compo- 
nents of the 82380. The following sections briefly 
discuss the architecture and function of each of the 
distinct sections of the 82380. 
Figure 1-1. Architecture of the 82380


1.1.1 DMA CONTROLLER


The 82380 contains a high-performance, 8-channel,
32-bit DMA controller. It is capable of transferring
any combination of bytes, words, and double words.
The addresses of both source and destination can be
independently incremented, decremented or held
constant, and cover the entire 32-bit physical address
space of the 80386. It can disassemble and
assemble misaligned data via a 32-bit internal temporary
data storage register. Data transferred between
devices of different data path widths can also
be assembled and disassembled using the internal
temporary data storage register. The DMA Controller
can also transfer aligned data between I/O and
memory on the fly, allowing data transfer rates up to
32 megabytes per second for an 82380 operating at
16 MHz. Figure 1-2 illustrates the functional components
of the DMA Controller.
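
The peak rate quoted above follows from simple arithmetic. The sketch below assumes, as an inference from the quoted figures rather than a statement in this data sheet, that an on-the-fly transfer moves one 32-bit double word every two clock cycles. The function name and its parameters are illustrative only.

```python
def dma_peak_rate_mbytes_per_sec(clock_mhz, bus_bytes=4, clocks_per_transfer=2):
    """Estimate the peak DMA rate in megabytes per second.

    clocks_per_transfer=2 is an assumption inferred from the quoted
    32 Mbytes/sec at 16 MHz; it is not stated in the text.
    """
    return clock_mhz * bus_bytes / clocks_per_transfer


rate_at_16_mhz = dma_peak_rate_mbytes_per_sec(16)
rate_at_25_mhz = dma_peak_rate_mbytes_per_sec(25)
```

Under the same assumption, a 25 MHz part gives the 50 MBytes/sec figure listed among the features.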


There are twenty-four general status and command 
registers in the 82380 DMA Controller. Through 
these registers any of the channels may be pro- 
grammed into any of the possible modes. The oper- 
ating modes of any one channel are independent of 
the operation of the other channels.


Each channel has three programmable registers 
which determine the location and amount of data to 
be transferred: 


Byte Count Register: number of bytes to transfer
(24 bits).


Requester Register: address of the memory or
peripheral which is requesting DMA service (32 bits).


Target Register: address of the peripheral or
memory which will be accessed (32 bits).
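
As an illustration of the three per-channel registers and their widths, here is a minimal software model. The class and field names are hypothetical, and the model only range-checks values; it does not reflect the actual port layout or the programming sequence, which are described in the detailed DMA section.

```python
from dataclasses import dataclass

BYTE_COUNT_MAX = (1 << 24) - 1   # 24-bit Byte Count Register
ADDRESS_MAX = (1 << 32) - 1      # 32-bit Requester and Target Registers


@dataclass
class DmaChannelRegisters:
    """Hypothetical model of one channel's three programmable registers."""
    byte_count: int
    requester: int
    target: int

    def __post_init__(self):
        # Reject values that would not fit in the register widths given above.
        if not 0 <= self.byte_count <= BYTE_COUNT_MAX:
            raise ValueError("byte count does not fit in 24 bits")
        for name in ("requester", "target"):
            if not 0 <= getattr(self, name) <= ADDRESS_MAX:
                raise ValueError(name + " address does not fit in 32 bits")


channel = DmaChannelRegisters(byte_count=0x1000, requester=0x3F8, target=0x00100000)
```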


There are also port addresses which, when accessed,
cause the 82380 to perform specific functions.
The actual data written does not matter; the
act of writing to the specific address causes the
command to be executed. The commands which operate
in this mode are: Master Clear, Clear Terminal
Count Interrupt Request, Clear Mask Register, and
Clear Byte Pointer Flip-Flop.


DMA transfers can be done between all combinations
of memory and I/O: memory-to-memory, memory-to-I/O,
I/O-to-memory, and I/O-to-I/O. DMA
service can be requested through software and/or
hardware. Hardware DMA acknowledge signals are
available for all channels (except channel 4) through
an encoded 3-bit DMA acknowledge bus
(EDACK0-2).
Figure 1-2. 82380 DMA Controller


The 82380 DMA controller transfers blocks of data
(buffers) in three modes: Single Buffer, Buffer Auto-
Initialize, and Buffer Chaining. In the Single Buffer
Process, the 82380 DMA Controller is programmed
to transfer one particular block of data. Successive
transfers then require reprogramming of the DMA
channel. Single Buffer transfers are useful in systems
where it is known at the time the transfer begins
what quantity of data is to be transferred, and
there is a contiguous block of data area available.


The Buffer Auto-Initialize Process allows the same 
data area to be used for successive DMA transfers 
without having to reprogram the channel. 


The Buffer Chaining Process allows a program to 
specify a list of buffer transfers to be executed. The 
82380 DMA Controller, through interrupt routines, is
reprogrammed from the list. The channel is reprogrammed
for a new buffer before the current buffer
transfer is complete. This pipelining of the channel 
programming process allows the system to allocate 
non-contiguous blocks of data storage space, and 
transfer all of the data with one DMA process. The 
buffers that make up the chain do not have to be in 
contiguous locations. 


Channel priority can be fixed or rotating. Fixed priori- 
ty allows the programmer to define the priority of 
DMA channels based on hardware or other fixed pa- 
rameters. Rotating priority is used to provide periph- 
erals access to the bus on a shared basis. 


With fixed priority, the programmer can set any
channel to have the current lowest priority. This allows
the user to reset or manually rotate the priority
schedule without reprogramming the command registers.


1.1.2 PROGRAMMABLE INTERVAL TIMERS 


Four 16-bit programmable interval timers reside 
within the 82380. These timers are identical in func- 
tion to the timers in the 82C54 Programmable Inter- 
val Timer. All four of the timers share a common 
clock input which can be independent of the system 
clock. The timers are capable of operating in six dif- 
ferent modes. In all of the modes, the current count 
can be latched and read by the 80386 at any time, 
making these very versatile event timers. Figure 1-3 
shows the functional components of the Program- 
mable Interval Timers. 


The outputs of the timers are directed to key system
functions, making system design simpler. Timer 0 is
routed directly to an interrupt input and is not available
externally. This timer would typically be used to
generate time-keeping interrupts.


Timers 1 and 2 have outputs which are available for 
general timer/counter purposes as well as special 
functions. Timer 1 is routed to the refresh control 
logic to provide refresh timing. Timer 2 is connected 
to an interrupt request input to provide other timer 
functions. Timer 3 is a general purpose timer/coun- 
ter whose output is available to external hardware. It 
is also connected internally to the interrupt request 
which defaults to the highest priority (IRQ0).
Figure 1-3. Programmable Interval Timers - Block Diagram


1.1.3 INTERRUPT CONTROLLER


The 82380 has the equivalent of three enhanced
82C59A Programmable Interrupt Controllers. These
controllers can all be operated in the Master mode,
but the priority is always as if they were cascaded.


There are 15 interrupt request inputs provided for
the user, all of which can be inputs from external
slave interrupt controllers. Cascading 82C59As to
these request inputs allows a possible total of 120
external interrupt requests. Figure 1-4 is a block diagram
of the 82380 Interrupt Controller.
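
The total of 120 comes from fully cascading the 15 external request inputs with 82C59A slaves, each of which has eight request inputs. A short worked calculation is sketched below; the function name is illustrative.

```python
EXTERNAL_REQUEST_INPUTS = 15   # request inputs provided for the user
INPUTS_PER_82C59A = 8          # each cascaded 82C59A contributes eight inputs


def max_external_requests(cascaded_inputs):
    """Count external interrupt requests when some inputs drive 82C59A slaves."""
    if not 0 <= cascaded_inputs <= EXTERNAL_REQUEST_INPUTS:
        raise ValueError("cascaded_inputs must be between 0 and 15")
    direct_inputs = EXTERNAL_REQUEST_INPUTS - cascaded_inputs
    return cascaded_inputs * INPUTS_PER_82C59A + direct_inputs


fully_cascaded = max_external_requests(15)
```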


Each of the interrupt request inputs can be individually
programmed with its own interrupt vector, allowing
more flexibility in interrupt vector mapping than
was available with the 82C59A. An interrupt is provided
to alert the system that an attempt is being
made to program the vectors in the method of the
82C59A. This provides compatibility for existing software
that used the 82C59A or 8259A with new designs
using the 82380.


In the event of an unrequested or otherwise errone- 
ous interrupt acknowledge cycle, the 82380 Interrupt 
Controller issues a default vector. This vector, pro- 
grammed by the system software, will alert the sys- 
tem of unsolicited interrupts of the 80386. 


The functions of the 82380 Interrupt Controller are
identical to the 82C59A, except with regard to programming
the interrupt vectors as mentioned above.
Interrupt request inputs are programmable as either 
edge or level triggered and are software maskable. 
Priority can be either fixed or rotating and interrupt 
requests can be nested. 
Figure 1-4. 82380 Interrupt Controller - Block Diagram


Enhancements are added to the 82380 for cascading
external interrupt controllers. Master to Slave
handshaking takes place on the data bus, instead of
dedicated cascade lines.


1.1.4 WAIT STATE GENERATOR 


The Wait State Generator is a programmable
READY generation circuit for the 80386 bus. A peripheral
requiring wait states can request the Wait
State Generator to hold the processor’s READY input
inactive for a predetermined number of bus
states. Six different wait state counts can be programmed
into the Wait State Generator by software:
three for memory accesses and three for I/O accesses.
A block diagram of the 82380 Wait State
Generator is shown in Figure 1-5.


The peripheral being accessed selects the required 
wait state count by placing a code on a 2-bit wait 
state select bus. This code along with the M/IO#
signal from the bus master is used to select one of 
six internal 4-bit wait state registers which has been 
programmed with the desired number of wait states. 
From zero to fifteen wait states can be programmed 
into the wait state registers. The Wait State Genera- 
tor tracks the state of the processor or current bus 
master at all times, regardless of which device is the 
current bus master and regardless of whether or not 
the Wait State Generator is currently active. 


The 82380 Wait State Generator is disabled by mak- 
ing the select inputs both high. This allows hardware 
which is intelligent enough to generate its own ready 
signal to be accessed without penalty. As previously
mentioned, deselecting the Wait State Generator
does not disable its ability to determine the proper
number of wait states due to pipeline status in subsequent
bus cycles.


The number of wait states inserted into a pipelined 
bus cycle is the value in the selected wait state reg- 
ister. If the bus master is operating in the non-pipe- 
lined mode, the Wait State Generator will increase 
the number of wait states inserted into the bus cycle 
by one. 


On reset, the Wait State Generator’s registers are 
loaded with the value FFH, giving the maximum 
number of wait states for any access in which the 
wait state select inputs are active. 
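
The selection rules above can be summarized in a small model. This is a sketch under stated assumptions: the assignment of select codes 0 to 2 to particular registers is hypothetical, and the reset value FFH is interpreted here as every 4-bit register holding 15. Only the behavior described in the text is modeled: both select inputs high disables the generator, and a non-pipelined cycle gets one extra wait state.

```python
DISABLED_SELECT = 0b11  # both wait state select inputs high


def reset_registers():
    """Return the six 4-bit wait state registers as assumed after reset."""
    # Assumption: loading FFH leaves each 4-bit register at its maximum, 15.
    return {(space, select): 0xFF & 0x0F
            for space in ("memory", "io")
            for select in range(3)}


def wait_states(registers, select, memory_cycle, pipelined):
    """Return the number of wait states inserted, or None if deselected."""
    if select == DISABLED_SELECT:
        return None  # the accessed device generates its own ready signal
    space = "memory" if memory_cycle else "io"   # chosen by M/IO#
    count = registers[(space, select)]
    return count if pipelined else count + 1


registers = reset_registers()
```

With the reset values, a pipelined access sees 15 wait states and a non-pipelined access sees 16, matching the ranges given in the feature list.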


1.1.5 DRAM REFRESH CONTROLLER 


The 82380 DRAM Refresh Controller consists of a
24-bit refresh address counter and bus arbitration
logic. The output of Timer 1 is used to periodically
request a refresh cycle. When the controller receives
the request, it requests access to the system
bus through the HOLD signal. When bus control is
acknowledged by the processor or current bus master,
the refresh controller executes a memory read
operation at the address currently in the Refresh Address
Register. At the same time, it activates a refresh
signal (REF#) that the memory uses to force a
refresh instead of a normal read. Control of the bus
is transferred to the processor at the completion of
this cycle. Typically a refresh cycle will take six clock
cycles to execute on an 80386 bus.
Figure 1-5. 82380 Wait State Generator - Block Diagram


The 82380 DRAM Refresh Controller has the highest
priority when requesting bus access and will interrupt
any active DMA process. This allows large
blocks of data to be moved by the DMA controller
without affecting the refresh function. Also, the DMA
controller is not required to completely relinquish the
bus; the refresh controller simply steals a bus cycle
between DMA accesses.


The amount by which the refresh address is incre- 
mented is programmable to allow for different bus 
widths and memory bank arrangements. 
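
A minimal model of the refresh address counter is sketched below. It assumes the 24-bit counter wraps around at its maximum value, which the text does not state, and the increment values used are examples rather than the programmable options of the device.

```python
REFRESH_ADDRESS_MASK = (1 << 24) - 1   # 24-bit refresh address counter


class RefreshAddressCounter:
    """Hypothetical model: emits one refresh address per refresh request."""

    def __init__(self, start=0, increment=4):
        self.address = start & REFRESH_ADDRESS_MASK
        self.increment = increment      # programmable step (example value)

    def next_refresh_address(self):
        current = self.address
        # Assumption: the counter wraps modulo 2**24.
        self.address = (self.address + self.increment) & REFRESH_ADDRESS_MASK
        return current


counter = RefreshAddressCounter(start=0xFFFFFC, increment=4)
first = counter.next_refresh_address()
second = counter.next_refresh_address()
```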


1.1.6 CPU RESET FUNCTION 


The 82380 contains a special reset function which 
can respond to hardware reset signals from the 
82384, as well as a software reset command. The 
circuit will hold the 80386’s RESET line active while 
an external hardware reset signal is present at its 
RESET input. It can also reset the 80386 processor 
as the result of a software command. The software
reset command causes the 82380 to hold the processor’s
RESET line active for a minimum of 62 CLK2
cycles, enough time to allow an 80386 to re-initialize.
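
To put the 62 CLK2 cycles in time units: CLK2 runs at twice the processor frequency (see the CLK2 description in Section 2.2.1), so the minimum software reset pulse can be estimated as below. The function name is illustrative, and the result is an arithmetic estimate, not a specified timing parameter.

```python
def software_reset_min_ns(cpu_mhz, clk2_cycles=62):
    """Minimum software reset duration in nanoseconds."""
    clk2_mhz = 2 * cpu_mhz                 # CLK2 is twice the processor clock
    return clk2_cycles * 1000.0 / clk2_mhz


reset_ns_at_16_mhz = software_reset_min_ns(16)
reset_ns_at_25_mhz = software_reset_min_ns(25)
```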


The 82380 can be programmed to sense the shutdown
detect code on the status lines from the
80386. If the Shutdown Detect function is enabled,
the 82380 will automatically reset the processor. A
diagnostic register is available which can be used to
determine the cause of reset.


1.1.7 REGISTER MAP RELOCATION 


After a hardware reset, the internal registers of the
82380 are located in I/O space beginning at port
address 0000H. The map of the 82380’s registers is
relocatable via a software command. The default
mapping places the 82380 between I/O addresses
0000H and 00DBH. The relocation register allows
this map to be moved to any even 256-byte boundary
in the processor’s 16-bit I/O address space or any
even 16-Mbyte boundary in the 32-bit memory address
space.


1.2 Host Interface 


The 82380 is designed to operate efficiently on the
local bus of an 80386 microprocessor. The control
signals of the 82380 are identical in function to
those of the 80386. As a slave, the 82380 operates
with all of the features available on the 80386 bus.
When the 82380 is in the Master mode, it looks identical
to the 80386 to the connected devices.


The 82380 monitors the bus at all times, and deter- 
mines whether the current bus cycle is a pipelined or 
non-pipelined access. All of the status signals of the 
processor are monitored.


The control, status, and data registers within the 
82380 are located at fixed addresses relative to
each other, but the group can be relocated to either 
memory or I/O space and to different locations with- 
in those spaces. 


As a Slave device, the 82380 monitors the control/ 
status lines of the CPU. The 82380 will generate all 
of the wait states it needs whenever it is accessed. 
This allows the programmer the freedom of access- 
ing 82380 registers without having to insert NOPs in 
the program to wait for slower 82380 internal registers.


The 82380 can determine if a current bus cycle is a 
pipelined or a non-pipelined cycle. It does this by 
monitoring the ADS# and READY# signals and 
thereby keeping track of the current state of the 
80386.


As a bus master, the 82380 looks like an 80386 to
the rest of the system. This gives the designer
greater flexibility in systems which include the
82380. The designer does not have to alter the inter- 
faces of any peripherals designed to operate with 
the 80386 to accommodate the 82380. The 82380 
will access any peripherals on the bus in the same 
manner as the 80386, including recognizing pipe- 
lined bus cycles. 


The 82380 is accessed as an 8-bit peripheral. This is
done to maintain compatibility with existing system
architectures and software. The 80386 places the
data of all 8-bit accesses either on D(0-7) or D(8-15).
The 82380 will only accept data on these lines
when in the Slave mode. When in the Master mode,
the 82380 is a full 32-bit machine, sending and receiving
data in the same manner as the 80386.
1.3 IBM PC* System Compatibility


The 82380 is an 80386 companion device designed
to provide an enhancement of the system functions
common to most small computer systems. It is modeled
after and is a superset of the Intel peripheral
products found in the IBM PC, PC-AT, and other
popular small computers.


2.0 80386 HOST INTERFACE 


The 82380 contains a set of interface signals to operate
efficiently with the 80386 host processor.
These signals were designed so that minimal hardware
is needed to connect the 82380 to the 80386.


Figure 2-1 depicts a typical system configuration 
with the 80386 processor. As shown in the diagram, 
the 82380 is designed to interface directly with the 
80386 bus. 


*IBM PC and IBM PC-AT are registered trademarks of Inter- 
national Business Machines Inc. 




Since the 82380 is residing on the opposite side of 
the data bus transceiver (with respect to the rest of 
the peripherals in the system), it is important to note 
that the transceiver should be controlled so that 
contention between the data bus transceiver and 
the 82380 will not occur. In order to do this, port 
address decoding logic should be included in the di- 
rection and enable control logic of the transceiver. 
When any of the 82380 internal registers is read, the 
data bus transceiver should be disabled so that only 
the 82380 will drive the local bus. 


This section describes the basic bus functions of the 
82380 to show how this device interacts with the 
80386 processor. Other signals which are not direct- 
ly related to the host interface will be discussed in 
their associated functional block description. 


Figure 2-1. 80386/82380 System Configuration


2.1 Master and Slave Modes


At any time, the 82380 acts as either a Slave device 
or a Master device in the system. Upon reset, the 
82380 will be in the Slave Mode. In this mode, the 
80386 processor can read/write into the 82380 internal
registers. Initialization information may be programmed
into the 82380 during Slave Mode.


When DMA service (including DRAM Refresh cycles
generated by the 82380) is requested, the 82380 will 
request and subsequently get control of the 80386 
local bus. This is done through the HOLD and HLDA 
(Hold Acknowledge) signals. When the 80386 proc- 
essor responds by asserting the HLDA signal, the 
82380 will switch into Master Mode and perform 
DMA transfers. In this mode, the 82380 is the bus 
master of the system. It can read/write data from/to 
memory and peripheral devices. The 82380 will re- 
turn to the Slave Mode upon completion of DMA 
transfers, or when HLDA is negated. 
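
The mode changes described above can be pictured as a two-state machine. The sketch below is a simplified illustration with hypothetical class and method names; it models only the transitions named in the text (Slave after reset, Master once HLDA answers a pending HOLD, back to Slave on completion or when HLDA is negated) and ignores all bus timing.

```python
from enum import Enum


class Mode(Enum):
    SLAVE = "slave"
    MASTER = "master"


class BusModeModel:
    """Simplified model of the 82380 Slave/Master mode transitions."""

    def __init__(self):
        self.mode = Mode.SLAVE   # the device is in Slave Mode after reset
        self.hold = False        # state of the HOLD request to the 80386

    def request_service(self):
        # A DMA or refresh request causes the device to assert HOLD.
        self.hold = True

    def set_hlda(self, asserted):
        if asserted and self.hold:
            self.mode = Mode.MASTER      # bus granted: perform transfers
        elif not asserted:
            self.mode = Mode.SLAVE       # HLDA negated: give up the bus

    def transfers_complete(self):
        self.hold = False
        self.mode = Mode.SLAVE


model = BusModeModel()
model.request_service()
model.set_hlda(True)
mode_during_transfer = model.mode
model.transfers_complete()
mode_after_transfer = model.mode
```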


2.2 80386 Interface Signals 


As mentioned in the Architecture section, the Bus
Interface module of the 82380 (see Figure 1-1) contains
signals that are directly connected to the
80386 host processor. This module has separate
32-bit Data and Address busses. Also, it has additional
control signals to support different bus operations
on the system. By residing on the 80386 local
bus, the 82380 shares the same address, data and
control lines with the processor. The following subsections
discuss the signals which interface to the
80386 host processor.


2.2.1 CLOCK (CLK2)


The CLK2 input provides fundamental timing for the
82380. It is divided by two internally to generate the
82380 internal clock. Therefore, CLK2 should be
driven with twice the 80386’s frequency. In order to
maintain synchronization with the 80386 host processor,
the 82380 and the 80386 should share a
common clock source.


The internal clock consists of two phases: PHI1 and 
PHI2. Each CLK2 period is a phase of the internal 
clock. PHI2 is usually used to sample input and set 
up internal signals and PHI1 is for latching internal 
data. Figure 2-2 illustrates the relationship of CLK2 
and the 82380 internal clock signals. The CPURST 
signal generated by the 82380 guarantees that the 
80386 will wake up in phase with PHI1. 


2.2.2 DATA BUS (D0-D31) 


This 32-bit three-state bidirectional bus provides a
general purpose data path between the 82380 and
the system. These pins are tied directly to the corresponding
Data Bus pins of the 80386 local bus. The
Data Bus is also used for interrupt vectors generated
by the 82380 in the Interrupt Acknowledge cycle.


Figure 2-2. CLK2 and 82380 Internal Clock


During Slave I/O operations, the 82380 expects a
single byte to be written or read. When the 80386
host processor writes into the 82380, either D0-D7
or D8-D15 will be latched into the 82380, depending
upon how the Byte Enable (BE0#-BE3#) signals
are driven. The 82380 does not need to look at
D16-D31 since the 80386 duplicates the single byte
data on both halves of the bus. When the 80386
host processor reads from the 82380, the single
byte data will be duplicated four times on the Data
Bus; i.e., on D0-D7, D8-D15, D16-D23 and D24-D31.
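
The four-way duplication on a Slave Mode read can be expressed as a single multiplication. The helper below is illustrative only: it shows the 32-bit pattern that appears on D0-D31 for a given register byte.

```python
def slave_read_bus_value(register_byte):
    """Return the 32-bit Data Bus value with the byte copied to all four lanes."""
    # Multiplying by 01010101H places one copy of the byte in each byte lane.
    return (register_byte & 0xFF) * 0x01010101


bus_value = slave_read_bus_value(0xA5)
```

Because every lane carries the same byte, the processor receives the register contents regardless of which byte lane the access uses.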


During Master Mode, the 82380 can transfer 32-, 16-,
and 8-bit data between memory (or I/O devices) and
I/O devices (or memory) via the Data Bus.


2.2.3 ADDRESS BUS (A31-A2) 


These three-state bidirectional signals are connected
directly to the 80386 Address Bus. In the Slave
Mode, they are used as input signals so that the
processor can address the 82380 internal ports/registers.
In the Master Mode, they are used as output
signals by the 82380 to address memory and peripheral
devices. The Address Bus is capable of addressing
4 G-bytes of physical memory space
(00000000H to FFFFFFFFH), and 64 K-bytes of I/O
addresses (00000000H to 0000FFFFH).


2.2.4 BYTE ENABLE (BE3#-BE0#)


These bidirectional pins select specific byte(s) in the 
double word addressed by A31-A2. Similar to the 
Address Bus function, these signals are used as in- 
puts to address internal 82380 registers during 
Slave Mode operation. During Master Mode opera- 
tion, they are used as outputs by the 82380 to ad- 
dress memory and I/O locations. 


NOTE: 


In addition to the above function, BE3# is used 
to enable a production test mode and must be 
LOW during reset. The 80386 processor will automatically
hold BE3# LOW during RESET.


The definitions of the Byte Enable signals depend 
upon whether the 82380 is in the Master or Slave 
Mode. These definitions are depicted in Table 2-1. 


Table 2-1. Byte Enable Signals


As INPUTS (Slave Mode):

                                   Data Bits Written
BE3#-BE0#     Implied A1, A0       to 82380*
XXX0          00                   D0-D7
XX01          01                   D8-D15
X011          10                   D0-D7
X111          11                   D8-D15

X = DON'T CARE


*During READ, data will be duplicated on D0-D7, D8-D15, D16-D23, and D24-D31.
During WRITE, the 80386 host processor duplicates data on D0-D15, and D16-D31, so that the 82380
is concerned only with the lower half of the Data Bus.


As OUTPUTS (Master Mode): 


Byte to be Accessed 


BES #-BE0# | pelative to A31-A2 
0 
1 
2 
3 
1 
0 
2 
0 
1 
0 

U = Undefined 

A = Logical DO-D7 

B = Logical D8-D15 


C = Logical D16-D23 
D = Logical D24-D31 


Logical Byte Presented On 
Data Bus During WRITE Only* 
D24-31 D16-23 D8-15 D0-7. 


OWONrcuocrccec 
Orwnwwnoworrcrc 
PrprrrrrrrYrp,p 


U 
U 
U 
A 
U 
U 
B 
U 
C 
D 


*Actual number of bytes accessed depends upon the programmed data path width. 
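
The Slave Mode half of Table 2-1 can be sketched in a few lines of Python. This is an illustrative model only: the function names are hypothetical, and the rule that the lowest asserted (LOW) byte enable gives the implied A1, A0 is an assumption based on the usual 80386 byte-enable convention.

```python
def implied_a1_a0(be_n: int) -> int:
    """Implied A1,A0 for a 4-bit BE3#..BE0# value (bit 0 = BE0#, active LOW)."""
    for lane in range(4):
        if (be_n >> lane) & 1 == 0:      # BE<lane># is LOW, i.e. asserted
            return lane
    raise ValueError("no byte enable asserted")


def slave_write_lane(be_n: int) -> str:
    """Data bits the 82380 takes write data from in Slave Mode."""
    # The host duplicates D0-D15 on D16-D31, so only the lower half matters.
    return "D0-D7" if implied_a1_a0(be_n) % 2 == 0 else "D8-D15"


def slave_read_bus(byte: int) -> int:
    """32-bit Data Bus value on a slave read: the byte repeated on all lanes."""
    return (byte & 0xFF) * 0x01010101


print(implied_a1_a0(0b1011), slave_write_lane(0b1011))
print(hex(slave_read_bus(0xA5)))
```

The read helper mirrors the footnote to the table: a single register byte appears on all four byte lanes, so the host can read it regardless of which lane it samples.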
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2.2.5 BUS CYCLE DEFINITION SIGNALS (D/C#, 
W/R#, M/IO#)


These three-state bidirectional signals define the
type of bus cycle being performed. W/R# distinguishes
between write and read cycles. D/C# distinguishes
between processor data and control cycles.
M/IO# distinguishes between memory and I/O
cycles.


During Slave Mode, these signals are driven by the
80386 host processor; during Master Mode, they are
driven by the 82380. In either mode, these signals
will be valid when the Address Status (ADS#) is
driven LOW. Exact bus cycle definitions are given in
Table 2-2. Note that some combinations are recognized
as inputs, but not generated as outputs. In the
Master Mode, D/C# is always HIGH.
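
As a rough illustration, the cycle types listed in Table 2-2 can be modelled as a lookup keyed on the three pin levels. The sketch assumes the standard 80386 encoding (M/IO# HIGH = memory, D/C# HIGH = data, W/R# HIGH = write); the names `AS_INPUT`, `decode_input_cycle` and `generated_as_output` are hypothetical.

```python
# Keys are (M/IO#, D/C#, W/R#) pin levels; values are the cycle seen as an input.
AS_INPUT = {
    (0, 0, 0): "Interrupt Acknowledge",
    (0, 0, 1): "UNDEFINED",
    (0, 1, 0): "I/O Read",
    (0, 1, 1): "I/O Write",
    (1, 0, 0): "UNDEFINED",
    (1, 0, 1): "HALT/SHUTDOWN",
    (1, 1, 0): "Memory Read",
    (1, 1, 1): "Memory Write",
}


def decode_input_cycle(m_io: int, d_c: int, w_r: int, be_n: int = 0b1111) -> str:
    """Classify a bus cycle seen by the 82380 in Slave Mode."""
    kind = AS_INPUT[(m_io, d_c, w_r)]
    if kind != "HALT/SHUTDOWN":
        return kind
    if (be_n & 0b0001) == 0:             # BE(3-0)# = XXX0
        return "SHUTDOWN"
    if (be_n & 0b0111) == 0b0011:        # BE(3-0)# = X011
        return "HALT"
    return "UNDEFINED"


def generated_as_output(m_io: int, d_c: int, w_r: int) -> bool:
    """In Master Mode D/C# is always HIGH, so only data cycles are generated."""
    return d_c == 1


print(decode_input_cycle(1, 0, 1, be_n=0b1011))
print(generated_as_output(0, 0, 0))
```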


2.2.6 ADDRESS STATUS (ADS#) 


This bidirectional signal indicates that a valid address
(A2-A31, BE0#-BE3#) and bus cycle definition
(W/R#, D/C#, M/IO#) are being driven on the
bus. In the Master Mode, it is driven by the 82380 as
an output. In the Slave Mode, this signal is monitored
as an input by the 82380. From the current and
past status of ADS# and the READY# input, the
82380 is able to determine, during Slave Mode, if the
next bus cycle is a pipelined address cycle. ADS# is
asserted during T1 and T2P bus states (see Bus
State Definition).


Note that during the idle states at the beginning and
the end of a DMA process, neither the 80386 nor the
82380 is driving the ADS# signal; i.e., the signal is
left floating. Therefore, it is important to use a pull-up
resistor (approximately 10 KΩ) on the ADS# signal.


2.2.7 TRANSFER ACKNOWLEDGE (READY#)


This input indicates that the current bus cycle is
complete. In the Master Mode, assertion of this signal
indicates the end of a DMA bus cycle. In the
Slave Mode, the 82380 monitors this input and
ADS# to detect pipelined address cycles. This signal
should be tied directly to the READY# input of
the 80386 host processor.


2.2.8 NEXT ADDRESS REQUEST (NA#) 


This input is used to indicate to the 82380 in the
Master Mode that the system is requesting address
pipelining. When driven LOW by either memory or
peripheral devices during Master Mode, it indicates
that the system is prepared to accept a new address
and bus cycle definition signals from the 82380 before
the end of the current bus cycle. If this input is
active when sampled by the 82380, the next address
is driven onto the bus, provided a bus request is
already pending internally.


This input pin is monitored only in the Master Mode.
In the Slave Mode, the 82380 uses the ADS# and
READY# signals to determine address pipelining
cycles, and NA# will be ignored.


2.2.9 RESET (RESET, CPURST) 
RESET 


This synchronous input suspends any operation in 
progress and places the 82380 in a known initial 
state. Upon reset, the 82380 will be in the Slave 
Mode waiting to be initialized by the 80386 host 
processor. The 82380 is reset by asserting RESET 
for 15 or more CLK2 periods. When RESET is as- 
serted, all other input pins are ignored, and all other 
bus pins are driven to an idle bus state as shown in 
Table 2-3. The 82380 will determine the phase of its 
internal clock following RESET going inactive.


Table 2-2. Bus Cycle Definition 


M/IO#  D/C#  W/R#  As INPUTS                As OUTPUTS

0      0     0     Interrupt Acknowledge    NOT GENERATED
0      0     1     UNDEFINED                NOT GENERATED
0      1     0     I/O Read                 I/O Read
0      1     1     I/O Write                I/O Write
1      0     0     UNDEFINED                NOT GENERATED
1      0     1     HALT if                  NOT GENERATED
                   BE(3-0)# = X011
                   SHUTDOWN if
                   BE(3-0)# = XXX0
1      1     0     Memory Read              Memory Read
1      1     1     Memory Write             Memory Write
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Table 2-3. Output Signals Following RESET 


Signal                                Level

A2-A31, D0-D31, BE0#-BE3#             Float
D/C#, W/R#, M/IO#, ADS#               Float
READYO#                               '1'
EOP#                                  '1' (Weak Pull-Up)
EDACK2-EDACK0                         '100'
HOLD                                  '0'
INT                                   UNDEFINED*
TOUT1/REF#, TOUT2#/IRQ3#, TOUT3#      UNDEFINED*
CPURST                                '0'


*The Interrupt Controller and Programmable Interval Timer are initialized by software commands. 


RESET is level-sensitive and must be synchronous
to the CLK2 signal. Therefore, this RESET input
should be tied to the RESET output of the Clock
Generator. The RESET setup and hold time requirements
are shown in Figure 2-3.
CPURST 


This output signal is used to reset the 80386 host 
processor. It will go active (HIGH) whenever one of 
the following events occurs: a) 82380’s RESET input 
is active; b) a software RESET command is issued 
to the 82380; or c) when the 82380 detects a proc- 
essor Shutdown cycle and when this detection fea- 
ture is enabled (see CPU Reset and Shutdown De- 
tect). When activated, CPURST will be held active 
for 62 CLK2 periods. The timing of CPURST is such 
that the 80386 processor will be in synchronization 
with the 82380. This timing is shown in Figure 2-4. 
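
To give a feel for the numbers, the two CLK2 counts quoted in this section (15 or more periods for RESET, 62 periods for CPURST) can be converted to time. The sketch below assumes a 32 MHz CLK2, i.e. a 16 MHz part; that frequency and the helper names are illustrative assumptions, not values stated in this section.

```python
RESET_MIN_CLK2_PERIODS = 15   # RESET must be asserted for 15 or more CLK2 periods
CPURST_CLK2_PERIODS = 62      # CPURST is held active for 62 CLK2 periods


def clk2_period_ns(clk2_mhz: float) -> float:
    """Length of one CLK2 period in nanoseconds."""
    return 1000.0 / clk2_mhz


def min_reset_ns(clk2_mhz: float) -> float:
    """Shortest RESET pulse that satisfies the 15-period requirement."""
    return RESET_MIN_CLK2_PERIODS * clk2_period_ns(clk2_mhz)


def cpurst_active_ns(clk2_mhz: float) -> float:
    """Time CPURST stays active once triggered."""
    return CPURST_CLK2_PERIODS * clk2_period_ns(clk2_mhz)


print(min_reset_ns(32.0), cpurst_active_ns(32.0))
```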




2.2.10 INTERRUPT OUT (INT) 


This output pin is used to signal the 80386 host 
processor that one or more interrupt requests (either 
internal or external) are pending. The processor is 
expected to respond with an Interrupt Acknowledge 
cycle. This signal should be connected directly to 
the Maskable Interrupt Request (INTR) input of the 
80386 host processor. 


2.3 82380 Bus Timing 


The 82380 internally divides the CLK2 signal by two 
to generate its internal clock. Figure 2-2 shows the 
relationship of CLK2 and the internal clock. The in- 
ternal clock consists of two phases: PHI1 and PHI2. 
Each CLK2 period is a phase of the internal clock. In 
Figure 2-2, both PHI1 and PHI2 of the 82380 internal
clock are shown.


T30-RESET Hold Time
T31-RESET Setup Time


Figure 2-3. RESET Timing


T33-CPU Reset from CLK2


Figure 2-4. CPURST Timing


Figure 2-2. CLK2 and 82380 Internal Clock


In the 82380, whether it is in the Master or Slave
Mode, the shortest time unit of bus activity is a bus
state. A bus state, which is also referred to as a
‘T-state’, is defined as one 82380 PHI2 clock period
(i.e., two CLK2 periods). Recall from Table 2-2 that there
are six different types of bus cycles in the 82380 as
defined by the M/IO#, D/C# and W/R# signals.
Each of these bus cycles is composed of two or
more bus states. The length of a bus cycle depends
on when the READY# input is asserted (i.e., driven
LOW).
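
The relationship between CLK2, bus states and bus cycle length can be worked through numerically. The following is a minimal sketch that assumes each wait state adds exactly one bus state to the two-state minimum; the 32 MHz CLK2 used in the example and the function names are assumptions for illustration.

```python
def t_state_ns(clk2_mhz: float) -> float:
    """One bus state (T-state) is two CLK2 periods."""
    return 2 * 1000.0 / clk2_mhz


def bus_cycle_ns(clk2_mhz: float, wait_states: int = 0) -> float:
    """A bus cycle is two bus states plus one extra state per wait state."""
    return (2 + wait_states) * t_state_ns(clk2_mhz)


for ws in range(3):
    print(ws, "wait state(s):", bus_cycle_ns(32.0, ws), "ns")
```

With these assumptions, the cycle length grows by one T-state for every bus state in which READY# is sampled inactive.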


2.3.1 ADDRESS PIPELINING 


The 82380 supports Address Pipelining as an option
in both the Master and Slave Mode. This feature typically
allows a memory or peripheral device to operate
with one less wait state than would otherwise be
required. This is possible because during a pipelined
cycle, the address and bus cycle definition of the
next cycle will be generated by the bus master while
waiting for the end of the current cycle to be acknowledged.
The pipelined bus is especially well
suited for interleaved memory environments. For 16
MHz interleaved memory designs with 100 ns access
time DRAMs, zero wait state memory accesses
can be achieved when pipelined addressing is selected.


In the Master Mode, the 82380 is capable of initiating,
on a cycle-by-cycle basis, either a pipelined or
non-pipelined access depending upon the state of
the NA# input. If a pipelined cycle is requested (indicated
by NA# being driven LOW), the 82380 will
drive the address and bus cycle definition of the next
cycle as soon as there is an internal bus request
pending.


In the Slave Mode, the 82380 is constantly monitoring
the ADS# and READY# signals on the processor
local bus to determine if the current bus cycle is
a pipelined cycle. If a pipelined cycle is detected, the
82380 will request one less wait state from the processor
if the Wait State Generator feature is selected.
On the other hand, during an 82380 internal register
access in a pipelined cycle, it will make use of the
advance address and bus cycle information. In all
cases, Address Pipelining will result in a savings of
one wait state.


2.3.2 MASTER MODE BUS TIMING 


When the 82380 is in the Master Mode, it will be in
one of six bus states. Figure 2-5 shows the complete
bus state diagram of the Master Mode, including
pipelined address states. As seen in the figure, the
82380 state diagram is very similar to that of the
80386. The major difference is that in the 82380,
there is no Hold state. Also, in the 82380, the conditions
for some state transitions depend upon whether
it is the end of a DMA process*.


NOTE:
*The term ‘end of a DMA process’ is loosely defined
here. It depends on the DMA modes of operation
as well as the state of the EOP# and DREQ
inputs. This is explained in detail in section 3, DMA
Controller.




The 82380 will enter the idle state, Ti, upon RESET 
and whenever the internal address is not available at 
the end of a DMA cycle or at the end of a DMA 
process. When address pipelining is not used (NA# 
is not asserted), a new bus cycle always begins with 
state T1. During T1, address and bus cycle definition 
signals will be driven on the bus. T1 is always followed
by T2.


If a bus cycle is not acknowledged (with READY#)
during T2 and NA# is negated, T2 will be repeated.
When the end of the bus cycle is acknowledged during
T2, the following state will be T1 of the next bus
cycle (if the internal address latch is loaded and if
this is not the end of the DMA process). Otherwise,
the Ti state will be entered. Therefore, if the memory
or peripheral accessed is fast enough to respond
within the first T2, the fastest non-pipelined cycle will
take one T1 and one T2 state.




Use of the address pipelining feature allows the 
82380 to enter three additional bus states: T1P, 
T2P, and T2i. T1P is the first bus state of a pipelined 
bus cycle. T2P follows T1P (or T2) if NA# is assert- 
ed when sampled. The 82380 will drive the bus with 
the address and bus cycle definition signals of the 
next cycle during T2P. From the state diagram, it can 
be seen that after an idle state Ti, the first bus cycle 
must begin with T1, and is therefore a non-pipelined 
bus cycle. The next bus cycle can be pipelined if 
NA# is asserted and the previous bus cycle ended 
in a T2P state. Once the 82380 is in a pipelined 
cycle and provided that NA# is asserted in subse- 
quent cycles, the 82380 will be switching between 
T1P and T2P states. If the end of the current bus
cycle is not acknowledged by the READY# input,
the 82380 will extend the cycle by adding T2P
states. The fastest pipelined cycle will consist of one
T1P and one T2P state.


NOTE:
ADAV: Internal Address Available
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Figure 2-5. Master Mode State Diagram 
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The 82380 will enter state T2i when NA# is assert- 
ed and when one of the following two conditions 
occurs. The first condition is when the 82380 is in 
state T2. T2i will be entered if READY# is not asserted
and there is no next address available. This
situation is similar to a wait state. The 82380 will stay
in T2i for as long as this condition exists. The second
condition which will cause the 82380 to enter T2i is
when the 82380 is in state T1P. Before going to
state T2P, the 82380 needs to wait in state T2i until
the next address is available. Also, in both cases, if
the DMA process is complete, the 82380 will enter
the T2i state in order to finish the current DMA cycle.
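
The state transitions described in this section can be collected into a small next-state function. This is only one reading of the prose, not a transcription of Figure 2-5: the priority given to READY# over NA# in state T2, the exits from T2i, and every identifier below are assumptions made for illustration. The figure remains the authority.

```python
from enum import Enum


class State(Enum):
    Ti = "Ti"
    T1 = "T1"
    T2 = "T2"
    T1P = "T1P"
    T2P = "T2P"
    T2i = "T2i"


def next_state(state, ready, na, adav, end_of_dma=False):
    """Next Master Mode bus state.

    ready, na  -- True when READY# / NA# is asserted (driven LOW)
    adav       -- True when an internal address is available
    end_of_dma -- True at the end of the DMA process
    """
    more = adav and not end_of_dma       # another bus cycle can be started
    if state is State.Ti:
        return State.T1 if adav else State.Ti
    if state is State.T1:
        return State.T2                  # T1 is always followed by T2
    if state is State.T2:
        if ready:
            return State.T1 if more else State.Ti
        if not na:
            return State.T2              # plain wait state
        return State.T2P if more else State.T2i
    if state is State.T1P:
        if not na:
            return State.T2
        return State.T2P if more else State.T2i
    if state is State.T2P:
        return State.T1P if ready else State.T2P
    # Remaining case: State.T2i
    if ready:
        return State.T1 if more else State.Ti
    return State.T2P if more else State.T2i


# Fastest non-pipelined cycle: Ti -> T1 -> T2 -> T1 ...
s = State.Ti
for ready in (False, False, True):
    s = next_state(s, ready=ready, na=False, adav=True)
    print(s.name)
```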


Figure 2-6 is a timing diagram showing non-pipelined 
bus accesses in the Master Mode. Figure 2-7 shows 
the timing of pipelined accesses in the Master Mode. 


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Figure 2-6. Non-Pipelined Bus Cycles 



Figure 2-7. Pipelined Bus Cycles 
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2.3.3 SLAVE MODE BUS TIMING 


Figure 2-8 shows the Slave Mode bus timing in both
pipelined and non-pipelined cycles when the 82380
is being accessed. Recall that during Slave Mode,
the 82380 will constantly monitor the ADS# and
READY# signals to determine if the next cycle is
pipelined. In Figure 2-8, the first cycle is non-pipelined
and the second cycle is pipelined. In the pipelined
cycle, the 82380 will start decoding the address
and bus cycle signals one bus state earlier
than in a non-pipelined cycle.


The READY# input signal is sampled by the 80386
host processor to determine the completion of a bus
cycle. This occurs at the end of every T2 and
T2P state. Normally, the output of the 82380 Wait
State Generator, READYO#, is directly connected
to the READY# input of the 80386 host processor
and the 82380. In such a case, READYO# and
READY# will be identical (see Wait State Generator).


NOTE:
NA# is shown here only for timing reference. It is not sampled by the 82380 during Slave Mode.
When the 82380 registers are accessed, it will take one or more wait states in pipelined and two or more wait states in
non-pipelined cycles to complete the internal access.


Figure 2-8. Slave Read/Write Timing
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3.0 DMA Controller 


The 82380 DMA Controller is capable of transferring
data between any combination of memory and/or
I/O, with any combination (8-, 16-, or 32-bits) of data
path widths. Bus bandwidth is optimized through the
use of an internal temporary register which can disassemble
or assemble data to or from either an
aligned or a non-aligned destination or source. Figure
3-1 is a block diagram of the 82380 DMA Controller.


The 82380 has eight channels of DMA. Each channel
operates independently of the others. Within the
operation of the individual channels, there are many
different modes of data transfer available. Many of
the operating modes can be intermixed to provide a
very versatile DMA controller.


[Block diagram. Control/Status Registers: Command Register I,
Command Register II, Mode Registers, Software Request Register,
Mask Register, Status Register, Bus Size Register, Chaining Register.
Channel Registers (Channel 0): Base and Current Byte Count, Base and
Current Requester Address, Base and Current Target Address, Temporary
Register. Channels 1-3 are the same as Channel 0 and form the "lower"
group; Channels 4-7 are the same as Channel 0 and form the "upper"
group, with control/status registers the same as the lower group.
Process control signals: EDACK0, EDACK1, EDACK2, EOP#.]


Figure 3-1. 82380 DMA Controller Block Diagram
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3.1. Functional Description 


In describing the operation of the 82380’s DMA Controller,
close attention to terminology is required.
The following terms are used throughout this section:


DMA PROCESS—A DMA process is the execution 
of a programmed DMA task from beginning to end. 
Each DMA process requires initial programming by 
the host 80386 microprocessor. 


BUFFER—A contiguous block of data. 


BUFFER TRANSFER—The action required by the 
DMA to transfer an entire buffer. 


DATA TRANSFER—The DMA action in which a 
group of bytes, words, or double words are moved 
between devices by the DMA Controller. A data 
transfer operation may involve movement of one or 
many bytes. 


BUS CYCLE—Access by the DMA to a single byte, 
word, or double word. 


Each DMA channel consists of three major compo- 
nents. These components are identified by the con- 
tents of programmable registers which define the 
memory or I/O devices being serviced by the DMA.
They are the Target, the Requester, and the Byte 
Count. They will be defined generically here and in 
greater detail in the DMA register definition section. 


The Requester is the device which requires service 
by the 82380 DMA Controller, and makes the re- 
quest for service. All of the control signals which the 
DMA monitors or generates for specific channels 
are logically related to the Requester. Only the Re- 
quester is considered capable of initiating or termi- 
nating a DMA process. 


The Target is the device with which the Requester 
wishes to communicate. As far as the DMA process 
is concerned, the Target is a slave which is incapa- 
ble of control over the process. 


The direction of data transfer can be either from Re- 
quester to Target or from Target to Requester; i.e., 
each can be either a source or a destination. 


The Requester and Target may each be either I/O 
or memory. Each has an address associated with it 
that can be incremented, decremented, or held con- 
stant. The addresses are stored in the Requester 
Address Registers and Target Address Registers, 
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respectively. These registers have two parts: one 
which contains the current address being used in the
DMA process (Current Address Register), and one
which holds the programmed base address (Base
Address Register). The contents of the Base Registers
are never changed by the 82380 DMA Controller.
The Current Registers are incremented or decremented
according to the progress of the DMA process.


The Byte Count is the component of the DMA pro- 
cess which dictates the amount of data which must 
be transferred. Current and Base Byte Count Regis- 
ters are provided. The Current Byte Count Register 
is decremented once for each byte transferred by 
the DMA process. When the register is decremented 
past zero, the Byte Count is considered ‘expired’ 
and the process is terminated or restarted, depend- 
ing on the mode of operation of the channel. The 
point at which the Byte Count expires is called ‘Ter- 
minal Count’ and several status signals are depen- 
dent on this event. 
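
Read literally, "decremented past zero" means the count expires on the transfer that takes the register below zero, so a programmed value of N would move N + 1 bytes. The short model below illustrates that reading; it is an interpretation of the sentence above, not a statement from the register definitions, and the function name is hypothetical.

```python
def bytes_until_terminal_count(programmed_count: int) -> int:
    """Count byte transfers until the Current Byte Count goes past zero."""
    current = programmed_count       # Current Byte Count starts at the base value
    transferred = 0
    while current >= 0:
        transferred += 1             # one byte moved by the DMA process
        current -= 1                 # register decremented once per byte
    return transferred               # Terminal Count reached here


print(bytes_until_terminal_count(0))
print(bytes_until_terminal_count(255))
```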


Each channel of the 82380 DMA Controller also 
contains a 32-bit Temporary Register for use in as- 
sembling and disassembling non-aligned data. The 
operation of this register is transparent to the user, 
although the contents of it may affect the timing of
some DMA handshake sequences. Since there is 
data storage available for each channel, the DMA 
Controller can be interrupted without loss of data. 


The 82380 DMA Controller is a slave on the bus until 
a request for DMA service is received via either a 
software request command or a hardware request 
signal. The host processor may access any of the 
control/status or channel registers at any time the 
82380 is a bus slave. Figure 3-2 shows the flow of 
operations that the DMA Controller performs. 


At the time a DMA service request is received, the 
DMA Controller issues a bus hold request to the 
host processor. The 82380 becomes the bus master 
when the host relinquishes the bus by asserting a 
hold acknowledge signal. The channel to be serv- 
iced will be the one with the highest priority at the 
time the DMA Controller becomes the bus master. 
The DMA Controller will remain in control of the bus 
until the hold acknowledge signal is removed, or un- 
til the current DMA transfer is complete. 


While the 82380 DMA Controller has control of the 
bus, it will perform the required data transfer(s). The 
type of transfer, source and destination addresses, 
and amount of data to transfer are programmed in 
the control registers of the DMA channel which re- 
ceived the request for service. 
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At completion of the DMA process, the 82380 will 
remove the bus hold request. At this time the 82380
becomes a slave again, and the host returns to being
a master. If there are other DMA channels with
requests pending, the controller will again assert the
hold request signal and restart the bus arbitration
and switching process.


Figure 3-2. Flow of DMA Controller Operation


3.2 Interface Signals


There are fourteen control signals dedicated to the
DMA process. They include eight DMA Channel Requests
(DREQn), three Encoded DMA Acknowledge
signals (EDACKn), Processor Hold and Hold Acknowledge
(HOLD, HLDA), and End-Of-Process
(EOP#). The DREQn inputs and EDACK(0-2) outputs
are handshake signals to the devices requiring
DMA service. The HOLD output and HLDA input are
handshake signals to the host processor. Figure 3-3
shows these signals and how they interconnect between
the 82380 DMA Controller, and the Requester
and Target devices.


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Figure 3-3. Requester, Target, and DMA Controller Interconnection 
(2-Cycle Configuration) 
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3.2.1 DREQn and EDACK(0-2) 


These signals are the handshake signals between 
the peripheral and the 82380. When the peripheral 
requires DMA service, it asserts the DREQn signal 
of the channel which is programmed to perform the 
service. The 82380 arbitrates the DREQn against 
other pending requests and begins the DMA pro- 
cess after finishing other higher priority processes. 


When the DMA service for the requested channel is 
in progress, the EDACK(0-2) signals represent the 
DMA channel which is accessing the Requester. 
The 3-bit code on the EDACK(0-2) lines indicates
the number of the channel presently being serviced. 
Table 3-2 shows the encoding of these signals. Note 
that Channel 4 does not have a corresponding hard- 
ware acknowledge. 


The DMA acknowledge (EDACK) signals indicate 
the active channel only during DMA accesses to the 
Requester. During accesses to the Target, 
EDACK(0-2) has the idle code (100). EDACK(0-2)
can thus be used to select a Requester device dur- 
ing a transfer. 


Table 3-2. EDACK Encoding During 
a DMA Transfer 


EDACK2    EDACK1    EDACK0    Active Channel
0         0         0         0
0         0         1         1
0         1         0         2
0         1         1         3
1         0         0         None
1         0         1         5
1         1         0         6
1         1         1         7


DREQn can be programmed as either an Asynchro- 
nous or Synchronous input. See section 3.4.1 for de- 
tails on synchronous versus asynchronous operation 
of this pin.


The EDACKn signals are always active. They either 
indicate ‘no acknowledge’ or they indicate a bus ac- 
cess to the requester. The acknowledge code is ei- 
ther 100, for an idle DMA or during a DMA access to 
the Target, or ‘n’ during a Requester access, where 
n is the binary value representing the channel. A 
simple 3-line to 8-line decoder can be used to provide
discrete acknowledge signals for the peripherals.
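
The EDACK encoding described above can be modelled in software as follows. This is a behavioural sketch with hypothetical names; it treats the three lines as a 3-bit integer with EDACK2 as the most significant bit.

```python
IDLE_CODE = 0b100   # idle DMA, or a DMA access to the Target


def active_channel(edack: int):
    """Return the channel accessing its Requester, or None for the idle code."""
    code = edack & 0b111
    return None if code == IDLE_CODE else code


def decode_3_to_8(edack: int):
    """Model of an external 3-line to 8-line decoder (one output active)."""
    code = edack & 0b111
    return [i == code for i in range(8)]


print(active_channel(0b101))
print(active_channel(0b100))
print(decode_3_to_8(0b010))
```

Because the idle code is 100, output 4 of such a decoder is active whenever no Requester is being accessed, which is consistent with Channel 4 having no hardware acknowledge.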


3.2.2 HOLD and HLDA 


The Hold Request (HOLD) and Hold Acknowledge 
(HLDA) signals are the handshake signals between 
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the DMA Controller and the host processor. HOLD is 
an output from the 82380 and HLDA is an input. 
HOLD is asserted by the DMA Controller when there 
is a pending DMA request, thus requesting the proc- 
essor to give up control of the bus so the DMA pro- 
cess can take place. The 80386 responds by assert- 
ing HLDA when it is ready to relinquish control of the 
bus. 


The 82380 will begin operations on the bus one 
clock cycle after the HLDA signal goes active. For 
this reason, other devices on the bus should be in 
the slave mode when HLDA is active. 


HOLD and HLDA should not be used to gate or se- 
lect peripherals requesting DMA service. This is be- 
cause of the use of DMA-like operations by the 
DRAM Refresh Controller. The Refresh Controller is 
arbitrated with the DMA Controller for control of the 
bus, and refresh cycles have the highest priority. A 
refresh cycle will take place between DMA cycles 
without relinquishing bus control. See section 3.4.3 
for a more detailed discussion of the interaction be- 
tween the DMA Controller and the DRAM Refresh 
Controller.


3.2.3 EOP#


EOP# is a bidirectional signal used to indicate the
end of a DMA process. The 82380 activates this as 
an output during the T2 states of the last Requester 
bus cycle for which a channel is programmed to exe- 
cute. The Requester should respond by either with- 
drawing its DMA request, or interrupting the host 
processor to indicate that the channel needs to be 
programmed with a new buffer. As an input, this sig- 
nal is used to tell the DMA Controller that the periph- 
eral being serviced does not require any more data 
to be transferred. This indicates that the current 
buffer is to be terminated. 


EOP# can be programmed as either an Asynchro- 
nous or a Synchronous input. See section 3.4.1 for 
details on synchronous versus asynchronous opera- 
tion of this pin. 


3.3 Modes of Operation 


The 82380 DMA Controller has many independent 
operating functions. When designing peripheral interfaces
for the 82380 DMA Controller, all of the
functions or modes must be considered. All of the 
channels are independent of each other (except in 
priority of operation) and can operate in any of the 
modes. Many of the operating modes, though inde- 
pendently programmable, affect the operation of 
other modes. Because of the large number of combinations
possible, each programmable mode is discussed
here with its effects on the operation of other
modes. The entire list of possible combinations will
not be presented.


Table 3-1 shows the categories of DMA features
available in the 82380. Each of the five major categories
is independent of the others. The sub-categories
are the available modes within the major function
or mode category. The following sections
explain each mode or function and its relation to other
features.


Table 3-1. DMA Operating Modes 


I. Target/Requester Definition
 a. Data Transfer Direction
 b. Device Type
 c. Increment/Decrement/Hold
II. Buffer Processes
 a. Single Buffer Process
 b. Buffer Auto-Initialize Process
 c. Buffer Chaining Process
III. Data Transfer/Handshake Modes
 a. Single Transfer Mode
 b. Demand Transfer Mode
 c. Block Transfer Mode
 d. Cascade Mode
IV. Priority Arbitration
 a. Fixed
 b. Rotating
 c. Programmable Fixed
V. Bus Operation
 a. Fly-By (Single-Cycle)/Two-Cycle
 b. Data Path Width
 c. Read, Write, or Verify Cycles


3.3.1 TARGET/REQUESTER DEFINITION 


All DMA transfers involve three devices: the DMA 
Controller, the Requester, and the Target. Since the 
devices to be accessed by the DMA Controller vary 
widely, the operating characteristics of the DMA 
Controller must be tailored to the Requester and 
Target devices.


The Requester can be defined as either the source
or the destination of the data to be transferred. This
is done by specifying a Write or a Read transfer,
respectively. In a Read transfer, the Target is the
data source and the Requester is the destination for
the data. In a Write transfer, the Requester is the
source and the Target is the destination.


The Requester and Target addresses can each be
independently programmed to be incremented, decremented,
or held constant. As an example, the
82380 is capable of reversing a string of data by
having the Requester address increment and the Target
address decrement in a memory-to-memory
transfer.
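
The string-reversal example can be simulated in software. The sketch below models a Write transfer (Requester is the source, Target is the destination) over a flat byte array; the function, its parameters and the memory layout are hypothetical and stand in for the programmed channel registers.

```python
def write_transfer(memory, requester, target, nbytes,
                   requester_step=+1, target_step=-1):
    """Copy nbytes from the Requester address to the Target address.

    Each address is stepped independently after every byte:
    +1 = increment, -1 = decrement, 0 = hold constant.
    """
    for _ in range(nbytes):
        memory[target] = memory[requester]
        requester += requester_step
        target += target_step


mem = bytearray(b"DMA!") + bytearray(4)     # source string, then empty space
write_transfer(mem, requester=0, target=7, nbytes=4)
print(bytes(mem[4:8]))
```

Setting either step to 0 models an address that is held constant, as would be typical for an I/O port.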


3.3.2 BUFFER TRANSFER PROCESSES 


The 82380 DMA Controller allows three programma- 
ble Buffer Transfer Processes. These processes de- 
fine the logical way in which a buffer of data is ac- 
cessed by the DMA. 


The three Buffer Transfer Processes include the Sin- 
gle Buffer Process, the Buffer Auto-Initialize Pro- 
cess, and the Buffer Chaining Process. These pro- 
cesses require special programming considerations. 
See the DMA Programming section for more details 
on setting up the Buffer Transfer Processes. 


Single Buffer Process


The Single Buffer Process allows the DMA channel 
to transfer only one buffer of data. When the buffer 
has been completely transferred (Current Byte 
Count decremented past zero or EOP# input ac- 
tive), the DMA process ends and the channel be- 
comes idle. In order for that channel to be used 
again, it must be reprogrammed. 


The Single Buffer Process is usually used when the
amount of data to be transferred is known exactly, 
and it is also known that there is not likely to be any 
data to follow before the operating system can 
reprogram the channel. 


Buffer Auto-Initialize Process 


The Buffer Auto-Initialize Process allows multiple 
groups of data to be transferred to or from a single 
buffer. This process does not require reprogram- 
ming. The Current Registers are automatically repro- 
grammed from the Base Registers when the current 
process is terminated, either by an expired Byte 
Count or by an external EOP# signal. The data 
transferred will always be between the same Target 
and Requester. 


The auto-initialization/process-execution cycle is re- 
peated, with a HOLD/HLDA re-arbitration, until the 
channel is either disabled or re-programmed. 
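
A minimal software model of the Buffer Auto-Initialize Process is sketched below. It assumes an incrementing Target address and treats Byte Count expiry as "decremented past zero"; the class and attribute names are hypothetical and do not correspond to actual register names or addresses.

```python
class AutoInitChannel:
    """Current Registers are reloaded from the Base Registers on termination."""

    def __init__(self, base_target: int, base_count: int):
        self.base_target = base_target     # Base Registers: never changed by the DMA
        self.base_count = base_count
        self.completed = 0                 # number of buffer passes finished
        self._reload()

    def _reload(self):
        self.cur_target = self.base_target
        self.cur_count = self.base_count

    def transfer_byte(self, eop: bool = False):
        self.cur_target += 1               # assumed: Target address increments
        self.cur_count -= 1
        if self.cur_count < 0 or eop:      # Byte Count expired or external EOP#
            self.completed += 1
            self._reload()                 # auto-initialize; no reprogramming


ch = AutoInitChannel(base_target=0x1000, base_count=3)
for _ in range(4):
    ch.transfer_byte()
print(ch.completed, hex(ch.cur_target), ch.cur_count)
```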
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Buffer Chaining Process 


The Buffer Chaining Process is useful for transfer- 
ring large quantities of data into non-contiguous 
buffer areas. In this process, a single channel is 
used to process data from several buffers, while 
having to program the channel only once. Each new 


buffer is programmed in a pipelined operation that
provides the new buffer information while the old
buffer is being processed. The chain is created by 
loading new buffer information while the 82380 DMA 
Controller is processing the Current Buffer. When 
the Current Buffer expires, the 82380 DMA Control- 
ler automatically restarts the channel using the new 
buffer information. 


Loading the new buffer information is done by an 
interrupt routine which is requested by the 82380. 
Interrupt Request 1 (IRQ1) is tied internally to the 
82380 DMA Controller for this purpose. IRQ1 is gen- 
erated by the 82380 when the new buffer informa- 
tion is loaded into the channel’s Current Registers, 
leaving the Base Registers ‘empty’. The interrupt 
service routine loads new buffer information into the 
Base Registers. The host processor is required to 
load the information for another buffer before the 
current Byte Count expires. The process repeats un- 
til the host programs the channel back to single buff- 
er operation, or until the channel runs out of buffers. 


The channel runs out of buffers when the Current 
Buffer expires and the Base Registers have not yet 
been loaded with new buffer information. When this 
occurs, the channel must be reprogrammed. 


If an external EOP# is encountered while executing
a Buffer Chaining Process, the current buffer is con- 
sidered expired and the new buffer information is 
loaded into the Current Registers. If the Base Regis- 
ters are ‘empty’, the chain is terminated. 


The channel uses the Base Target Address Register 
as an indicator of whether or not the Base Registers 
are full. When the most significant byte of the Base 
Target Register is loaded, the channel considers all 
of the Base Registers loaded, and removes the
interrupt request. This requires that the other Base
Registers (Base Requester Address, Last Byte
Count) be loaded before the Base Target Address
Register. The reason for implementing the reloading
process this way is that, for most applications, the
Byte Count and the Requester will not change from
one buffer to the next, and therefore do not need to
be reprogrammed. The details of programming the
channel for the Buffer Chaining Process can be
found in the section on DMA programming.
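

The register-loading rule described above can be
modeled in a few lines. The following Python sketch
is an illustration only, not 82380 driver code: the
class and function names (ChainingChannel,
write_base_target_byte, load_next_buffer) are
hypothetical, no real register addresses are used,
and the initial state (Base Registers empty, IRQ1
already pending) is an assumption made for the
example.

```python
class ChainingChannel:
    """Toy model of one DMA channel running the Buffer Chaining Process."""

    def __init__(self, target, requester, byte_count):
        # Current Registers: the buffer being transferred right now.
        self.current = {"target": target, "requester": requester,
                        "byte_count": byte_count}
        # Base Registers: the next buffer. Requester and Byte Count keep
        # their old values unless the host rewrites them.
        self.base = {"target": 0, "requester": requester,
                     "byte_count": byte_count}
        self.base_full = False   # Base Registers start out 'empty'
        self.irq1 = True         # assumption: a reload is requested at once
        self.active = True

    def write_base(self, name, value):
        """Host write to the Base Requester Address or Base Byte Count."""
        self.base[name] = value

    def write_base_target_byte(self, index, value):
        """Host write to one byte (0 = least significant) of Base Target."""
        shift = 8 * index
        cleared = self.base["target"] & ~(0xFF << shift)
        self.base["target"] = cleared | ((value & 0xFF) << shift)
        if index == 3:
            # Most significant byte written: the Base Registers count as
            # loaded and the interrupt request is removed.
            self.base_full = True
            self.irq1 = False

    def buffer_expired(self):
        """Byte Count expired or EOP# seen. True if the chain continues."""
        if self.base_full:
            self.current = dict(self.base)
            self.base_full = False
            self.irq1 = True     # ask the host for the following buffer
        else:
            self.active = False  # out of buffers: must be reprogrammed
        return self.active


def load_next_buffer(channel, target_address):
    """Interrupt-routine sketch: only the Target address changes."""
    for index in range(4):
        byte = (target_address >> (8 * index)) & 0xFF
        channel.write_base_target_byte(index, byte)
```

Because the most significant Target byte is written
last, the model marks the Base Registers full only
after every other value is in place, which is the
ordering the text requires of the host.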


3.3.3 DATA TRANSFER MODES 


Three Data Transfer modes are available in the 
82380 DMA Controller. They are the Single Transfer, 
Block Transfer, and Demand Transfer Modes. 
Any Data Transfer Mode can be used in conjunction
with any one of the three Buffer Transfer Modes:
Single Buffer, Auto-Initialized Buffer, and Buffer
Chaining. These modes are independently available
for all DMA channels.


Different devices being serviced by the DMA Con- 
troller require different handshaking sequences for 
data transfers to take place. Three handshaking 
modes are available on the 82380, giving the de- 
signer the opportunity to use the DMA Controller as 
efficiently as possible. The speed at which data can
be presented or read by a device can affect the way 
a DMA controller uses the host’s bus, thereby affect- 
ing not only data throughput during the DMA pro- 
cess, but also affecting the host’s performance by 
limiting its access to the bus. 


Single Transfer Mode 


In the Single Transfer Mode, one data transfer to or 
from the Requester is performed by the DMA Con- 
troller at a time. The DREQn input is arbitrated and 
the HOLD/HLDA sequence is executed for each 
transfer. Transfers continue in this manner until the 
Byte Count expires, or until EOP# is sampled active.
If the DREQn input is held active continuously, the 
entire DREQ-HOLD-HLDA-DACK sequence is re- 
peated over and over until the programmed number 
of bytes has been transferred. Bus control is re- 
leased to the host between each transfer. Figure 3-4 
shows the logical flow of events which make up a 
buffer transfer using the Single Transfer Mode. Re- 
fer to section 3.4 for an explanation of the bus con- 
trol arbitration procedure. 


The Single Transfer Mode is used for devices which
require complete handshake cycles with each data
access. Data is transferred to or from the Requester
only when the Requester is ready to perform the
transfer. Each transfer requires the entire
DREQ-HOLD-HLDA-DACK handshake cycle. Figure 3-5
shows the timing of the Single Transfer Mode cycles.


[Flowchart: INITIALIZE BUFFER; WAIT FOR DREQn OR
SOFTWARE REQUEST; EXECUTE ONE REQUESTER TRANSFER;
END OF BUFFER]

Figure 3-4. Buffer Transfer in Single Transfer Mode


Figure 3-5. DMA Single Transfer Mode


Block Transfer Mode


In the Block Transfer Mode, the DMA process is
initiated by a DMA request and continues until the
Byte Count expires, or until EOP# is activated by the
Requester. The DREQn signal need only be held active
until the first Requester access. Only a refresh cycle
will interrupt the block transfer process.


Figure 3-6 illustrates the operation of the DMA
during the Block Transfer Mode. Figure 3-7 shows the
timing of the handshake signals during Block Mode
Transfers.


Demand Transfer Mode


The Demand Transfer Mode provides the most flexible
handshaking procedures during the DMA process. A
Demand Transfer is initiated by a DMA request. The
process continues until the Byte Count expires, or an
external EOP# is encountered. If the device being
serviced (Requester) desires, it can interrupt the
DMA process by de-activating the DREQn line. Action
is taken on the condition of DREQn during Requester
accesses only. The access during which DREQn is
sampled inactive is the last Requester access which
will be performed during the current transfer.
Figure 3-8 shows the flow of events during the
transfer of a buffer in the Demand Mode.


[Flowchart: TRANSFER DATA UNTIL EOP OR TC; END OF
BUFFER]

Figure 3-6. Buffer Transfer in Block Transfer Mode


Figure 3-7. Block Mode Transfers


Figure 3-8. Buffer Transfer in Demand Transfer Mode


When the DREQn line goes inactive, the DMA con- 
troller will complete the current transfer, including 
any necessary accesses to the Target, and relin- 
quish control of the bus to the host. The current pro- 
cess information is saved (byte count, Requester 
and Target addresses, and Temporary Register). 
The Requester can restart the transfer process by
reasserting DREQn. The 82380 will arbitrate the
request with other pending requests and begin the
process where it left off. Figure 3-9 shows the timing
of handshake signals during Demand Transfer Mode
operation.
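

The practical difference between the three Data
Transfer Modes is how often the bus is handed back
to the host. The Python sketch below counts
HOLD/HLDA bus tenures for one buffer under
simplifying assumptions that are not part of the
data sheet: one Requester access per unit of Byte
Count, no refresh cycles, no EOP#, and a Requester
that always reasserts DREQn after dropping it. The
function name and arguments are hypothetical.

```python
def count_bus_tenures(mode, byte_count, dreq_during_access=None):
    """Return how many times the DMA controller takes the bus for one buffer.

    dreq_during_access[i] is the DREQn level sampled during the i-th
    Requester access; it only matters in Demand mode.
    """
    if byte_count <= 0:
        raise ValueError("byte_count must be positive")
    if mode == "single":
        # The full DREQ-HOLD-HLDA-DACK sequence runs for every transfer.
        return byte_count
    if mode == "block":
        # DREQn is needed only to start; the bus is kept until the end.
        return 1
    if mode == "demand":
        tenures = 1
        # An access with DREQn sampled inactive is the last one of its
        # tenure. The final access ends the buffer regardless of DREQn.
        for i in range(byte_count - 1):
            if not dreq_during_access[i]:
                tenures += 1
        return tenures
    raise ValueError("unknown mode: " + mode)
```

For an 8-access buffer in which DREQn is sampled
inactive only during the fourth access, this model
gives 8 tenures in Single mode, 1 in Block mode, and
2 in Demand mode.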


Using the Demand Transfer Mode allows peripherals 
to access memory in small, irregular bursts without 
wasting bus control time. The 82380 is designed to 
give the best possible bus control latency in the De- 
mand Transfer Mode. Bus control latency is defined 
here as the time from the last active bus cycle of the 
previous bus master to the first active bus cycle of 
the new bus master. The 82380 DMA Controller will 
perform its first bus access cycle two bus states af- 
ter HLDA goes active. In the typical configuration, 
bus control is returned to the host one bus state 
after the DREQn goes inactive. 


There are two cases where there may be more than
one bus state of bus control latency at the end of a 
transfer. The first is at the end of an Auto-Initialize 
process, and the second is at the end of a process 
where the source is the Requester and Two-Cycle 
transfers are used. 


When a Buffer Auto-Initialize Process is complete,
the 82380 requires seven bus states to reload the
Current Registers from the Base Registers of the
Auto-Initialized channel. The reloading is done while
the 82380 is still the bus master so that it is prepared
to service the channel immediately after
relinquishing the bus, if necessary.


Figure 3-9. Demand Mode Transfers


In the case where the Requester is the source, and 
Two-Cycle transfers are being used, there are two 
extra idle states at the end of the transfer process. 
This occurs due to housekeeping in the DMA’s inter- 
nal pipeline. These two idle states are present only 
after the very last Requester access, before the 
DMA Controller de-activates the HOLD signal. 


3.3.4 CHANNEL PRIORITY ARBITRATION 


DMA channel priority can be programmed into one 
of two arbitration methods: Fixed or Rotating. The 
four lower DMA channels and the four upper DMA 
channels operate as if they were two separate DMA 
controllers operating in cascade. The lower group of 
four channels (0-3) is always prioritized between
channels 7 and 4 of the upper group of channels
(4-7). Figure 3-10 shows a pictorial representation
of the priority grouping.


The priority can thus be set up as rotating for one 
group of channels and fixed for the other, or any 
other combination. While in Fixed Priority, the pro- 
grammer can also specify which channel has the 
lowest priority. 


Figure 3-10. DMA Priority Grouping


The 82380 DMA Controller defaults to Fixed Priority. 
Channel 0 has the highest priority, then 1, 2, 3, 4, 5, 
6, 7. Channel 7 has the lowest priority. Any time the 
DMA Controller arbitrates DMA requests, the re- 
questing channel with the highest priority will be 
serviced next. 


Fixed Priority can be entered at any time by a
software command. The priority levels in effect
after the mode switch are determined by the current
setting of the Programmable Priority.


Programmable Priority is available for fixing the prior- 
ity of the DMA channels within a group to levels oth- 
er than the default. Through a software command, 
the channel to have the lowest priority in a group 
can be specified. Each of the two groups of four 
channels can have the priority fixed in this way. The 
other channels in the group will follow the natural 
Fixed Priority sequence. This mode affects only the 
priority levels while operating with Fixed Priority. 


For example, if channel 2 is programmed to have the 
lowest priority in its group, channel 3 has the highest 
priority. In descending order, the other channels 
would have the following priority: (3, 0, 1, 2), 4, 5, 6, 
7 (channel 2 lowest, channel 3 highest). If the upper 
group were programmed to have channel 5 as the 
lowest priority channel, the priority would be (again, 
highest to lowest): 6, 7, (3, 0, 1, 2), 4, 5. Figure 3-11 
shows this example pictorially. The lower group is 
always prioritized as a fifth channel of the upper 
group (between channels 4 and 7). 
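

The ordering rule in this example can be written as
a short calculation. The Python sketch below is an
illustration under stated assumptions: the function
names and the "LOWER" placeholder for the lower
group's slot are hypothetical, and the sketch models
only the resulting priority order, not the software
command that programs it.

```python
def group_order(channels, lowest):
    """Rotate a group's natural order so that `lowest` comes last."""
    i = channels.index(lowest)
    return channels[i + 1:] + channels[:i + 1]


def fixed_priority(lower_lowest=3, upper_lowest=7):
    """Return all eight channels, highest priority first."""
    lower = group_order([0, 1, 2, 3], lower_lowest)
    # The lower group occupies a fifth slot of the upper group,
    # between channels 7 and 4.
    upper = group_order(["LOWER", 4, 5, 6, 7], upper_lowest)
    order = []
    for slot in upper:
        order.extend(lower if slot == "LOWER" else [slot])
    return order
```

With the defaults the function returns channels 0
through 7 in order. Calling fixed_priority(2, 5)
returns 6, 7, 3, 0, 1, 2, 4, 5, which matches the
second ordering given in the text.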


[Figure 3-11, from High Priority to Low Priority:
CHANNEL 6, CHANNEL 7, PHANTOM, CHANNEL 4, CHANNEL 5]

Figure 3-11. Example of Programmed Priority


The DMA Controller will only accept Programmable 
Priority commands while the addressed group is op- 
erating in Fixed Priority. Switching from Fixed to Ro- 
tating Priority preserves the current priority levels. 
Switching from Rotating to Fixed Priority returns the 
priority levels to those which were last programmed 
by use of Programmable Priority. 


Rotating Priority allows the devices using DMA to
share the system bus more evenly. An individual
channel does not retain highest priority after being
serviced; priority is passed to the next highest
priority channel in the group. The channel which was
most recently serviced inherits the lowest priority.
This rotation occurs each time a channel is serviced.
Figure 3-12 shows the sequence of events as priority
is passed between channels. Note that the lower
group rotates within the upper group, and that
servicing a channel within the lower group causes
rotation within the group as well as rotation of the
upper group.
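

The rotation rule, including the double rotation that
occurs when a lower-group channel is serviced, can be
simulated as follows. This Python sketch assumes both
groups are in Rotating Priority and starts from the
default order; the class name, the "LOWER"
placeholder, and the use of a set for pending
requests are all hypothetical modeling choices.

```python
class RotatingArbiter:
    """Toy model of Rotating Priority across both channel groups."""

    def __init__(self):
        self.lower = [0, 1, 2, 3]               # highest priority first
        self.upper = ["LOWER", 4, 5, 6, 7]      # lower group is one slot

    @staticmethod
    def _make_lowest(ring, item):
        """Rotate `ring` so `item` ends up with the lowest priority."""
        i = ring.index(item)
        return ring[i + 1:] + ring[:i + 1]

    def order(self):
        """All eight channels, highest priority first."""
        out = []
        for slot in self.upper:
            out.extend(self.lower if slot == "LOWER" else [slot])
        return out

    def service(self, pending):
        """Service the highest-priority pending channel and rotate."""
        for channel in self.order():
            if channel in pending:
                break
        else:
            return None                          # nothing is requesting
        if channel in self.lower:
            # Double rotation: within the lower group and of the upper group.
            self.lower = self._make_lowest(self.lower, channel)
            self.upper = self._make_lowest(self.upper, "LOWER")
        else:
            self.upper = self._make_lowest(self.upper, channel)
        return channel
```

Starting from the default order, requests on
channels 2 and 6 result in channel 2 being serviced
first, after which the order becomes 4, 5, 6, 7, 3,
0, 1, 2, so channel 6 wins the next arbitration.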


[Figure 3-12 sequence, starting from the default
order (highest to lowest):
DREQ2 and DREQ6: process channel 2. Channel 2 drops
to lowest priority within group. Lower group drops
to lowest priority within upper group (Double
Rotation).
DREQ6 (still) and DREQ7: process channel 6. Channel 6
drops to lowest priority within group.
DREQ7 (still) and DREQ0: process channel 7. Channel 7
drops to lowest priority within group.
DREQ0 (still) and DREQ1: process channel 0. Channel 0
drops to lowest priority within group (Double
Rotation).
Process channel 1. Channel 1 drops to lowest priority
within group.]

Figure 3-12. Rotating Channel Priority. Lower and Upper
groups are programmed for the Rotating Priority Mode.


3.3.5 COMBINING PRIORITY MODES


Since the DMA Controller operates as two
four-channel controllers in cascade, the overall
priority scheme of all eight channels can take on a
variety of forms. There are four possible
combinations of priority modes between the two groups
of channels: Fixed Priority only (default), Fixed
Priority upper group/Rotating Priority lower group,
Rotating Priority upper group/Fixed Priority lower
group, and Rotating Priority only. Figure 3-13
illustrates the operation of the two combined
priority methods.

[Figure 3-13 shows, for each case, the priority order
by default and after servicing channel 2, channel 6,
and channel 1. CASE 2: 0-3 Rotating Priority, 4-7
Fixed Priority.]

Figure 3-13. Combining Priority Modes


3.3.6 BUS OPERATION


Data may be transferred by the DMA Controller using
two different bus cycle operations: Fly-By
(one-cycle) and Two-Cycle. These bus handshake
methods are selectable independently for each channel
through a command register. Device data path
widths are independently programmable for both
Target and Requester. Also selectable through
software is the direction of data transfer. All of these
parameters affect the operation of the 82380 on a
bus-cycle by bus-cycle basis.


3.3.6.1 Fly-By Transfers 


The Fly-By Transfer Mode is the fastest and most 
efficient way to use the 82380 DMA Controller to 
transfer data. In this method of transfer, the data is 
written to the destination device at the same time it 
is read from the source. Only one bus cycle is used
to accomplish the transfer. 


In the Fly-By Mode, the DMA acknowledge signal is
used to select the Requester. The DMA Controller
simultaneously places the address of the Target on
the address bus. The state of M/IO# and W/R#
during the Fly-By transfer cycle indicates the type of
Target and whether the Target is being written to or
read from. The Target’s Bus Size is used as an
incrementer for the Byte Count. The Requester
address registers are ignored during Fly-By transfers.


Note that memory-to-memory transfers cannot be 
done using the Fly-By Mode. Only one memory or 
I/O address is generated by the DMA Controller at a
time during Fly-By transfers. Only one of the devices
being accessed can be selected by an address. 
Also, the Fly-By method of data transfer limits the 
hardware to accesses of devices with the same data 
bus width. The Temporary Registers are not affect- 
ed in the Fly-By Mode. 


Fly-By transfers also require that the data paths of 
the Target and Requester be directly connected. 
This requires that successive Fly-By accesses be to 
doubleword boundaries, or that the Requester be 
capable of switching its connections to the data bus. 


3.3.6.2 Two-Cycle Transfers 


Two-Cycle transfers can also be performed by the
82380 DMA Controller. These transfers require at
least two bus cycles to execute. The data being
transferred is read into the DMA Controller’s
Temporary Register during the first bus cycle(s). The
second bus cycle is used to write the data from the
Temporary Register to the destination.


If the addresses of the data being transferred are
not word or doubleword aligned, the 82380 will
recognize the situation and read and write the data in
groups of bytes, placing them always at the proper
destination. This process of collecting the desired
bytes and putting them together is called ‘byte
assembly’. The reverse process (reading from aligned
locations and writing to non-aligned locations) is
called ‘byte disassembly’.


The assembly/disassembly process takes place 
transparent to the software, but can only be done 
while using the Two-Cycle transfer method. The 
82380 will always perform the assembly/disassembly
process as necessary for the current data transfer.
Any data path widths for either the Requester or
Target can be used in the Two-Cycle Mode. This is 
very convenient for interfacing existing 8- and 16-bit 
peripherals to the 80386’s 32-bit bus. 


The 82380 DMA Controller always attempts to fill 
the Temporary Register from the source before writ- 
ing any data to the destination. If the process is ter- 
minated before the Temporary Register is filled (TC 
or EOP#), the 82380 will write the partial data to the
destination. If a process is temporarily suspended 
(such as when DREQn is de-activated during a de- 
mand transfer), the contents of a partially filled Tem- 
porary Register will be stored within the 82380 until 
the process is restarted. 


For example, if the source is specified as an 8-bit
device and the destination as a 32-bit device, four
reads from the 8-bit source are needed to fill the
Temporary Register. Then the 82380 will write the
32-bit contents to the destination. This cycle will
repeat until the process is terminated or suspended.
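

The fill-then-write behavior of the Temporary
Register can be traced with a small simulation. The
Python sketch below is a simplified model, not a
description of the 82380's internal logic: it assumes
aligned addresses and a 4-byte Temporary Register,
and the function name and parameters are hypothetical.

```python
def two_cycle_sequence(data, src_bytes=1, dst_bytes=4, temp_bytes=4):
    """List the bus operations of a Two-Cycle transfer of `data`.

    Each entry is ("read", chunk) or ("write", chunk). The Temporary
    Register is filled from the source before anything is written.
    """
    ops = []
    temp = bytearray()
    pos = 0
    while pos < len(data):
        # Read from the source until the Temporary Register is full or
        # the buffer ends (Byte Count expired).
        while len(temp) < temp_bytes and pos < len(data):
            chunk = data[pos:pos + src_bytes]
            temp += chunk
            pos += len(chunk)
            ops.append(("read", bytes(chunk)))
        # Write the collected bytes, full or partial, to the destination.
        for i in range(0, len(temp), dst_bytes):
            ops.append(("write", bytes(temp[i:i + dst_bytes])))
        temp.clear()
    return ops
```

For a 6-byte buffer moved from an 8-bit source to a
32-bit destination, the model produces four reads,
one 4-byte write, two more reads, and a final 2-byte
write of the partial data.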


Note that in the Single Transfer Mode (see section
3.3.3), the internal circuitry of the DMA Controller
actually executes single transfers by removing the
DREQ from the internal arbitration. Thus single
transfers from an 8-bit Requester to a 32-bit Target
will consist of four complete and independent 8-bit
Requester cycles, between which bus control is
released and re-requested. Finally, the 32-bit data
will be transferred to the Target device from the
Temporary Register before the fifth Requester cycle.


With Two-Cycle transfers, the devices that the 
82380 accesses can reside at any address within 
I/O or memory space. The device must be able to 
decode the byte-enables (BEn#). Also, if the device 
cannot accept data in byte quantities, the program- 
mer must take care not to allow the DMA Controller 
to access the device on any address other than the 
device boundary. 


3.3.6.3 Data Path Width and Data Transfer Rate
Considerations


The number of bus cycles used to transfer a single 
‘word’ of data is affected by whether the Two-Cycle 
or the Fly-By (Single-Cycle) transfer method is used. 


The number of bus cycles used to transfer data di- 
rectly affects the data transfer rate. Inefficient use of 
bus cycles will decrease the effective data transfer 
rate that can be obtained. Generally, the data trans- 
fer rate is halved by using Two-Cycle transfers in- 
stead of Fly-By transfers. 


The choice of data path widths of both Target and 
Requester affects the data transfer rate also. During 
each bus cycle, the largest pieces of data possible 
should be transferred.


The data path width of the devices to be accessed 
must be programmed into the DMA controller. The 
82380 defaults after reset to 8-bit-to-8-bit data trans- 
fers, but the Target and Requester can have differ- 
ent data path widths, independent of each other and 
independent of the other channels. Since this is a
software programmable function, more discussion of
the uses of this feature is found in the section on
programming.


3.3.6.4 Read, Write, and Verify Cycles 


Three different bus cycle types may be used in a 
data transfer. They are the Read, Write, and Verify 
cycles. These cycle types dictate the way in which 
the 82380 operates on the data to be transferred. 


A Read Cycle transfers data from the Target to the 
Requester. A Write Cycle transfers data from the 
Requester to the Target. In a Fly-By transfer, the
address and bus status signals indicate the access
(read or write) to the Target; the access to the Re- 
quester is assumed to be the opposite. 


The Verify Cycle is used to perform a data read only. 
No write access is indicated or assumed in a Verify 
Cycle. The Verify Cycle is useful for validating block 
fill operations. An external comparator must be pro- 
vided to do any comparisons on the data read. 


3.4 Bus Arbitration and Handshaking 


Figure 3-14 shows the flow of events in the DMA
request arbitration process. The arbitration
sequence starts when the Requester asserts a DREQn
(or DMA service is requested by software). Figure
3-15 shows the timing of the sequence of events
following a DMA request. This sequence is executed
for each channel that is activated. The DREQn
signal can be replaced by a software DMA channel
request with no change in the sequence.


WAIT FOR DREQn OR SOFTWARE REQUEST
REQUESTER ASSERTS DREQn
82380 ASSERTS HOLD REQUEST
80386 ASSERTS HOLD ACKNOWLEDGE
82380 ARBITRATES PENDING REQUESTS
82380 PERFORMS HIGHEST PRIORITY
TRANSFER (SEE DATA TRANSFER MODES)
82380 DE-ASSERTS HOLD REQUEST


Figure 3-14. Bus Arbitration and DMA Sequence


After the Requester asserts the service request, the 
82380 will request control of the bus via the HOLD 
signal. The 82380 will always assert the HOLD sig- 
nal one bus state after the service request is assert- 
ed. The 80386 responds by asserting the HLDA sig- 
nal, thus releasing control of the bus to the 82380 
DMA Controller. 


Priority of pending DMA service requests is arbitrat- 
ed during the first state after HLDA is asserted by 
the 80386. The next state will be the beginning of 
the first transfer access of the highest priority pro- 
cess. 
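

The state-by-state ordering given in the two
paragraphs above can be laid out as a simple
timeline. The Python sketch below follows that text
literally; the note under Figure 3-15 places priority
resolution one state earlier, and the sketch does not
model that variant. The function name, the dictionary
keys, and the hlda_delay parameter (the number of bus
states the host takes to answer HOLD) are assumptions
made for illustration.

```python
def dma_start_timeline(dreq_state, hlda_delay):
    """Return the bus state in which each step of a DMA start occurs.

    dreq_state: bus state in which the service request is asserted.
    hlda_delay: states between HOLD and HLDA (host dependent, assumed).
    """
    hold_state = dreq_state + 1          # HOLD one state after the request
    hlda_state = hold_state + hlda_delay
    arbitration_state = hlda_state + 1   # first state after HLDA
    first_access_state = arbitration_state + 1
    return {
        "dreq": dreq_state,
        "hold": hold_state,
        "hlda": hlda_state,
        "arbitration": arbitration_state,
        "first_access": first_access_state,
    }
```

Whatever value is chosen for hlda_delay, the first
transfer access in this model begins two bus states
after HLDA, which agrees with the bus control latency
figure quoted for the Demand Transfer Mode.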


When the 82380 DMA Controller is finished with its
current bus activity, it returns control of the bus to
the host processor. This is done by driving the
HOLD signal inactive. The 82380 does not drive any
address or data bus signals after HOLD goes low. It
enters the Slave Mode until another DMA process is
requested. The processor acknowledges that it has
regained control of the bus by forcing the HLDA
signal inactive. Note that the 82380’s DMA Controller
will not re-request control of the bus until the entire
HOLD/HLDA handshake sequence is complete.


The 82380 DMA Controller will terminate a current 
DMA process for one of three reasons: expired byte 
count, end-of-process command (EOP# activated)
from a peripheral, or de-activated DMA request sig- 
nal. In each case, the controller will de-assert HOLD 
immediately after completing the data transfer in 
progress. These three methods of process termina- 
tion are illustrated in Figures 3-16, 3-19, and 3-18, 
respectively. 


An expired byte count indicates that the current
process is complete as programmed and the channel
has no further transfers to process. The channel
must be restarted according to the currently
programmed Buffer Transfer Mode, or reprogrammed
completely, including a new Buffer Transfer Mode.


If the peripheral activates the EOP# signal, it is
indicating that it will not accept or deliver any more
data for the current buffer. The 82380 DMA Controller
considers this as a completion of the channel’s
current process and interprets the condition the same
way as if the byte count expired.


The action taken by the 82380 DMA Controller in 
response to a de-activated DREQn signal depends 
on the Data Transfer Mode of the channel. In the 
Demand Mode, data transfers will take place as long 
as the DREQn is active and the byte count has not 
expired. In the Block Mode, the controller will
complete the entire block transfer without
relinquishing the bus, even if DREQn goes inactive
before the transfer is complete. In the Single Mode,
the controller will execute single data transfers,
relinquishing the bus between each transfer, as long
as DREQn is active.


Figure 3-15. Beginning of a DMA process

Channel priority resolution takes place during the
bus state before HLDA is asserted, allowing the DMA
Controller to respond to HLDA without extra idle bus
states.


Normal termination of a DMA process due to
expiration of the byte count (Terminal Count-TC) is
shown in Figure 3-16. The condition of DREQn is
ignored until after the process is terminated. If the
channel is programmed to auto-initialize, HOLD will be
held active for an additional seven clock cycles while
the auto-initialization takes place.


Table 3-3 shows the DMA channel activity due to
EOP# or Byte Count expiring (Terminal Count).


[Table 3-3. Columns (Buffer Process): Single or
Chaining-Base Empty; Auto-Initialize; Chaining-Base
Loaded. Rows: Event (Terminal Count, EOP# Input);
Results (Current Registers, Channel Mask, EOP#
Output, Terminal Count Status, Software Request).
Cell values are not legible in this copy.]


Figure 3-16. Termination of a DMA Process Due to Expiration of Current Byte Count


The 82380 always relinquishes control of the bus
between channel services. This allows the hardware 
designer the flexibility to externally arbitrate bus hold 
requests, if desired. If another DMA request is pend- 
ing when a higher priority channel service is com- 
pleted, the 82380 will relinquish the bus until the 
hold acknowledge is inactive. One bus state after 
the HLDA signal goes inactive, the 82380 will assert 
HOLD again. This is illustrated in Figure 3-17. 


3.4.1 SYNCHRONOUS AND ASYNCHRONOUS 
SAMPLING OF DREQn AND EOP# 


As an indicator that a DMA service is to be started, 
DREQn is always sampled asynchronously. It is 
sampled at the beginning of a bus state and acted 
upon at the end of the state. Figure 3-15 illustrates 
the start of a DMA process due to a DREQn input. 


The DREQn and EOP# inputs can be programmed
to be sampled either synchronously or asynchro- 
nously to signal the end of a transfer. 


The synchronous mode affords the Requester one
bus state of extra time to react to an access. This
means the Requester can terminate a process on
the current access, without losing any data. The
asynchronous mode requires that the input signal be
presented prior to the beginning of the last state of
the Requester access.


The timing relationships of the DREQn and EOP#
signals to the termination of a DMA transfer are
shown in Figures 3-18 and 3-19. Figure 3-18 shows
the termination of a DMA transfer due to inactive
DREQn. Figure 3-19 shows the termination of a
DMA process due to an active EOP# input.


In the Synchronous Mode, DREQn and EOP# are
sampled at the end of the last state of every
Requester data transfer cycle. If EOP# is active or
DREQn is inactive at this time, the 82380 recognizes
this access to the Requester as the last transfer. At
this point, the 82380 completes the transfer in
progress, if necessary, and returns bus control to the
host.


In the asynchronous mode, the inputs are sampled 
at the beginning of every state of a Requester
access. The 82380 waits until the end of the state to
act on the input. 


DREQn and EOP# are sampled at the latest possi- 
ble time when the 82380 can determine if another 
transfer is required. In the Synchronous Mode, 
DREQn and EOP# are sampled on the trailing edge
of the last bus state before another data access cy- 
cle begins. The Asynchronous Mode requires that 
the signals be valid one clock cycle earlier. 


Figure 3-17. Switching between Active DMA Channels


Figure 3-18. Termination of a DMA Process Due to De-Asserting DREQn


Figure 3-19. Termination of a DMA Process Due to an External EOP#


While in the Pipeline Mode, if the NA# signal is
sampled active during a transfer, the end of the state
where NA# was sampled active is when the 82380
decides whether to commit to another transfer. The
device must de-assert DREQn or assert EOP# before
NA# is asserted, otherwise the 82380 will commit to
another, possibly undesired, transfer.


Synchronous DREQn and EOP# sampling allows
the peripheral to prevent the next transfer from
occurring by de-activating DREQn or asserting EOP#
during the current Requester access, before the
82380 DMA Controller commits itself to another
transfer. The DMA Controller will not perform the
next transfer if it has not already begun the bus
cycle. Asynchronous sampling allows less stringent
timing requirements than the Synchronous Mode,
but requires that the DREQn signal be valid at the
beginning of the next to last bus state of the current
Requester access.


Using the Asynchronous Mode with zero wait states
can be very difficult. Since the addresses and
control signals are driven by the 82380 near half-way
through the first bus state of a transfer, and the
Asynchronous Mode requires that DREQn be active
before the end of the state, the peripheral being
accessed is required to present DREQn only a few
nanoseconds after the control information is
available. This means that the peripheral’s control
logic must be extremely fast (practically non-causal).
An alternative is the Synchronous Mode.


3.4.2 ARBITRATION OF CASCADED MASTER
REQUESTS


The Cascade Mode allows another DMA-type de- 
vice to share the bus by arbitrating its bus accesses 
with the 82380’s. Seven of the eight DMA channels 
(0-3 and 5-7) can be connected to a cascaded
device. The cascaded device requests bus control
through the DREQn line of the channel which is pro- 
grammed to operate in Cascade Mode. Bus hold ac- 
knowledge is signaled to the cascaded device 
through the EDACK lines. When the EDACK lines 
are active with the code for the requested cascade 
channel, the bus is available to the cascaded master
device.


[Block diagram labels: BUS MASTER 0, HOLD REQUEST,
LATCHED DECODER, HOLD ACKNOWLEDGE, BUS MASTER n,
HOLD REQUEST]

Figure 3-20. Cascaded Bus Master


A Cascade cycle begins the same way a regular
DMA cycle begins. The requesting bus master
asserts the DREQn line on the 82380. This bus control
request is arbitrated as any other DMA request would
be. If any channel receives a DMA request, the
82380 requests control of the bus. When the host
acknowledges that it has released bus control, the
82380 acknowledges to the requesting master that it
may access the bus. The 82380 enters an idle state
until the new master relinquishes control.


A cascade cycle will be terminated by one of two
events: DREQn going inactive, or HLDA going inactive.
The normal way to terminate the cascade cycle
is for the cascaded master to drop the DREQn signal.
Figure 3-21 shows the two cascade cycle termination
sequences.


The Refresh Controller may interrupt the cascaded 
master to perform a refresh cycle. If this occurs, the 
82380 DMA Controller will de-assert the EDACK sig- 
nal (hold acknowledge to cascaded master) and wait 
for the cascaded master to remove its hold request. 
When the 82380 regains bus control, it will perform 
the refresh cycle in its normal fashion. After the re- 
fresh cycle has been completed, and if the cascad- 
ed device has re-asserted its request, the 82380 will 
return control to the cascaded master which was
interrupted.
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Figure 3-21. Cascade Cycle Termination 
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The 82380 assumes that it is the only device moni- 
toring the HLDA signal. If the system designer 
wishes to place other devices on the bus as bus 
masters, the HLDA from the processor must be in- 
tercepted before presenting it to the 82380. Using 
the Cascade capability of the 82380 DMA Controller 
offers a much better solution.


3.4.3 ARBITRATION OF REFRESH REQUESTS


The arbitration of refresh requests by the DRAM Re- 
fresh Controller is slightly different from normal DMA 
channel request arbitration. The 82380 DRAM Re- 
fresh Controller always has the highest priority of 
any DMA process. It also can interrupt a process in 
progress. Two types of processes in progress may 
be encountered: normal DMA, and bus master cas- 
cade. 


In the event of a refresh request during a normal
DMA process, the DMA Controller will complete the
data transfer in progress and then execute the refresh
cycle before continuing with the current DMA
process. The priority of the interrupted process is
not lost. If the data transfer cycle interrupted by the
Refresh Controller is the last of a DMA process, the
refresh cycle will always be executed before control
of the bus is transferred back to the host.


When the Refresh Controller request occurs during
a cascade cycle, the Refresh Controller must be assured
that the cascaded master device has relinquished
control of the bus before it can execute the
refresh cycle. To do this, the DMA Controller drops
the EDACK signal to the cascaded master and waits
for the corresponding DREQn input to go inactive.
By dropping the DREQn signal, the cascaded master
relinquishes the bus. The Refresh Controller then
performs the refresh cycle. Control of the bus is
returned to the cascaded master if DREQn returns to
an active state before the end of the refresh cycle;
otherwise control is passed to the processor and the
cascaded master loses its priority.
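
The hand-off described above can be pictured as a small state machine. The sketch below is only a behavioral model of this paragraph, not a description of the 82380's internal logic; every type and function name in it is hypothetical, and normal cascade termination (DREQn dropped without a refresh) is left out.

```c
#include <stdbool.h>

/* Hypothetical model of a refresh request arriving during a cascade cycle. */
typedef enum {
    ST_CASCADE,       /* cascaded master owns the bus            */
    ST_WAIT_RELEASE,  /* EDACK dropped, waiting for DREQn low    */
    ST_REFRESH,       /* Refresh Controller owns the bus         */
    ST_HOST           /* bus handed back to the host processor   */
} arb_state_t;

typedef struct {
    arb_state_t state;
    bool        edack_active;
} arb_t;

/* Advance the model by one step given the sampled inputs. */
void arb_step(arb_t *a, bool refresh_req, bool dreq, bool refresh_done)
{
    switch (a->state) {
    case ST_CASCADE:
        if (refresh_req) {            /* drop EDACK to ask for the bus back */
            a->edack_active = false;
            a->state = ST_WAIT_RELEASE;
        }
        break;
    case ST_WAIT_RELEASE:
        if (!dreq)                    /* master has relinquished the bus */
            a->state = ST_REFRESH;
        break;
    case ST_REFRESH:
        if (refresh_done) {
            if (dreq) {               /* request is back in time: resume */
                a->edack_active = true;
                a->state = ST_CASCADE;
            } else {                  /* too late: priority is lost */
                a->state = ST_HOST;
            }
        }
        break;
    case ST_HOST:
        break;
    }
}
```

In this model the cascaded master keeps its place only if DREQn is already active when the refresh cycle completes, which mirrors the condition stated in the text.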


3.5 DMA Controller Register Overview 


The 82380 DMA Controller contains 44 registers
which are accessible to the host processor. Twenty-four
of these registers contain the device addresses
and data counts for the individual DMA
channels (three per channel). The remaining registers
are control and status registers for initiating and
monitoring the operation of the 82380 DMA Controller.
Table 3-4 lists the DMA Controller's registers
and their accessibility.


Table 3-4. DMA Controller Registers

Register Name                   Access

Control/Status Registers: One Each Per Group

Command Register I              Write Only
Command Register II             Write Only
Mode Register I                 Write Only
Mode Register II                Write Only
Software Request Register       Read/Write
Mask Set-Reset Register         Write Only
Mask Read-Write Register        Read/Write
Status Register                 Read Only
Bus Size Register               Write Only
Chaining Register               Read/Write

Channel Registers: One Each Per Channel

Base Target Address             Write Only
Current Target Address          Read Only
Base Requester Address          Write Only
Current Requester Address       Read Only
Base Byte Count                 Write Only
Current Byte Count              Read Only


3.5.1 CONTROL/STATUS REGISTERS 


The following registers are available to the host
processor for programming the 82380 DMA Controller
into its various modes and for checking the operating
status of the DMA processes. Each set of four
DMA channels has one of each of these registers
associated with it.


Command Register I


Enables or disables the DMA channels as a group.
Sets the Priority Mode (Fixed or Rotating) of the
group. This write-only register is cleared by a hardware
reset, defaulting to all channels enabled and
Fixed Priority Mode.


Command Register II 


Sets the sampling mode of the DREQn and EOP#
inputs. Also sets the lowest priority channel for the
group in the Fixed Priority Mode. The functions
programmed through Command Register II default after
a hardware reset to: asynchronous DREQn and
EOP#, and channels 3 and 7 lowest priority.


Mode Register I


Mode Register I is identical in function to the Mode
register of the 8237A. It programs the following functions
for an individually selected channel:

Type of Transfer: read, write, verify
Auto-Initialize: enable or disable
Target Address Count: increment or decrement
Data Transfer Mode: demand, single, block, cascade


Mode Register I functions default to the following
after reset: verify transfer, Auto-Initialize disabled,
Increment Target address, Demand Mode.


Mode Register II 


Programs the following functions for an individually
selected channel:

Target Address Hold: enable or disable
Requester Address Count: increment or decrement
Requester Address Hold: enable or disable
Target Device Type: I/O or Memory
Requester Device Type: I/O or Memory
Transfer Cycles: Two-Cycle or Fly-By


Mode Register II functions are defined as follows
after a hardware reset: Disable Target Address Hold,
Increment Requester Address, Target (and Requester)
in memory, Fly-By Transfer Cycles. Note:
Requester Device Type is ignored in Fly-By Transfers.


Software Request Register 


The DMA Controller can respond to service requests 
which are initiated by software. Each channel has an 
internal request status bit associated with it. The 
host processor can write to this register to set or 
reset the request bit of a selected channel. 


The status of the group’s software DMA service re- 
quests can be read from this register as well. Each 
request bit is cleared upon Terminal Count or external
EOP#.


The software DMA requests are non-maskable and 
subject to priority arbitration with all other software 
and hardware requests. The entire register is 
cleared by a hardware reset. 


Mask Registers 


Each channel has associated with it a mask bit 
which can be set/reset to disable/enable that chan- 
nel. Two methods are available for setting and clear- 
ing the mask bits. The Mask Set/Reset Register is a 
write-only register which allows the host to select an 
individual channel and either set or reset the mask 
bit for that channel only. The Mask Read/Write Reg- 
ister is available for reading the mask bit status and 
for writing mask bits in groups of four.
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The mask bits of a group may be cleared in one step 
by executing the Clear Mask Command. See the 
DMA Programming section for details. A hardware 
reset sets all of the channel mask bits, disabling all 
channels. 
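
As an illustration of the two masking methods, the sketch below builds the byte values a host might write. The bit positions are an assumption: they follow the 8237A convention (channel select in D1-D0, mask bit in D2, and one mask bit per channel in D3-D0 for the group register), which the 82380 is described as extending. The function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Mask Set/Reset byte for one channel.
   Assumed layout: D1-D0 = channel within the group, D2 = 1 to set the mask. */
uint8_t mask_set_reset_byte(unsigned channel_in_group, bool set_mask)
{
    return (uint8_t)((channel_in_group & 0x3u) | (set_mask ? 0x04u : 0x00u));
}

/* Mask Read/Write byte for a whole group.
   Assumed layout: bit n = mask for channel n of the group, 1 = disabled. */
uint8_t group_mask_byte(bool m0, bool m1, bool m2, bool m3)
{
    return (uint8_t)((m0 ? 0x1u : 0u) | (m1 ? 0x2u : 0u) |
                     (m2 ? 0x4u : 0u) | (m3 ? 0x8u : 0u));
}
```

Under these assumptions, `group_mask_byte(true, true, true, true)` corresponds to the state after a hardware reset, with every channel of the group disabled.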


Status Register 


The Status register is a read-only register which con- 
tains the Terminal Count (TC) and Service Request 
status for a group. Four bits indicate the TC status 
and four bits indicate the hardware request status 
for the four channels in the group. The TC bits are 
set when the Byte Count expires, or when an exter- 
nal EOP# is asserted. These bits are cleared by 
reading from the Status Register. The Service Re- 
quest bit for a channel indicates when there is a 
hardware DMA request (DREQn) asserted for that 
channel. When the request has been removed, the 
bit is cleared. 
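
Because reading the Status Register clears the TC bits, software usually reads it once and decodes the saved copy. The helpers below sketch that decoding. The nibble assignment (TC bits in D3-D0, request bits in D7-D4) is assumed from the 8237A layout, and the function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* True if the saved status byte shows Terminal Count (or external EOP#)
   for the given channel of the group. Assumes TC bits occupy D3-D0. */
bool status_tc_expired(uint8_t status, unsigned channel_in_group)
{
    return ((status >> (channel_in_group & 0x3u)) & 0x1u) != 0;
}

/* True if the saved status byte shows a hardware DREQn pending for the
   given channel of the group. Assumes request bits occupy D7-D4. */
bool status_request_pending(uint8_t status, unsigned channel_in_group)
{
    return ((status >> (4u + (channel_in_group & 0x3u))) & 0x1u) != 0;
}
```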


Bus Size Register 


This write-only register is used to define the bus size 
of the Target and Requester of a selected channel. 
The bus sizes programmed will be used to dictate 
the sizes of the data paths accessed when the DMA 
channel is active. The values programmed into this 
register affect the operation of the Temporary Regis- 
ter. Any byte-assembly required to make the trans- 
fers using the specified data path widths will be done 
in the Temporary Register. The Bus Size register of 
the Target is used as an increment/decrement value 
for the Byte Counter and Target Address when in 
the Fly-By Mode. Upon reset, all channels default to 
8-bit Targets and 8-bit Requesters. 


Chaining Register 


As a command or write register, the Chaining regis- 
ter is used to enable or disable the Chaining Mode 
for a selected channel. Chaining can either be dis- 
abled or enabled for an individual channel, indepen- 
dently of the Chaining Mode status of other chan- 
nels. After a hardware reset, all channels default to 
Chaining disabled. 


When read by the host, the Chaining Register pro- 
vides the status of the Chaining Interrupt of each of 
the channels. These interrupt status bits are cleared 
when the new buffer information has been loaded. 


3.5.2 CHANNEL REGISTERS 


Each channel has three individually programmable 
registers necessary for the DMA process; they are 
the Base Byte Count, Base Target Address, and 
Base Requester Address registers. The 24-bit Base 
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Byte Count register contains the number of bytes to 
be transferred by the channel. The 32-bit Base Target
Address Register contains the beginning address
(memory or I/O) of the Target device. The 32-bit
Base Requester Address register contains the
base address (memory or I/O) of the device which is
to request DMA service.


Three more registers for each DMA channel exist 
within the DMA Controller which are directly related 
to the registers mentioned above. These registers 
contain the current status of the DMA process. They 
are the Current Byte Count register, the Current Tar- 
get Address, and the Current Requester Address. It 
is these registers which are manipulated (increment- 
ed, decremented, or held constant) by the 82380 
DMA Controller during the DMA process. The Cur- 
rent registers are loaded from the Base registers. 


The Base registers are loaded when the host proc- 
essor writes to the respective channel register ad- 
dresses. Depending on the mode in which the chan- 
nel is operating, the Current registers are typically 
loaded in the same operation. Reading from the 
channel register addresses yields the contents of 
the corresponding Current register. 


To maintain compatibility with software which ac- 
cesses an 8237A, a Byte Pointer Flip-Flop is used to 
control access to the upper and lower bytes of some 
words of the Channel Registers. These words are
accessed as byte pairs at single port addresses. The
Byte Pointer Flip-Flop acts as a one-bit pointer
which is toggled each time a qualifying Channel
Register byte is accessed. It always points to the
next logical byte to be accessed of a pair of bytes.


The Channel registers are arranged as pairs of 
words, each pair with its own port address. Address- 
ing the port with the Byte Pointer Flip-Flop reset ac- 
cesses the least significant byte of the pair. The 
most significant byte is accessed when the Byte 
Pointer is set. 


For compatibility with existing 8237A designs, there 
is one exception to the above statements about the 
Byte Pointer Flip-Flop. The third byte (bits 16-23) of
the Target Address is accessed through its own port
address. The Byte Pointer Flip-Flop is not affected
by any accesses to this byte.


The upper eight bits of the Byte Count Register are
cleared when the least significant byte of the register
is loaded. This provides compatibility with software
which accesses an 8237A. The 8237A has
16-bit Byte Count Registers.
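
The interaction between the Byte Pointer Flip-Flop and the 24-bit Byte Count can be followed in a small software model. This is a sketch of the behavior described above, not of the device's implementation; the structure and function names are hypothetical, and only the byte pair reached through the flip-flop is modeled.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of one Base Byte Count register behind a byte port. */
typedef struct {
    bool     byte_pointer;  /* false = next access is the low byte */
    uint32_t base_count;    /* only bits 23..0 are meaningful      */
} count_model_t;

/* Models the Clear Byte Pointer Flip-Flop command. */
void clear_byte_pointer(count_model_t *m)
{
    m->byte_pointer = false;
}

/* Models one byte written to the register's paired port address. */
void write_count_byte(count_model_t *m, uint8_t value)
{
    if (!m->byte_pointer) {
        /* Low byte: replaces bits 7..0 and clears bits 23..16, so that
           software written for 16-bit 8237A counts still works. */
        m->base_count = (m->base_count & 0x0000FF00u) | value;
    } else {
        /* High byte of the pair: replaces bits 15..8. */
        m->base_count = (m->base_count & 0xFFFF00FFu) | ((uint32_t)value << 8);
    }
    m->byte_pointer = !m->byte_pointer;  /* toggles on each qualifying access */
}
```

Starting from a stale 24-bit count, clearing the flip-flop and writing two bytes leaves a clean 16-bit value, which is the compatibility property the text describes.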
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3.5.3 TEMPORARY REGISTERS 


Each channel has a 32-bit Temporary Register used
for temporary data storage during two-cycle DMA
transfers. It is this register in which any necessary
byte assembly and disassembly of non-aligned data
is performed. Figure 3-22 shows how a block of data
will be moved between memory locations with different
boundaries. Note that the order of the data does
not change.


SOURCE          DESTINATION

 20H               50H
 21H               51H
 22H               52H
 23H               53H
 24H               54H
 25H               55H
 26H               56H
 27H               57H
                   58H
                   59H
                   5AH

Target = source = 00000020H
Requester = destination = 00000053H
Byte Count = 000006H

Figure 3-22. Transfer of Data between Memory
Locations with Different Boundaries. This will be
the result, independent of data path width.
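
A rough software analogy of the transfer in Figure 3-22 is given below. It gathers bytes into a 32-bit temporary value and writes them out again, showing that the byte order is preserved even though source and destination have different alignments. The model deliberately ignores bus widths and alignment boundaries, and the function name is hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical model of a two-cycle transfer through a 32-bit temporary
   register: up to four bytes are assembled, then disassembled in order. */
void two_cycle_copy(uint8_t *mem, size_t src, size_t dst, size_t count)
{
    while (count > 0) {
        uint32_t temp = 0;
        size_t n = count < 4 ? count : 4;
        size_t i;

        /* Read cycle: assemble bytes, first byte in the lowest lane. */
        for (i = 0; i < n; i++)
            temp |= (uint32_t)mem[src + i] << (8 * i);

        /* Write cycle: disassemble in the same order. */
        for (i = 0; i < n; i++)
            mem[dst + i] = (uint8_t)(temp >> (8 * i));

        src += n;
        dst += n;
        count -= n;
    }
}
```

With the figure's parameters (source 20H, destination 53H, byte count 6), the six bytes land at 53H through 58H in their original order.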


If the destination is the Requester and an early pro- 
cess termination has been indicated by the EOP# 
signal or DREQn inactive in the Demand Mode, the 
Temporary Register is not affected. If data remains 
in the Temporary Register due to differences in data 
path widths of the Target and Requester, it will not 
be transferred or otherwise lost, but will be stored for 
later transfer. 


If the destination is the Target and the EOP# signal
is sensed active during the Requester access of a
transfer, the DMA Controller will complete the transfer
by sending to the Target whatever information is
in the Temporary Register at the time of process
termination. This implies that the Target could be
accessed with partial data. For this reason it is
advisable to have an I/O device designated as a
Requester, unless it is capable of handling partial data
transfers.
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3.6 DMA Controller Programming 


Programming a DMA Channel to perform a needed 
DMA function is in general a four step process. First 
the global attributes of the DMA Controller are pro- 
grammed via the two Command Registers. These 
global attributes include: priority levels, channel
group enables, priority mode, and DREQn/EOP#
input sampling.


The second step involves setting the operating 
modes of the particular channel. The Mode Regis- 
ters are used to define the type of transfer and the 
handshaking modes. The Bus Size Register and 
Chaining Register may also need to be programmed 
in this step. 


The third step in setting up the channel is to load the
Base Registers in accordance with the needs of the 
operating modes chosen in step two. The Current 
Registers are automatically loaded from the Base 
Registers, if required by the Buffer Transfer Mode in 
effect. The information loaded and the order in 
which it is loaded depends on the operating mode. A 
channel used for cascading, for example, needs no 
buffer information and this step can be skipped en- 
tirely. 


The last step is to enable the newly programmed 
channel using one of the Mask Registers. The chan- 
nel is then available to perform the desired data 
transfer. The status of the channel can be observed 
at any time through the Status Register, Mask Register,
Chaining Register, and Software Request
Register.


Once the channel is programmed and enabled, the 
DMA process may be initiated in one of two ways, 
either by a hardware DMA request (DREQn) or a 
software request (Software Request Register). 
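
The four steps can be sketched as an ordered series of port writes. The code below records the writes in a log instead of touching hardware, so `port_write` is a hypothetical stand-in for an OUT instruction. The control port addresses are the channel 0-3 addresses given in the register definitions; the channel register port (`count_port`) is supplied by the caller, and the raw register bytes are likewise left to the caller because their bit layouts are defined elsewhere. Only the Base Byte Count is loaded here; the address registers would be loaded the same way.

```c
#include <stdint.h>
#include <stddef.h>

/* Control port addresses for channels 0-3. */
enum {
    PORT_CMD1     = 0x0008,  /* Command Register I          */
    PORT_MASK_SR  = 0x000A,  /* Mask Set/Reset Register     */
    PORT_MODE1    = 0x000B,  /* Mode Register I             */
    PORT_CLEAR_BP = 0x000C,  /* Clear Byte Pointer Flip-Flop */
    PORT_CMD2     = 0x001A,  /* Command Register II         */
    PORT_MODE2    = 0x001B   /* Mode Register II            */
};

/* Hypothetical stand-in for port output: each write is appended to a log. */
typedef struct { uint16_t port; uint8_t value; } port_write_t;
port_write_t write_log[32];
size_t write_count;

void port_write(uint16_t port, uint8_t value)
{
    if (write_count < sizeof write_log / sizeof write_log[0]) {
        write_log[write_count].port = port;
        write_log[write_count].value = value;
        write_count++;
    }
}

typedef struct {
    uint8_t  cmd1, cmd2;    /* step 1: raw Command Register bytes       */
    uint8_t  mode1, mode2;  /* step 2: raw Mode Register bytes          */
    uint16_t count_port;    /* step 3: caller-supplied Byte Count port  */
    uint16_t byte_count;    /*         low 16 bits of the Base count    */
    uint8_t  unmask;        /* step 4: Mask Set/Reset byte to enable    */
} channel_setup_t;

void program_channel(const channel_setup_t *s)
{
    port_write(PORT_CMD1, s->cmd1);        /* 1. global attributes        */
    port_write(PORT_CMD2, s->cmd2);
    port_write(PORT_MODE1, s->mode1);      /* 2. channel operating modes  */
    port_write(PORT_MODE2, s->mode2);
    port_write(PORT_CLEAR_BP, 0);          /* 3. base registers, low byte */
    port_write(s->count_port, (uint8_t)(s->byte_count & 0xFFu));
    port_write(s->count_port, (uint8_t)(s->byte_count >> 8));
    port_write(PORT_MASK_SR, s->unmask);   /* 4. enable the channel       */
}
```

Keeping the unmask write last matters in this ordering: the channel cannot respond to a request until its modes and buffer information are already in place.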


Once programmed to a particular Process/Mode 
configuration, the channel will operate in that config- 
uration until programmed otherwise. For this reason, 
restarting a channel after the current buffer expires 
does not require complete reprogramming of the
channel. Only those parameters which have
changed need to be reprogrammed. The Byte Count
Register is always changed and must be reprogrammed.
A Target or Requester Address Register
which is incremented or decremented should be
reprogrammed also.


3.6.1 BUFFER PROCESSES 


The Buffer Process is determined by the Auto-Initialize
bit of Mode Register I and the Chaining Register.
If Auto-Initialize is enabled, Chaining should not be
used.


3.6.1.1 Single Buffer Process 


The Single Buffer Process is programmed by disabling
Chaining via the Chaining Register and programming
Mode Register I for non-Auto-Initialize.


3.6.1.2 Buffer Auto-Initialize Process 


Setting the Auto-Initialize bit in Mode Register I is all
that is necessary to place the channel in this mode.
Buffer Auto-Initialize must not be enabled at the same
time as the Buffer Chaining Mode, as this
will have unpredictable results.


Once the Base Registers are loaded, the channel is 
ready to be enabled. The channel will reload its Cur- 
rent Registers from the Base Registers each time 
the Current Buffer expires, either by an expired Byte
Count or an external EOP#.


3.6.1.3 Buffer Chaining Process 


The Buffer Chaining Process is entered into from the 
Single Buffer Process. The Mode Registers should 
be programmed first, with all of the Transfer Modes 
defined as if the channel were to operate in the Sin- 
gle Buffer Process. The channel's Base and Current 
Registers are then loaded. When the channel has 
been set up in this way, and the chaining interrupt 
service routine is in place, the Chaining Process can 
be entered by programming the Chaining Register. 
Figure 3-23 illustrates the Buffer Chaining Process.


An interrupt (IRQ1) will be generated immediately after
the Chaining Process is entered, as the channel
then perceives the Base Registers as empty and in
need of reloading. It is important to have the interrupt
service routine in place at the time the Chaining
Process is entered into. The interrupt request is
removed when the most significant byte of the Base
Target Address is loaded.


The interrupt will occur again when the first buffer 
expires and the Current Registers are loaded from 
the Base Registers. The cycle continues until the 
Chaining Process is disabled, or the host fails to re- 
spond to IRQ1 before the Current Buffer expires. 
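
The chaining cycle can be traced with a small behavioral model. The sketch below is a simplification under stated assumptions: it tracks a single count per buffer, treats any reload of the base as removing the interrupt request (the text ties this to the most significant byte of the Base Target Address), and uses hypothetical names throughout.

```c
#include <stdint.h>
#include <stdbool.h>

/* Hypothetical model of one channel in the Buffer Chaining Process. */
typedef struct {
    bool     chaining;      /* Chaining Mode enabled for the channel   */
    bool     base_loaded;   /* Base Registers hold an unconsumed buffer */
    bool     irq1;          /* chaining interrupt request              */
    uint32_t base_count;
    uint32_t current_count;
} chain_model_t;

/* Host (typically the IRQ1 service routine) reloads the Base Registers. */
void load_base(chain_model_t *c, uint32_t count)
{
    c->base_count = count;
    c->base_loaded = true;
    c->irq1 = false;        /* request is removed once the base is reloaded */
}

/* Host enters the Chaining Process: the base is now seen as empty. */
void enter_chaining(chain_model_t *c)
{
    c->chaining = true;
    c->base_loaded = false;
    c->irq1 = true;
}

/* The current buffer expires; returns true if the chain continues. */
bool buffer_expired(chain_model_t *c)
{
    if (!c->chaining || !c->base_loaded)
        return false;       /* host did not reload in time: chain ends */
    c->current_count = c->base_count;
    c->base_loaded = false;
    c->irq1 = true;         /* ask the host for the next buffer */
    return true;
}
```

The second call to `buffer_expired` without an intervening `load_base` returns false, which corresponds to the host failing to respond to IRQ1 before the Current Buffer expires.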


INSTALL IRQ1 INTERRUPT SERVICE ROUTINE

SET THE CHANNEL TO NON-CHAINING PROCESS
PROGRAM THE MODE REGISTERS

LOAD BASE REGISTERS FOR FIRST BUFFER

SET THE CHANNEL TO CHAINING PROCESS
(IRQ1 WILL BE ACTIVATED)

ENABLE INTERRUPT
(IRQ1 WILL NEED SERVICE = LOAD BASE REGISTERS)

ENABLE THE CHANNEL

FROM THIS POINT, THE HOST CAN PERFORM ANOTHER
TASK. THE INTERRUPT SERVICE ROUTINE LEFT BEHIND
WILL MAINTAIN THE CHANNEL.


Figure 3-23. Flow of Events in the
Buffer Chaining Process


Exiting the Chaining Process can be done by resetting
the Chaining Mode Register. If an interrupt is
pending for the channel when the Chaining Register
is reset, the interrupt request will be removed. The
Chaining Process can be temporarily disabled by
setting the channel's Mask bit in the Mask Register.


The interrupt service routine for IRQ1 has the re- 
sponsibility of reloading the Base Register as neces- 
sary. It should check the status of the channel to 
determine the cause of channel expiration, etc. It 
should also have access to operating system infor- 
mation regarding the channel, if any exists. The 
IRQ1 service routine should be capable of determin- 
ing whether the chain should be continued or termi- 
nated and act on that information. 


3.6.2 DATA TRANSFER MODES 


The Data Transfer Modes are selected via Mode
Register I. The Demand, Single, and Block Modes
are selected by bits D6 and D7. The individual transfer
type (Fly-By vs Two-Cycle, Read-Write-Verify,
and I/O vs Memory) is programmed through both of
the Mode registers.


3.6.3 CASCADED BUS MASTERS 


The Cascade Mode is set by writing ones to D7 and
D6 of Mode Register I. When a channel is programmed
to operate in the Cascade Mode, all of the
other modes associated with Mode Registers I and II
are ignored. The priority and DREQn/EOP# definitions
of the Command Registers will have the same
effect on the channel's operation as any other
mode.
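
A Mode Register I byte can be assembled from its fields as sketched below. The placement of the mode in D7-D6 comes from the text; the remaining bit positions are an assumption based on the 8237A Mode register, with which Mode Register I is described as identical in function. The enum and function names are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

typedef enum { XFER_VERIFY = 0, XFER_WRITE = 1, XFER_READ = 2 } xfer_type_t;
typedef enum {
    MODE_DEMAND = 0, MODE_SINGLE = 1, MODE_BLOCK = 2, MODE_CASCADE = 3
} xfer_mode_t;

/* Assumed 8237A-compatible layout:
   D7-D6 data transfer mode, D5 target decrement, D4 auto-initialize,
   D3-D2 transfer type, D1-D0 channel within the group. */
uint8_t mode1_byte(unsigned channel_in_group, xfer_type_t type,
                   bool auto_init, bool decrement, xfer_mode_t mode)
{
    return (uint8_t)(((unsigned)mode << 6) |
                     ((decrement ? 1u : 0u) << 5) |
                     ((auto_init ? 1u : 0u) << 4) |
                     ((unsigned)type << 2) |
                     (channel_in_group & 0x3u));
}
```

For a cascaded channel only the mode and channel fields matter, so `mode1_byte(1, XFER_VERIFY, false, false, MODE_CASCADE)` yields C1H under this layout. An all-zero byte matches the documented reset defaults: verify transfer, Auto-Initialize disabled, increment, Demand Mode.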


3.6.4 SOFTWARE COMMANDS


There are five port addresses which, when written
to, command certain operations to be performed by
the 82380 DMA Controller. The data written to these
locations is not of consequence; writing to the location
is all that is necessary to command the 82380 to
perform the indicated function. Following are descriptions
of the command functions.


Clear Byte Pointer Flip-Flop: location 000CH


Resets the Byte Pointer Flip-Flop. This command 
should be performed at the beginning of any access 
to the channel registers in order to be assured of 
beginning at a predictable place in the register pro- 
gramming sequence. 


Master Clear: location 000DH


All DMA functions are set to their default states. This 
command is the equivalent of a hardware reset to 
the DMA Controller. Functions other than those in 
the DMA Controller section of the 82380 are not af- 
fected by this command. 


Clear Mask Register: Channels 0-3: location 000EH
                     Channels 4-7: location 00CEH


This command simultaneously clears the Mask Bits
of all channels in the addressed group, enabling all
of the channels in the group.


Clear TC Interrupt Request—location 001EH 


This command resets the Terminal Count Interrupt 
Request Flip-Flop. It is provided to allow the pro- 
gram which made a software DMA request to ac- 
knowledge that it has responded to the expiration of 
the requested channel(s). 
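
The command locations lend themselves to a few named constants. In the sketch below the constants are the addresses given above; the helper relies on an observed pattern rather than a stated rule, namely that every channel 0-3 and channel 4-7 address pair listed in this chapter (000EH/00CEH, 000BH/00CBH, 001AH/00DAH, and so on) differs by C0H. The names are hypothetical.

```c
#include <stdint.h>

/* Software command locations (channels 0-3 where a group applies). */
enum {
    DMA_CLEAR_BYTE_POINTER = 0x000C,
    DMA_MASTER_CLEAR       = 0x000D,
    DMA_CLEAR_MASK         = 0x000E,
    DMA_CLEAR_TC_IRQ       = 0x001E
};

/* Returns the port for the selected group: group 0 = channels 0-3,
   any other value = channels 4-7. Assumes the C0H offset pattern holds
   for the register being addressed. */
uint16_t dma_group_port(uint16_t group0_port, unsigned group)
{
    return (uint16_t)(group0_port + (group ? 0x00C0u : 0x0000u));
}
```

Since the data written is not of consequence, a command is issued by writing any byte to the resulting address, for example to `dma_group_port(DMA_CLEAR_MASK, 1)` to enable channels 4-7.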


3.7 Register Definitions


The following diagrams outline the bit definitions and 
functions of the 82380 DMA Controller’s Status and 
Control Registers. The function and programming of 
the registers is covered in the previous section on 
DMA Controller Programming. An entry of ‘X’ as a bit 
value indicates “don’t care.” 


[Table: Channel Registers (Read Current, Write Base). Columns: Channel, Register Name, Address (Hex), Byte Pointer, Bits Accessed. Rows: Target Address, Byte Count, and Requester Address for each of Channels 0 through 7.]
Command Register I (Write Only)


Port Addresses: Channels 0-3: 0008H
                Channels 4-7: 00C8H


GROUP MASK
  0 = ENABLE CHANNELS
  1 = DISABLE CHANNELS
PRIORITY
  0 = FIXED PRIORITY
  1 = ROTATING PRIORITY


Command Register II (Write Only)


Port Addresses: Channels 0-3: 001AH
                Channels 4-7: 00DAH


DREQn SAMPLING and EOP# SAMPLING
  0 = ASYNCHRONOUS
  1 = SYNCHRONOUS
LOW PRIORITY LEVEL SET
  00 = CHANNEL 0(4) LOWEST
  01 = CHANNEL 1(5) LOWEST
  10 = CHANNEL 2(6) LOWEST
  11 = CHANNEL 3(7) LOWEST


Mode Register I (Write Only)


Port Addresses: Channels 0-3: 000BH
                Channels 4-7: 00CBH


CHANNEL SELECT
  00 = CHANNEL 0(4)
  01 = CHANNEL 1(5)
  10 = CHANNEL 2(6)
  11 = CHANNEL 3(7)
TRANSFER TYPE
  00 = VERIFY
  01 = WRITE
  10 = READ
  11 = ILLEGAL
  XX IF IN CASCADE MODE
AUTO-INITIALIZE
  0 = DISABLE, 1 = ENABLE
TARGET INCREMENT/DECREMENT
  0 = INCREMENT TARGET
  1 = DECREMENT TARGET *
  X IF TARGET HOLD ENABLED
DATA TRANSFER MODE (D7, D6)
  00 = DEMAND MODE
  01 = SINGLE TRANSFER MODE
  10 = BLOCK MODE
  11 = CASCADE MODE


* Target and Requester DECREMENT is allowed only for byte transfers. 


Mode Register II (Write Only) 


Port Addresses: Channels 0-3: 001BH
                Channels 4-7: 00DBH


CHANNEL SELECT
  SEE MODE REGISTER I
TARGET HOLD
  0 = INCREMENT/DECREMENT
  1 = HOLD
REQUESTER INCREMENT
  0 = INCREMENT
  1 = DECREMENT *
  X IF REQUESTER HOLD ENABLED
REQUESTER HOLD
  0 = INCREMENT/DECREMENT
  1 = HOLD
TARGET DEVICE TYPE
  0 = MEMORY
  1 = INPUT/OUTPUT
REQUESTER DEVICE TYPE
  0 = MEMORY
  1 = INPUT/OUTPUT
TRANSFER CYCLES
  0 = ONE-CYCLE (FLY-BY)
  1 = TWO-CYCLE


* Target and Requester DECREMENT is allowed only for byte transfers. 
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Software Request Register (Read/Write) 


Port Addresses: Channels 0-3: 0009H
                Channels 4-7: 00C9H


Write Format: Software DMA Service Request

CHANNEL SELECT
  SEE MODE REGISTER I
REQUEST SERVICE
  0 = REMOVE REQUEST
  1 = ASSERT REQUEST


Read Format: Software Requests Pending

D7  D6  D5  D4  D3   D2   D1   D0
X   X   X   X   SR3  SR2  SR1  SR0      1 = REQUEST PENDING

SR0 = CHANNEL 0(4) REQUEST
SR1 = CHANNEL 1(5) REQUEST
SR2 = CHANNEL 2(6) REQUEST
SR3 = CHANNEL 3(7) REQUEST


Mask Set/Reset Register  Individual Channel Mask (Write Only)


Port Addresses: Channels 0-3: 000AH
                Channels 4-7: 00CAH


CHANNEL SELECT
  SEE MODE REGISTER I
MASK SET BIT
  0 = CLEAR MASK (ENABLE)
  1 = SET MASK (DISABLE)


Mask Read/Write Register  Group Channel Mask (Read/Write)


Port Addresses: Channels 0-3: 000FH
                Channels 4-7: 00CFH


CHANNEL 0(4) MASK BIT
CHANNEL 1(5) MASK BIT
CHANNEL 2(6) MASK BIT
CHANNEL 3(7) MASK BIT

MASK BIT
  0 = CHANNEL ENABLE
  1 = CHANNEL DISABLE


Status Register  Channel Process Status (Read Only)


Port Addresses: Channels 0-3: 0008H
                Channels 4-7: 00C8H


CHANNEL 0(4) EXPIRED        1 = EXPIRED
CHANNEL 1(5) EXPIRED
CHANNEL 2(6) EXPIRED
CHANNEL 3(7) EXPIRED

CHANNEL 0(4) REQUEST        1 = REQUEST PENDING
CHANNEL 1(5) REQUEST
CHANNEL 2(6) REQUEST
CHANNEL 3(7) REQUEST


Bus Size Register  Set Data Path Width (Write Only)


Port Addresses: Channels 0-3: 0018H
                Channels 4-7: 00D8H


CHANNEL SELECT
  SEE MODE REGISTER I
TARGET BUS SIZE
REQUESTER BUS SIZE
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Bus Size Encoding: 
00 = Reserved by Intel 10 = 16-bit Bus 
01 = 32-bit Bus 11 = 8-bit Bus 
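
The encoding table translates directly into a lookup. The sketch below returns the two-bit code for a given data path width; it says nothing about where the Target and Requester fields sit within the register byte, and the function name is hypothetical.

```c
/* Returns the two-bit Bus Size code for a data path width in bits,
   or -1 for a width the encoding table does not list. */
int bus_size_code(unsigned width_bits)
{
    switch (width_bits) {
    case 32: return 0x1;  /* 01 = 32-bit bus */
    case 16: return 0x2;  /* 10 = 16-bit bus */
    case 8:  return 0x3;  /* 11 = 8-bit bus  */
    default: return -1;   /* 00 is reserved by Intel */
    }
}
```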
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Chaining Register (Read/Write) 


Port Addresses: Channels 0-3: 0019H
                Channels 4-7: 00D9H


Write Format: Set Chaining Mode

CHANNEL SELECT
  SEE MODE REGISTER I
CHAINING ENABLE BIT
  0 = DISABLE CHAINING MODE
  1 = ENABLE CHAINING MODE


Read Format: Channel Interrupt Status

CHANNEL 0(4) BASE EMPTY through CHANNEL 3(7) BASE EMPTY


3.8 8237A Compatibility


The register arrangement of the 82380 DMA Controller
is a superset of the 8237A DMA Controller.
Functionally the 82380 DMA Controller is very different
from the 8237A. Most of the functions of the
8237A are performed also by the 82380. The following
discussion points out the differences between
the 8237A and the 82380.


The 8237A is limited to transfers between I/O and
memory only (except in one special case, where two
channels can be used to perform memory-to-memory
transfers). The 82380 DMA Controller can transfer
between any combination of memory and I/O. Several
other features of the 8237A are enhanced or
expanded in the 82380 and other features are added.


The 8237A is an 8-bit only DMA device. For programming
compatibility, all of the 8-bit registers are
preserved in the 82380. The 82380 is programmed
via 8-bit registers. The address registers in the
82380 are 32-bit registers in order to support the
80386's 32-bit bus. The Byte Count Registers are
24-bit registers, allowing support of larger data
blocks than possible with the 8237A.


All of the 8237A’s operating modes are supported 
by the 82380 (except the cumbersome two-channel 
memory-to-memory transfer). The 82380 performs 
memory-to-memory transfers using only one chan- 
nel. The 82380 has the added features of buffer 
pipelining (Buffer Chaining Process), programmable 
priority levels, and Byte Assembly. 


The 82380 also adds the feature of address regis- 
ters for both destination and source. These address- 
es may be incremented, decremented, or held con- 
stant, as required by the application of the individual 
channel. This allows any combination of destination
and source device.


Each DMA channel has associated with it a Target
and a Requester. In the 8237A, the Target is the
device which can be accessed by the address register;
the Requester is the device which is accessed
by the DMA Acknowledge signals and must be an
I/O device.
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4.0 Programmable Interrupt 
Controller (PIC) 


4.1 Functional Description 


The 82380 Programmable Interrupt Controller (PIC)
consists of three enhanced 82C59A Interrupt Controllers.
These three controllers together provide 15
external and 5 internal interrupt request inputs. Each
external request input can be cascaded with an additional
82C59A slave controller. This scheme allows
the 82380 to support a maximum of 120 (15 x 8)
external interrupt request inputs.


Following one or more interrupt requests, the 82380 
PIC issues an interrupt signal to the 80386. When 
the 80386 host processor responds with an interrupt 
acknowledge signal, the PIC will arbitrate between 
the pending interrupt requests and place the inter- 
rupt vector associated with the highest priority pend- 
ing request on the data bus. 


The major enhancement in the 82380 PIC over the
82C59A is that each of the interrupt request inputs
can be individually programmed with its own interrupt
vector, allowing more flexibility in interrupt vector
mapping.

[Figure: interrupt request inputs: IRQ16# through IRQ23#; TOUT0# (IRQ8#), DREQ4/IRQ9#, IRQ10# through IRQ15#; TOUT3# (IRQ0#), CHAINING (IRQ1#), ICW2 (IRQ1.5#), IRQ2#, TOUT2#/IRQ3#, SW Req TC (IRQ4#), two inputs NOT USED, DEFAULT (IRQ7#). NOTE: Masking IRQ1.5# also masks IRQ2#.]
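
The effect of individually programmable vectors can be illustrated with a simple lookup model of one bank. This is a sketch only: the structure is hypothetical, and the priority order (input 0 highest) is an assumption borrowed from the default fixed priority of the 82C59A rather than something stated in this section.

```c
#include <stdint.h>

#define PIC_INPUTS 8

/* Hypothetical model of one bank: pending requests plus one vector
   per input, each programmed independently of the others. */
typedef struct {
    uint8_t pending;              /* bit n set = request n pending */
    uint8_t vector[PIC_INPUTS];   /* individually programmed vectors */
} pic_bank_model_t;

/* Returns the vector of the highest-priority pending request, or -1 if
   nothing is pending. Assumes fixed priority with input 0 highest. */
int pic_acknowledge(const pic_bank_model_t *b)
{
    int n;
    for (n = 0; n < PIC_INPUTS; n++)
        if (b->pending & (1u << n))
            return b->vector[n];
    return -1;
}
```

Because each entry of `vector` is independent, input 3 can be mapped to 90H while its neighbors use unrelated values, something the eight-consecutive-vector scheme of the 82C59A does not allow.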


4.1.1 INTERNAL BLOCK DIAGRAM 


The block diagram of the 82380 Programmable In- 
terrupt Controller is shown in Figure 4-1. Internally, 
the PIC consists of three 82C59A banks: A, B and C. 
The three banks are cascaded to one another: C is 
cascaded to B, B is cascaded to A. The INT output 
of Bank A is used externally to interrupt the 80386. 


Bank A has nine interrupt request inputs (two are 
unused), and Banks B and C have eight interrupt 
request inputs. Of the fifteen external interrupt re- 
quest inputs, two are shared by other functions. Spe- 
cifically, the Interrupt Request 3 input (IRQ3#) can 
be used as the Timer 2 output (TOUT2#). This pin 
can be used in three different ways: IRQ3# input 
only, TOUT2# output only, or using TOUT2# to 
generate an IRQ3# interrupt request. Also, the Interrupt
Request 9 input (IRQ9#) can be used as the
DMA Request 4 input (DREQ4). Typically, only
IRQ9# or DREQ4 can be used at a time.


[Block diagram: Interrupt Banks C, B and A, each with
request inputs 0 to 7; Bank A drives the INT output.]


Figure 4-1. Interrupt Controller Block Diagram


4.1.2 INTERRUPT CONTROLLER BANKS


All three banks are identical, with the exception of
the IRQ1.5 input on Bank A. Therefore, only one bank
will be discussed. In the 82380 PIC, every external
request input can have a slave controller cascaded
into it, and each interrupt controller bank behaves
like a master. As compared to the 82C59A, the
enhancements in the banks are:


— All interrupt vectors are individually programma- 
ble. (In the 82C59A, the vectors must be pro- 
grammed in eight consecutive interrupt vector lo- 
cations.) 


— The cascade address is provided on the Data
Bus (D0-D7). (In the 82C59A, three dedicated
control signals (CAS0, CAS1, CAS2) are used for
master/slave cascading.)


The block diagram of a bank is shown in Figure 4-2. 
As can be seen from this figure, the bank consists of 
six major blocks: the Interrupt Request Register 
(IRR), the In-Service Register (ISR), the Interrupt 
Mask Register (IMR), the Priority Resolver (PR), the 
Vector Register (VR), and the Control Logic. The 
functional description of each block follows. 


[Block diagram: Interrupt Request Register, Interrupt
Mask Register, Priority Resolver, In-Service Register,
Control Logic (interrupt to host), and the individually
programmable Vector Bank connected to DATA (0-7).
Legend: 82380 enhancement over the 82C59A.]


Figure 4-2. Interrupt Bank Block Diagram


INTERRUPT REQUEST REGISTER (IRR) AND IN-SERVICE
REGISTER (ISR)


The interrupts at the Interrupt Request (IRQ) input
lines are handled by two registers in cascade, the
Interrupt Request Register (IRR) and the In-Service 
Register (ISR). The IRR is used to store all interrupt 
levels which are requesting service; and the ISR is 
used to store all interrupt levels which are being 
serviced. 


PRIORITY RESOLVER (PR) 


This logic block determines the priorities of the bits
set in the IRR. The highest priority is selected and
strobed into the corresponding bit of the ISR during
an Interrupt Acknowledge cycle.


INTERRUPT MASK REGISTER (IMR) 


The IMR stores the bits which mask the interrupt 
lines to be masked (disabled). The IMR operates on 
the IRR. Masking of a higher priority input will not 
affect the interrupt request lines of lower priority. 
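

The interaction of the IRR, IMR and Priority Resolver
described above can be sketched in a few lines of
Python. This is only an illustrative model of one
bank, assuming the default priority order (level 0
highest); the function name is hypothetical and not
part of the 82380 documentation.

```python
from typing import Optional


def highest_pending(irr: int, imr: int) -> Optional[int]:
    """Return the highest-priority unmasked request level, or None.

    irr and imr are 8-bit values; bit n corresponds to level n of one bank.
    Assumes the default fully nested order, where level 0 is highest.
    """
    # The IMR only gates the output of the IRR; it does not clear IRR bits.
    pending = irr & ~imr & 0xFF
    for level in range(8):
        if pending & (1 << level):
            return level
    return None
```

Masking level 3 in this model leaves a pending level 5
request unaffected, which matches the statement that
masking a higher priority input does not affect lower
priority lines.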


VECTOR REGISTERS (VR) 


This block contains a set of Vector Registers, one
for each interrupt request line, to store the pre-programmed
interrupt vector number. The corresponding
vector number will be driven onto the Data Bus
of the 82380 during the Interrupt Acknowledge cycle.


CONTROL LOGIC 


The Control Logic coordinates the overall operations
of the other internal blocks within the same bank.
This logic will drive the Interrupt Output signal (INT)
HIGH when one or more unmasked interrupt inputs
are active (LOW). The INT output signal goes directly
to the 80386 (in Bank A) or to another bank to
which this bank is cascaded (see Figure 4-1). Also,
this logic will recognize an Interrupt Acknowledge
cycle (via M/IO#, D/C# and W/R# signals). During
this bus cycle, the Control Logic will enable the
corresponding Vector Register to drive the interrupt
vector onto the Data Bus.


In Bank A, the Control Logic is also responsible for
handling the special ICW2 interrupt request input
(IRQ1.5#).




4.2 Interface Signals 


4.2.1 INTERRUPT INPUTS


There are 15 external Interrupt Request inputs and 5
internal Interrupt Requests. The external request inputs
are: IRQ3#, IRQ9#, IRQ11# to IRQ23#. They
are shown in bold arrows in Figure 4-1. All IRQ inputs
are active LOW and they can be programmed
(via a control bit in the Initialization Command Word
1 (ICW1)) to be either edge-triggered or level-triggered.
In order to be recognized as a valid interrupt
request, the interrupt input must be active (LOW) until
the first INTA# cycle (see Bus Functional Description).


Note that all 15 external Interrupt Request inputs
have weak internal pull-up resistors.


As mentioned earlier, an 82C59A can be cascaded
to each external interrupt input to expand the interrupt
capacity to a maximum of 120 levels. Also, two
of the interrupt inputs are dual functions: IRQ3# can
be used as Timer 2 output (TOUT2#) and IRQ9#
can be used as DREQ4 input. IRQ3# is a bidirectional
dual function pin. This interrupt request input is
wired-OR with the output of Timer 2 (TOUT2#). If
only the IRQ3# function is to be used, Timer 2 should
be programmed so that OUT2 is LOW. Note that
TOUT2# can also be used to generate an interrupt
request to the IRQ3# input.


The five internal interrupt requests serve special
system functions. They are shown in Table 4-1. The
following paragraphs describe these interrupts.


Table 4-1. 82380 Internal Interrupt Requests


Interrupt Request | Interrupt Source
IRQ0#             | Timer 3 Output (TOUT3#)
IRQ8#             | Timer 0 Output (TOUT0#)
IRQ1#             | DMA Chaining Request
IRQ4#             | DMA Terminal Count
IRQ1.5#           | ICW2 Written


TIMER 0 AND TIMER 3 INTERRUPT REQUESTS
[IRQ8#, IRQ0#]


IRQ8# and IRQ0# interrupt requests are initiated
by the output of Timers 0 and 3, respectively. Each
of these requests is generated by an edge-detector
flip-flop. The flip-flops are activated by the following
conditions:


Set— Rising edge of timer output (TOUT); 


Clear— Interrupt acknowledge for this request; 
OR Request is masked (disabled); OR 
Hardware Reset. 
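

The set and clear conditions listed above behave like
a small state machine. The sketch below models one such
edge-detector flip-flop in Python. The class and method
names are hypothetical, and the choice that a masked
request ignores new edges is an assumption of this
sketch rather than something the text states.

```python
class TimerIrqLatch:
    """Illustrative model of the edge-detector flip-flop for IRQ0#/IRQ8#."""

    def __init__(self):
        self.pending = False      # flip-flop output (request latched)
        self.masked = False       # state of the matching IMR bit
        self._last_tout = False   # previous sampled timer output level

    def sample(self, tout: bool) -> None:
        """Sample the timer output; a rising edge sets the request."""
        rising = tout and not self._last_tout
        self._last_tout = tout
        # Assumption: while masked, the clear condition dominates.
        if rising and not self.masked:
            self.pending = True

    def acknowledge(self) -> None:
        """Interrupt acknowledge for this request clears the flip-flop."""
        self.pending = False

    def set_mask(self, masked: bool) -> None:
        """Masking (disabling) the request clears the flip-flop."""
        self.masked = masked
        if masked:
            self.pending = False

    def reset(self) -> None:
        """Hardware reset clears the flip-flop."""
        self.pending = False
```

Because the request is edge-detected, a timer output
that simply stays HIGH after an acknowledge does not
raise a second request in this model.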
CHAINING AND TERMINAL COUNT INTERRUPTS
[IRQ1#, IRQ4#]


These interrupt requests are generated by the
82380 DMA Controller. The chaining request
(IRQ1#) indicates that the DMA Base Register is
not loaded. The Terminal Count request (IRQ4#) indicates
that a software DMA request was cleared.


ICW2 INTERRUPT REQUEST [IRQ1.5#]


Whenever an Initialization Command Word 2 (ICW2) is
written to a Bank, a special ICW2 interrupt request is
generated. The interrupt will be cleared when the
newly programmed ICW2 Register is read. This interrupt
request is in Bank A at level 1.5. This interrupt
request is internally ORed with the Cascaded
Request from Bank B and is always assigned a higher
priority than the Cascaded Request.


This special interrupt is provided to support compati- 
bility with the original 82C59A. A detailed description 
of this interrupt is discussed in the Programming 
section. 


DEFAULT INTERRUPT [IRQ7#] 


During an Interrupt Acknowledge cycle, if there is no
active pending request, the PIC will automatically
generate a default vector. This vector corresponds
to the IRQ7# vector in Bank A.


4.2.2 INTERRUPT OUTPUT (INT) 


The INT output pin is taken directly from bank A. 
This signal should be tied to the Maskable Interrupt 
Request (INTR) of the 80386. When this signal is 
active (HIGH), it indicates that one or more internal/ 
external interrupt requests are pending. The 80386 
is expected to respond with an interrupt acknowl- 
edge cycle. 


4.3 Bus Functional Description 


The INT output of bank A will be activated as a result 
of any unmasked interrupt request. This may be a 
non-cascaded or cascaded request. After the PIC 
has driven the INT signal HIGH, 80386 will respond 
by performing two interrupt acknowledge cycles. 
The timing diagram in Figure 4-3 shows a typical in- 
terrupt acknowledge process between the 82380 
and the 80386 CPU. 


[Timing diagram: previous cycle, Interrupt Acknowledge
Cycle 1 (5 wait states), Interrupt Acknowledge Cycle 2.]


Figure 4-3. Interrupt Acknowledge Cycle


After activating the INT signal, the 82380 monitors
the status lines (M/IO#, D/C#, W/R#) and waits
for the 80386 to initiate the first interrupt acknowledge
cycle. In the 80386 environment, two successive
interrupt acknowledge cycles (INTA) marked by
M/IO# = LOW, D/C# = LOW, and W/R# =
LOW are performed. During the first INTA cycle, the
PIC will determine the highest priority request. Assuming
this interrupt input has no external Slave
Controller cascaded to it, the 82380 will drive the
Data Bus with 00H in the first INTA cycle. During the
second INTA cycle, the 82380 PIC will drive the
Data Bus with the corresponding preprogrammed interrupt
vector.


If the PIC determines (from the ICW3) that this interrupt
input has an external Slave Controller cascaded
to it, it will drive the Data Bus with the specific Slave
Cascade Address (instead of 00H) during the first
INTA cycle. This Slave Cascade Address is the preprogrammed
content in the corresponding Vector
Register. This means that no Slave Address should
be chosen to be 00H. Note that the Slave Address
and Interrupt Vector are different interpretations of
the same thing. They are both the contents of the
programmable Vector Register. During the second
INTA cycle, the Data Bus will be floated so that the
external Slave Controller can drive its interrupt vector
on the bus. Since the Slave Interrupt Controller
resides on the system bus, bus transceiver enable
and direction control logic must take this into consideration.


In order to have a successful interrupt service, the
interrupt request input must be held active (LOW)
until the beginning of the first interrupt acknowledge
cycle. If there is no pending interrupt request when
the first INTA cycle is generated, the PIC will generate
a default vector, which is the IRQ7 vector (bank
A level 7).


According to the Bus Cycle definition of the 80386,
there will be four Bus Idle States between the two
interrupt acknowledge cycles. These idle bus cycles
will be initiated by the 80386. Also, during each interrupt
acknowledge cycle, the internal Wait State Generator
of the 82380 will automatically generate the
required number of wait states for internal delays. 
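

The data bus behavior over the two INTA cycles can be
summarized in a short Python sketch. It models a
single bank (Bank A, so that level 7 holds the default
vector). The function name, the argument layout, and
the use of None to represent a floated bus are all
assumptions made for illustration. The value driven in
the first cycle when no request is pending is also
assumed to be 00H, since the text does not state it.

```python
def inta_sequence(level, vector_regs, icw3, default_level=7):
    """Return (first_cycle_byte, second_cycle_byte) for one bank.

    level       -- highest priority pending level, or None if none pending
    vector_regs -- list of 8 programmed Vector Register values
    icw3        -- 8-bit mask; bit n set means level n has an external slave
    A second-cycle value of None means the 82380 floats the Data Bus.
    """
    if level is None:
        # No pending request: default vector (level 7 of Bank A).
        # Assumption: 00H is driven in the first cycle in this case.
        return 0x00, vector_regs[default_level]
    if icw3 & (1 << level):
        # Cascaded input: the Vector Register content is the Slave
        # Cascade Address; the external slave drives the second cycle.
        return vector_regs[level], None
    # Non-cascaded input: 00H first, then the programmed vector.
    return 0x00, vector_regs[level]
```

The model also shows why no Slave Cascade Address may
be 00H: external logic could not tell it apart from a
non-cascaded first cycle.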


4.4 Mode of Operation 


A variety of modes and commands are available for
controlling the 82380 PIC. All of them are programmable;
that is, they may be changed dynamically under
software control. In fact, each bank can be programmed
individually to operate in different modes.
With these modes and commands, many possible
configurations are conceivable, giving the user
enough versatility for almost any interrupt controlled
application.


This section is not intended to show how the 82380
PIC can be programmed. Rather, it describes the
operation in different modes.


4.4.1 END-OF-INTERRUPT 


Upon completion of an interrupt service routine, the
interrupted bank needs to be notified so its ISR can
be updated. This allows the PIC to keep track of
which interrupt levels are in the process of being
serviced and their relative priorities. Three different
End-Of-Interrupt (EOI) formats are available. They
are: Non-Specific EOI Command, Specific EOI Command,
and Automatic EOI Mode. Selection of which
EOI to use is dependent upon the interrupt operations
the user wishes to perform.


If the 82380 is NOT programmed in the Automatic
EOI Mode, an EOI command must be issued by the
80386 to the specific 82380 PIC Controller Bank.
Also, if this controller bank is cascaded to another
internal bank, an EOI command must also be sent to
the bank to which this bank is cascaded. For example,
if an interrupt request of Bank C in the 82380
PIC is serviced, an EOI should be written into Bank
C, Bank B and Bank A. If the request comes from an
external interrupt controller cascaded to Bank C,
then an EOI should be written into the external controller
as well.
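

The EOI rule for cascaded banks can be expressed as a
short walk up the cascade chain. The Python sketch
below lists which controllers need an EOI for a given
source bank. The names are hypothetical, and the order
of the returned list is only a convenient convention
of this sketch; the text above does not prescribe an
order for the writes.

```python
# Internal cascade chain from the text: C is cascaded to B, B to A.
CASCADE_PARENT = {"C": "B", "B": "A", "A": None}


def eoi_targets(bank, via_external_slave=False):
    """List every controller that must receive an EOI (non-AEOI mode)."""
    targets = []
    if via_external_slave:
        # Request came through an external controller on this bank.
        targets.append("external slave")
    while bank is not None:
        targets.append("Bank " + bank)
        bank = CASCADE_PARENT[bank]
    return targets
```

For a Bank C request this yields Bank C, Bank B and
Bank A, matching the example in the paragraph above.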


NON-SPECIFIC EOI COMMAND


A Non-Specific EOI command sent from the 80386
lets the 82380 PIC bank know when a service routine
has been completed, without specification of its
exact interrupt level. The respective interrupt bank
automatically determines the interrupt level and resets
the correct bit in the ISR.


To take advantage of the Non-Specific EOI, the interrupt
bank must be in a mode of operation in which
it can predetermine its in-service routine levels. For
this reason, the Non-Specific EOI command should
only be used when the most recent level acknowledged
and serviced is always the highest priority level
(i.e., in the Fully Nested Mode structure to be described
below). When the interrupt bank receives a
Non-Specific EOI command, it simply resets the
highest priority ISR bit to indicate that the highest
priority routine in service is finished.


Special consideration should be taken when deciding
to use the Non-Specific EOI command. Here are
two operating conditions in which it is best NOT
used since the Fully Nested Mode structure will be
destroyed:


— Using the Set Priority command within an interrupt
service routine.


— Using a Special Mask Mode.


These conditions are covered in more detail in their
own sections, but are listed here for reference.


SPECIFIC EOI COMMAND 


Unlike a Non-Specific EOI command which automat- 
ically resets the highest priority ISR bit, a Specific 
EOI command specifies an exact ISR bit to be reset. 
Any one of the IRQ levels of an interrupt bank can 
be specified in the command. 
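

The difference between the two EOI commands comes
down to which ISR bit is cleared. A minimal Python
sketch follows, assuming an 8-bit ISR per bank and a
priority order given from highest to lowest; the
function names are illustrative only.

```python
def non_specific_eoi(isr, priority_order=range(8)):
    """Clear the highest priority ISR bit that is currently set.

    priority_order lists levels from highest to lowest priority;
    the default corresponds to IRQ0 highest, IRQ7 lowest.
    """
    for level in priority_order:
        if isr & (1 << level):
            return isr & ~(1 << level) & 0xFF
    return isr  # nothing in service, nothing to clear


def specific_eoi(isr, level):
    """Clear exactly the ISR bit named by the command."""
    return isr & ~(1 << level) & 0xFF
```

With levels 4 and 6 in service, the non-specific form
clears bit 4, while the specific form can clear bit 6
directly.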


The Specific EOI command is needed to reset the
ISR bit of a completed service routine whenever the
interrupt bank is not able to automatically determine
it. The Specific EOI command can be used in all
conditions of operation, including those that prohibit
Non-Specific EOI command usage mentioned
above.


AUTOMATIC EOI MODE


When programmed in the Automatic EOI Mode, the
80386 no longer needs to issue a command to notify
the interrupt bank it has completed an interrupt routine.
The interrupt bank accomplishes this by performing
a Non-Specific EOI automatically at the end
of the second INTA cycle.


Special consideration should be taken when decid- 
ing to use the Automatic EOI Mode because it may 
disturb the Fully Nested Mode structure. In the Auto- 
matic EOI Mode, the ISR bit of a routine in service is 
reset right after it is acknowledged, thus leaving no 
designation in the ISR that a service routine is being 
executed. If any interrupt request within the same
bank occurs during this time and interrupts are enabled,
it will get serviced regardless of its priority.
Therefore, when using this mode, the 80386 should
keep its interrupt request input disabled during execution
of a service routine. By doing this, higher priority
interrupt levels will be serviced only after the
completion of a routine in service. This guideline restores
the Fully Nested Mode structure. However, in
this scheme, a routine in service cannot be interrupted
since the host’s interrupt request input is disabled.


4.4.2 INTERRUPT PRIORITIES 


The 82380 PIC provides various methods for arrang- 
ing the interrupt priorities of the interrupt request in- 
puts to suit different applications. The following sub- 
sections explain these methods in detail. 


4.4.2.1 Fully Nested Mode 


The Fully Nested Mode of operation is a general purpose
priority mode. This mode supports a multi-level
interrupt structure in which all of the Interrupt Request
(IRQ) inputs within one bank are arranged
from highest to lowest.


Unless otherwise programmed, the Fully Nested
Mode is entered by default upon initialization. At this
time, IRQ0# is assigned the highest priority (priority
= 0) and IRQ7# the lowest (priority = 7). This default
priority can be changed, as will be explained
later in the Rotating Priority Mode.


When an interrupt is acknowledged, the highest pri- 
ority request is determined from the Interrupt Re- 
quest Register (IRR) and its vector is placed on the 
bus. In addition, the corresponding bit in the In-Serv- 
ice Register (ISR) is set to designate the routine in 
service. This ISR bit will remain set until the 80386 
issues an End Of Interrupt (EOI) command immediately
before returning from the service routine; or
alternately, if the Automatic End Of Interrupt (AEOI) 
bit is set, the ISR bit will be reset at the end of the 
second INTA cycle. 




While the ISR bit is set, all further interrupts of the 
same or lower priority are inhibited. Higher level in- 
terrupts can still generate an interrupt, which will be 
acknowledged only if the 80386 internal interrupt en- 
able flip-flop has been re-enabled (through software 
inside the current service routine). 
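

The nesting rule can be checked with a small
predicate. The Python sketch below assumes the default
order within one bank (lower level number means higher
priority) and ignores the 80386 interrupt enable
flip-flop, which must also be re-enabled before a
higher level request is actually acknowledged. The
function name is hypothetical.

```python
def may_interrupt(request_level, isr):
    """True if a new request outranks every level now in service.

    A set ISR bit inhibits requests of the same or lower priority,
    so any in-service level numbered <= request_level blocks it.
    """
    for level in range(request_level + 1):
        if isr & (1 << level):
            return False
    return True
```

With level 4 in service, a level 2 request passes the
check, while requests at levels 4 and 6 are inhibited.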


4.4.2.2 Automatic Rotation—Equal Priority 
Devices 


Automatic rotation of priorities serves in applications 
where the interrupting devices are of equal priority 
within an interrupt bank. In this kind of environment, 
once a device is serviced, all other equal priority pe- 
ripherals should be given a chance to be serviced 
before the original device is serviced again. This is 
accomplished by automatically assigning a device 
the lowest priority after being serviced. Thus, in the 
worst case, the device would have to wait until all 
other peripherals connected to the same bank are 
serviced before it is serviced again. 


There are two methods of accomplishing automatic
rotation. One is used in conjunction with the Non-Specific
EOI command and the other is used with
the Automatic EOI mode. These two methods are
discussed below.


ROTATE ON NON-SPECIFIC EOI COMMAND


When the Rotate On Non-Specific EOI command is
issued, the highest ISR bit is reset as in a normal
Non-Specific EOI command. However, after it is reset,
the corresponding Interrupt Request (IRQ) level
is assigned the lowest priority. Other IRQ priorities
rotate to conform to the Fully Nested Mode based
on the newly assigned low priority.


Figure 4-4 shows how the Rotate On Non-Specific
EOI command affects the interrupt priorities. Assume
the IRQ priorities were assigned with IRQ0 the
highest and IRQ7 the lowest. IRQ6 and IRQ4 are
already in service but neither is completed. Being
the higher priority routine, IRQ4 is necessarily the
routine being executed. During the IRQ4 routine, a
Rotate On Non-Specific EOI command is executed.
When this happens, Bit 4 in the ISR is reset. IRQ4
then becomes the lowest priority and IRQ5 becomes
the highest.


Figure 4-4. Rotate On Non-Specific EOI Command


ROTATE ON AUTOMATIC EOI MODE


The Rotate On Automatic EOI Mode works much
like the Rotate On Non-Specific EOI Command. The
main difference is that priority rotation is done automatically
after the second INTA cycle of an interrupt
request. To enter or exit this mode, a Rotate-On-Automatic-EOI
Set Command and a Rotate-On-Automatic-EOI
Clear Command are provided. After this mode
is entered, no other commands are needed as in the
normal Automatic EOI Mode. However, it must be
noted again that when using any form of the Auto- 
matic EOI Mode, special consideration should be 
taken. The guideline presented in the Automatic EOI 
Mode also applies here. 
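

Both automatic rotation methods in this subsection end
with the same step: the level just serviced drops to
the bottom and the remaining levels rotate around it.
The Python sketch below models that step for one bank.
The function names are hypothetical, and the priority
state is represented simply by the level that
currently holds the lowest priority.

```python
def priority_order(lowest):
    """Levels from highest to lowest priority, given the lowest level."""
    return [(lowest + 1 + i) % 8 for i in range(8)]


def rotate_on_non_specific_eoi(isr, lowest):
    """Reset the highest priority ISR bit and make that level lowest.

    Returns (new_isr, new_lowest). If no ISR bit is set, nothing changes.
    """
    for level in priority_order(lowest):
        if isr & (1 << level):
            return isr & ~(1 << level) & 0xFF, level
    return isr, lowest
```

Starting with IRQ7 lowest and levels 4 and 6 in
service, the call resets bit 4 and makes level 4 the
lowest, so the new order begins with level 5.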


4.4.2.3 Specific Rotation—Specific Priority 


Specific rotation gives the user versatile capabilities 
in interrupt controlled operations. It serves in those 
applications in which a specific device’s interrupt pri- 
ority must be altered. As opposed to Automatic Ro- 
tation which will automatically set priorities after 
each interrupt request is serviced, specific rotation is 
completely user controlled. That is, the user selects 
which interrupt level is to receive the lowest or the
highest priority. This can be done during the main
program or within interrupt routines. Two specific rotation
commands are available to the user: Set Priority
Command and Rotate On Specific EOI Command.


SET PRIORITY COMMAND 


The Set Priority Command allows the programmer to 
assign an IRQ level the lowest priority. All other in- 
terrupt levels will conform to the Fully Nested Mode 
based on the newly assigned low priority. 


ROTATE ON SPECIFIC EOI COMMAND


The Rotate On Specific EOI Command is literally a
combination of the Set Priority Command and the
Specific EOI Command. Like the Set Priority Command,
a specified IRQ level is assigned lowest priority.
Like the Specific EOI Command, a specified level
will be reset in the ISR. Thus, this command accom- 
plishes both tasks in one single command. 
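

The two specific rotation commands can be modeled the
same way. In this Python sketch, which uses
hypothetical function names, the Set Priority Command
returns the new priority order and the Rotate On
Specific EOI Command also clears the named ISR bit.

```python
def set_priority(lowest_level):
    """Priority order (highest first) after assigning lowest_level."""
    return [(lowest_level + 1 + i) % 8 for i in range(8)]


def rotate_on_specific_eoi(isr, level):
    """Clear the specified ISR bit and give that level lowest priority.

    Returns (new_isr, new_priority_order).
    """
    new_isr = isr & ~(1 << level) & 0xFF
    return new_isr, set_priority(level)
```

Assigning level 2 the lowest priority makes level 3
the highest, with the rest following in rotation.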


4.4.2.4 Interrupt Priority Mode Summary 


In order to simplify understanding the many modes
of interrupt priority, Table 4-2 is provided to bring out
their summary of operations.


Table 4-2. Interrupt Priority Mode Summary


Interrupt Priority Mode: Fully-Nested Mode
  Operation Summary: IRQ0# - Highest Priority;
    IRQ7# - Lowest Priority.
  Effect On Priority After EOI
    Non-Specific/Automatic: No change in priority.
      Highest ISR bit is reset.
    Specific: Not Applicable.


Interrupt Priority Mode: Automatic Rotation
(Equal Priority Devices)
  Operation Summary: Interrupt level just serviced
    is the lowest priority. Other priorities rotate to
    conform to Fully-Nested Mode.
  Effect On Priority After EOI
    Non-Specific/Automatic: Highest ISR bit is reset
      and the corresponding level becomes the
      lowest priority.
    Specific: Not Applicable.


Interrupt Priority Mode: Specific Rotation
(Specific Priority Devices)
  Operation Summary: User specifies the lowest
    priority level. Other priorities rotate to conform
    to Fully-Nested Mode.
  Effect On Priority After EOI
    Non-Specific/Automatic: Not Applicable.
    Specific: As described under ‘Operation Summary’.


4.4.3 INTERRUPT MASKING
VIA INTERRUPT MASK REGISTER


Each bank in the 82380 PIC has an Interrupt Mask 
Register (IMR) which enhances interrupt control ca- 
pabilities. This IMR allows individual IRQ masking. 
When an IRQ is masked, its interrupt request is disabled
until it is unmasked. Each bit in the 8-bit IMR
disables one interrupt channel if it is set (HIGH). Bit
0 masks IRQ0, Bit 1 masks IRQ1 and so forth.
Masking an IRQ channel will only disable the corresponding
channel and does not affect the others’
operations.


The IMR acts only on the output of the IRR. That is, 
if an interrupt occurs while its IMR bit is set, this 
request is not ‘forgotten’. Even with an IRQ input 
masked, it is still possible to set the IRR. Therefore, 
when the IMR bit is reset, an interrupt request to the 
80386 will then be generated, providing that the IRQ 
request remains active. If the IRQ request is re- 
moved before the IMR is reset, the Default Interrupt 
Vector (Bank A, level 7) will be generated during the 
interrupt acknowledge cycle. 


SPECIAL MASK MODE


In the Fully Nested Mode, all IRQ levels of lower
priority than the routine in service are inhibited. However,
in some applications, it may be desirable to let
a lower priority interrupt request interrupt the routine
in service. One method to achieve this is by
using the Special Mask Mode. Working in conjunction
with the IMR, the Special Mask Mode enables
interrupts from all levels except the level in service.
This is usually done inside an interrupt service routine
by masking the level that is in service and then
issuing the Special Mask Mode Command. Once the
Special Mask Mode is enabled, it remains in effect
until it is disabled.


4.4.4 EDGE OR LEVEL INTERRUPT
TRIGGERING


Each bank in the 82380 PIC can be programmed
independently for either edge or level sensing for the
interrupt request signals. Recall that all IRQ inputs
are active LOW. Therefore, in the edge triggered
mode, an active edge is defined as an input transition
from an inactive (HIGH) to active (LOW) state.
The interrupt input may remain active without generating
another interrupt. During level triggered mode,
an interrupt request will be recognized by an active
(LOW) input, and there is no need for edge detection.
However, the interrupt request must be removed
before the EOI Command is issued, or the
80386 interrupt input must be disabled to prevent a
second false interrupt from occurring.


In either mode, the interrupt request input must be
active (LOW) during the first INTA cycle in order to
be recognized. Otherwise, the Default Interrupt Vector
will be generated at level 7 of Bank A.


4.4.5 INTERRUPT CASCADING


As mentioned previously, the 82380 allows for external
Slave interrupt controllers to be cascaded to any
of its external interrupt request pins. The 82380 PIC
indicates that an external Slave Controller is to be
serviced by putting the contents of the Vector Register
associated with the particular request on the
80386 Data Bus during the first INTA cycle (instead
of 00H during a non-slave service). The external logic
should latch the vector on the Data Bus using the
INTA status signals and use it to select the external
Slave Controller to be serviced (see Figure 4-5). The
selected Slave will then respond to the second INTA
cycle and place its vector on the Data Bus. This
method requires that if external Slave Controllers
are used in the system, no vector should be programmed
to 00H.


[Figure 4-5 labels: DATA BUS; INTA# (from bus controller).]


Figure 4-5. Slave Cascade Address Capturing


Since the external Slave Cascade Address is provid- 
ed on the Data Bus during INTA cycle 1, an external 
latch is required to capture this address for the Slave 
Controller. A simple scheme is depicted in Figure 
4-5. 


4.4.5.1 Special Fully Nested Mode 


This mode will be used where cascading is em- 
ployed and the priority is to be conserved within 
each Slave Controller. The Special Fully Nested 
Mode is similar to the ‘regular’ Fully Nested Mode 
with the following exceptions: 


— When an interrupt request from a Slave Controller
is in service, this Slave Controller is not
locked out from the Master’s priority logic. Further
interrupt requests from the higher priority
logic within the Slave Controller will be recognized
by the 82380 PIC and will initiate interrupts
to the 80386. By comparison, in the ‘regular’ Fully
Nested Mode, the Slave Controller is masked out
when its request is in service and no higher requests
from the same Slave Controller can be
serviced.


— Before exiting the interrupt service routine, the
software has to check whether the interrupt serviced
was the only request from the Slave Controller.
This is done by sending a Non-Specific
EOI Command to the Slave Controller and then
reading its In-Service Register. If there are no
requests in the Slave Controller, a Non-Specific
EOI can be sent to the corresponding 82380 PIC
bank also. Otherwise, no EOI should be sent.
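

The exit procedure in the second item above can be
sketched as follows. This Python model only tracks the
Slave Controller's ISR as an 8-bit value and assumes
the slave uses the default priority order; the
function name and return convention are hypothetical.

```python
def finish_slave_service(slave_isr, slave_order=range(8)):
    """Model the end of a service routine in Special Fully Nested Mode.

    Step 1: send a Non-Specific EOI to the slave (clear its highest
            priority in-service bit).
    Step 2: read the slave ISR; only if it is now empty should a
            Non-Specific EOI also go to the 82380 PIC bank.
    Returns (new_slave_isr, send_eoi_to_82380_bank).
    """
    for level in slave_order:
        if slave_isr & (1 << level):
            slave_isr &= ~(1 << level) & 0xFF
            break
    return slave_isr, slave_isr == 0
```

If the slave still has another level in service after
its own EOI, the second element is False and the 82380
bank is left untouched.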


4.4.6 READING INTERRUPT STATUS 


The 82380 PIC provides several ways to read differ- 
ent status of each interrupt bank for more flexible 
interrupt control operations. These include polling 
the highest priority pending interrupt request and 
reading the contents of different interrupt status reg- 
isters. 


4.4.6.1 Poll Command 


The 82380 PIC supports status polling operations
with the Poll Command. In a Poll Command, the
pending interrupt request with the highest priority
can be determined. To use this command, the INT
output is not used, or the 80386 interrupt is disabled.
Service to devices is achieved by software using the
Poll Command.


This mode is useful if there is a routine command
common to several levels so that the INTA sequence
is not needed. Another application is to use
the Poll Command to expand the number of priority
levels.


Notice that the ICW2 mechanism is not supported
for the Poll Command. However, if the Poll Command
is used, the programmable Vector Registers
are of no concern since no INTA cycle will be generated.


4.4.6.2 Reading Interrupt Registers 


The contents of each interrupt register (IRR, ISR, 
and IMR) can be read to update the user’s program 
on the present status of the 82380 PIC. This can be 
a versatile tool in the decision making process of a 
service routine, giving the user more control over 
interrupt operations. 


The reading of the IRR and ISR contents can be 
performed via the Operation Control Word 3 by us- 
ing a Read Status Register Command and the con- 
tent of IMR can be read via a simple read operation 
of the register itself. 


4.5 Register Set Overview 


Each bank of the 82380 PIC consists of a set of 8-bit 
registers to control its operations. The address map 
of all the registers is shown in Table 4-3. Since all 
three register sets are identical in functions, only 
one set will be described. 


Functionally, each register set can be divided into 
five groups. They are: the four Initialization Com- 
mand Words (ICW’s), the three Operation Control 
Words (OCW’s), the Poll/Interrupt Request/In-Serv- 
ice Register, the Interrupt Mask Register, and the 
Vector Registers. A description of each group follows.


Table 4-3. Interrupt Controller Register Address Map

Access       Register Description
----------   ---------------------------------------------------
Bank B
Write        Bank B ICW1, OCW2, or OCW3
Read         Bank B Poll, Request or In-Service Status Register
Write        Bank B ICW2, ICW3, ICW4, OCW1
Read         Bank B Mask Register
Read         Bank B ICW2
Read/Write   IRQ8 Vector Register
Read/Write   IRQ9 Vector Register
Read/Write   Reserved
Read/Write   IRQ11 Vector Register
Read/Write   IRQ12 Vector Register
Read/Write   IRQ13 Vector Register
Read/Write   IRQ14 Vector Register
Read/Write   IRQ15 Vector Register

Bank C
Write        Bank C ICW1, OCW2, or OCW3
Read         Bank C Poll, Request or In-Service Status Register
Write        Bank C ICW2, ICW3, ICW4, OCW1
Read         Bank C Mask Register
Read         Bank C ICW2
Read/Write   IRQ16 Vector Register
Read/Write   IRQ17 Vector Register
Read/Write   IRQ18 Vector Register
Read/Write   IRQ19 Vector Register
Read/Write   IRQ20 Vector Register
Read/Write   IRQ21 Vector Register
Read/Write   IRQ22 Vector Register
Read/Write   IRQ23 Vector Register

Bank A
Write        Bank A ICW1, OCW2, or OCW3
Read         Bank A Poll, Request or In-Service Status Register
Write        Bank A ICW2, ICW3, ICW4, OCW1
Read         Bank A Mask Register
Read         Bank A ICW2
Read/Write   IRQ0 Vector Register
Read/Write   IRQ1 Vector Register
Read/Write   IRQ1.5 Vector Register
Read/Write   IRQ3 Vector Register
Read/Write   IRQ4 Vector Register
Read/Write   Reserved
Read/Write   Reserved
Read/Write   IRQ7 Vector Register




4.5.1 INITIALIZATION COMMAND WORDS (ICW) 


Before normal operation can begin, the 82380 PIC
must be brought to a known state. There are four
8-bit Initialization Command Words in each interrupt
bank to set up the necessary conditions and modes
for proper operation. Except for the second command
word (ICW2), which is a read/write register, the other
three are write-only registers. Without going into
detail on the bit definitions of the command words, the
following subsections give a brief description of what
functions each command word controls.


ICW1 


ICW1 has three major functions. They are:


— To select between the two IRQ input triggering
modes (edge- or level-triggered);


— To designate whether the interrupt bank is
to be used alone or in the cascade mode. If the
cascade mode is desired, the interrupt bank will
accept ICW3 for further cascade mode programming.
Otherwise, no ICW3 will be accepted;


— To determine whether or not ICW4 will be issued;
that is, if any of the ICW4 operations are to be
used.
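
The three ICW1 choices above can be sketched as a small helper that builds the byte and reports how many ICW's the bank will then expect. This is only an illustration: the bit positions are assumptions borrowed from the 82C59A layout (they are not stated in this section and should be checked against the ICW1 bit definition), and the names `make_icw1` and `icw_count` are hypothetical.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed 82C59A-style bit positions; verify against the ICW1 figure. */
#define ICW1_IC4        0x01u  /* 1 = ICW4 needed */
#define ICW1_NO_CASCADE 0x02u  /* 1 = no external cascade (ICW3 not needed) */
#define ICW1_LTIM       0x08u  /* 1 = level triggered, 0 = edge triggered */
#define ICW1_SELECT     0x10u  /* assumed marker bit identifying ICW1 */

/* Build an ICW1 value from the three choices described in the text. */
uint8_t make_icw1(bool level_triggered, bool external_cascade, bool need_icw4)
{
    uint8_t v = ICW1_SELECT;
    if (level_triggered)   v |= ICW1_LTIM;
    if (!external_cascade) v |= ICW1_NO_CASCADE;
    if (need_icw4)         v |= ICW1_IC4;
    return v;
}

/* Number of ICW bytes in the sequence: ICW1 and ICW2 are always issued,
 * ICW3 only in external cascade mode, ICW4 only if selected in ICW1. */
int icw_count(uint8_t icw1)
{
    int n = 2;
    if (!(icw1 & ICW1_NO_CASCADE)) n++;
    if (icw1 & ICW1_IC4)           n++;
    return n;
}
```

Under these assumptions, an edge-triggered, non-cascaded bank without ICW4 needs only two command words, while a cascaded bank using ICW4 needs all four.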


ICW2 


ICW2 is provided for compatibility with the 82C59A
only. Its contents do not affect the operation of the
interrupt bank in any way. Whenever the ICW2 of
any of the three banks is written into, an interrupt is
generated from Bank A at level 1.5. The interrupt
request will be cleared after the ICW2 register has
been read by the 80386. The user is expected to
program the corresponding vector register or to use
it as an indicator that an attempt was made to alter
the contents. Note that each ICW2 register has
different addresses for read and write operations.


ICW3 


The interrupt bank will only accept an ICW3 if pro- 
grammed in the external cascade mode (as indicat- 
ed in ICW1). ICW3 is used for specific programming 
within the cascade mode. The bits in ICW3 indicate 
which interrupt request inputs have a Slave cascad- 
ed to them. This will subsequently affect the inter- 
rupt vector generation during the interrupt acknowl- 
edge cycles as described previously. 


ICW4


ICW4 is accepted only if it was selected in
ICW1. This command word register serves two functions:


— To select either the Automatic EOI mode or
software EOI mode;


— To select if the Special Nested mode is to be
used in conjunction with the cascade mode.


4.5.2 OPERATION CONTROL WORDS (OCW) 


Once initialized by the ICW’s, the interrupt banks will 
be operating in the Fully Nested Mode by default 
and they are ready to accept interrupt requests. 
However, the operations of each interrupt bank can 
be further controlled or modified by the use of 
OCW’s. Three OCW’s are available for programming 
various modes and commands. Note that all OCW’s 
are 8-bit write-only registers. 


The modes and operations controlled by the OCW’s 
are: 

— Fully Nested Mode; 

— Rotating Priority Mode; 

— Special Mask Mode; 

— Poll Mode; 

— EOI Commands;

— Read Status Commands. 


OCW1


OCW1 is used solely for masking operations. It pro- 
vides a direct link to the Interrupt Mask Register 
(IMR). The 80386 can write to this OCW register to 
enable or disable the interrupt inputs. Reading the 
pre-programmed mask can be done via the Interrupt 
Mask Register which will be discussed shortly. 
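
As an illustration of the masking described above, the helpers below compute a new OCW1 value from the current mask. They assume that bit i of OCW1 corresponds to interrupt level i within the bank (M0-M7), with 1 meaning masked; the function names are hypothetical and no port I/O is performed.

```c
#include <stdint.h>

/* Mi = 1 masks (disables) level i; Mi = 0 enables it.
 * `level` is the level within the bank (0-7). */
uint8_t imr_disable(uint8_t imr, unsigned level)
{
    return (uint8_t)(imr | (1u << (level & 7u)));
}

uint8_t imr_enable(uint8_t imr, unsigned level)
{
    return (uint8_t)(imr & ~(1u << (level & 7u)));
}

/* Returns 1 if the given level is masked in `imr`, else 0. */
int imr_is_masked(uint8_t imr, unsigned level)
{
    return (int)((imr >> (level & 7u)) & 1u);
}
```

A typical use would be to read the IMR, pass it through one of these helpers, and write the result back as OCW1.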


OCW2 


OCW2 is used to select End-Of-Interrupt, Automatic
Priority Rotation, and Specific Priority Rotation
operations. Associated commands and modes of these
operations are selected using the different
combinations of bits in OCW2.


Specifically, the OCW2 is used to: 


— Designate an interrupt level (0-7) to be used to
reset a specific ISR bit or to set a specific priority.
This function can be enabled or disabled;


— Select which software EOI command (if any) is to
be executed (i.e., Non-Specific or Specific EOI);


— Enable one of the priority rotation operations
(i.e., Rotate On Non-Specific EOI, Rotate On
Automatic EOI, or Rotate On Specific EOI).
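
A minimal sketch of how some of these OCW2 commands could be composed. The bit names R, SL, EOI and L0-L2 come from the register summary, but their positions here (R = D7, SL = D6, EOI = D5, level in D2-D0) are an assumption based on the 82C59A layout, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed 82C59A-style bit positions for OCW2. */
#define OCW2_R   0x80u  /* rotate */
#define OCW2_SL  0x40u  /* select level: L2-L0 are used */
#define OCW2_EOI 0x20u  /* end of interrupt */

uint8_t ocw2_nonspecific_eoi(void)
{
    return (uint8_t)OCW2_EOI;
}

uint8_t ocw2_specific_eoi(unsigned level)
{
    return (uint8_t)(OCW2_SL | OCW2_EOI | (level & 7u));
}

uint8_t ocw2_rotate_on_specific_eoi(unsigned level)
{
    return (uint8_t)(OCW2_R | OCW2_SL | OCW2_EOI | (level & 7u));
}

uint8_t ocw2_set_priority(unsigned level)
{
    return (uint8_t)(OCW2_R | OCW2_SL | (level & 7u));
}
```

The level argument is the level within the bank (0-7), not the global IRQ number.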


OCW3 


There are three main categories of operation that
OCW3 controls. They are summarized as follows:


— To select and execute the Read Status Register
Commands, either reading the Interrupt Request
Register (IRR) or the In-Service Register (ISR);

— To issue the Poll Command. The Poll Command 
will override a Read Register Command if both 
functions are enabled simultaneously; 


— To set or reset the Special Mask Mode. 
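
The three OCW3 functions can be illustrated with the following sketch. It assumes the 82C59A-style layout (D6 = ESMM, D5 = SMM, D3 fixed at 1, D2 = P, D1 = RR, D0 = RIS); treat those positions and the helper names as assumptions to be checked against the OCW3 bit definition.

```c
#include <stdint.h>

/* Assumed 82C59A-style bit positions for OCW3. */
#define OCW3_FIXED 0x08u  /* assumed fixed bit that identifies OCW3 */
#define OCW3_ESMM  0x40u  /* enable Special Mask Mode change */
#define OCW3_SMM   0x20u  /* 1 = set, 0 = reset Special Mask Mode */
#define OCW3_P     0x04u  /* Poll Command */
#define OCW3_RR    0x02u  /* Read Register command enable */
#define OCW3_RIS   0x01u  /* 0 = read IRR, 1 = read ISR */

uint8_t ocw3_read_irr(void) { return (uint8_t)(OCW3_FIXED | OCW3_RR); }
uint8_t ocw3_read_isr(void) { return (uint8_t)(OCW3_FIXED | OCW3_RR | OCW3_RIS); }
uint8_t ocw3_poll(void)     { return (uint8_t)(OCW3_FIXED | OCW3_P); }

/* enable != 0 enters Special Mask Mode, enable == 0 leaves it. */
uint8_t ocw3_special_mask(int enable)
{
    return (uint8_t)(OCW3_FIXED | OCW3_ESMM | (enable ? OCW3_SMM : 0u));
}
```

Because the Poll Command overrides a Read Register Command, the sketch never sets P together with RR.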


4.5.3 POLL/INTERRUPT REQUEST/IN-SERVICE 
STATUS REGISTER 


As the name implies, this 8-bit read-only register has
multiple functions. Depending on the command issued
in the OCW3, the content of this register reflects
the result of the command executed. For a
Poll Command, the register read contains the binary
code of the highest priority level requesting service
(if any). For a Read IRR Command, the register
content will show the current pending interrupt
request(s). Finally, for a Read ISR Command, this
register will specify all interrupt levels which are being
serviced.
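
As a hedged illustration of interpreting the byte read after a Poll Command, the sketch below assumes the 82C59A convention: D7 set means an interrupt is pending, and D2-D0 hold the binary code of the highest priority level requesting service. The function name and the use of -1 for "nothing pending" are choices made for this example only.

```c
#include <stdint.h>

/* Decode a poll status byte (bit positions assumed, see lead-in).
 * Returns the requesting level (0-7), or -1 if no interrupt is pending. */
int poll_decode(uint8_t status)
{
    if ((status & 0x80u) == 0u) {
        return -1;               /* pending flag clear: nothing to service */
    }
    return (int)(status & 0x07u);
}
```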


4.5.4 INTERRUPT MASK REGISTER (IMR) 


This is a read-only 8-bit register which, when read, 
will specify all interrupt levels within the same bank 
that are masked. 


4.5.5 VECTOR REGISTER (VR) 


Each interrupt request input has an 8-bit read/write 
programmable vector register associated with it. The 
registers should be programmed to contain the inter- 
rupt vector for the corresponding request. The con- 
tents of the Vector Register will be placed on the 
Data Bus during the INTA cycles as described previ- 
ously. 


4.6 Programming 


Programming the 82380 PIC is accomplished by using
two types of command words: ICW’s and
OCW’s. All modes and commands explained in the
previous sections are programmable using the
ICW’s and OCW’s. The ICW’s are issued from the
80386 in a sequential format and are used to set up
the banks in the 82380 PIC in an initial state of
operation. The OCW’s are issued as needed to vary and
control the 82380 PIC’s operations.


Both ICW’s and OCW’s are sent by the 80386 to the
interrupt banks via the Data Bus. Each bank
distinguishes between the different ICW’s and OCW’s by
the I/O address map, the sequence they are issued
(ICW’s only), and by some dedicated bits among the
ICW’s and OCW’s.


All three interrupt banks are programmed in a similar
way. Therefore, only a single bank will be described.


4.6.1 INITIALIZATION (ICW)


Before normal operation can begin, each bank must 
be initialized by programming a sequence of two to 
four bytes written into the ICW's. 


Figure 4-6 shows the initialization flow for an
interrupt bank. Both ICW1 and ICW2 must be issued for
any form of operation. However, ICW3 and ICW4 are
used only if designated in ICW1. Once initialized, if
any programming changes within the ICW’s are to
be made, the entire ICW sequence must be
reprogrammed, not just an individual ICW.


Note that although the ICW2’s in the 82380 PIC do 
not affect the Bank’s operation, they still must be 
programmed in order to preserve the compatibility 
with the 82C59A. The contents programmed are not 
relevant to the overall operations of the interrupt 
banks. Also, whenever one of the three ICW2’s is 
programmed, an interrupt level 1.5 in Bank A will be 
generated. This interrupt request will be cleared 
upon reading of the ICW2 registers. Since the three 
ICW2’s share the same interrupt level and the sys- 
tem may not know the origin of the interrupt, all three 
ICW2’s must be read. 


However, it is not necessary to provide an interrupt
service routine for the ICW2 interrupt. One way to
avoid this is as follows. At the beginning of the
initialization of the interrupt banks, the 80386 interrupt
should be disabled. After each ICW2 register write
operation is performed during the initialization, the
corresponding ICW2 register is read. This read
operation will clear the interrupt request of the 82380. At
the end of the initialization, the 80386 interrupt is
re-enabled. With this method, the 80386 will not detect
the ICW2 interrupt request, thus eliminating the need
for an interrupt service routine.
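
The write-then-read-back method just described can be sketched against a toy software model. Nothing here touches real ports: `struct bank_model` and its helpers are hypothetical, each bank's pending flag stands in for the shared level 1.5 request, and the CPU interrupt disable and enable steps are only marked by comments.

```c
#include <stdint.h>
#include <stdbool.h>

/* Toy model of the ICW2 side effect in one bank (not real port I/O). */
struct bank_model {
    uint8_t icw2;
    bool    icw2_irq_pending;  /* models the level 1.5 request */
};

/* Writing ICW2 raises the request; reading ICW2 clears it. */
void model_write_icw2(struct bank_model *b, uint8_t v)
{
    b->icw2 = v;
    b->icw2_irq_pending = true;
}

uint8_t model_read_icw2(struct bank_model *b)
{
    b->icw2_irq_pending = false;
    return b->icw2;
}

/* Returns true if no ICW2 request is left pending after initialization. */
bool init_banks_without_isr(struct bank_model banks[3], const uint8_t icw2[3])
{
    /* CPU interrupts would be disabled here. */
    for (int i = 0; i < 3; i++) {
        /* ICW1 would be written first (omitted in this model). */
        model_write_icw2(&banks[i], icw2[i]);
        (void)model_read_icw2(&banks[i]);  /* read-back clears the request */
        /* ICW3 and ICW4 follow here if selected in ICW1. */
    }
    /* CPU interrupts would be re-enabled here. */
    for (int i = 0; i < 3; i++) {
        if (banks[i].icw2_irq_pending) {
            return false;
        }
    }
    return true;
}
```

The point of the sketch is the ordering: every ICW2 write is paired with a read before interrupts are re-enabled, so no request survives to be seen by the 80386.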


Certain internal setup conditions occur automatically 
within the interrupt bank after the first ICW (ICW1) 
has been issued. They are: 


— The edge sensitive circuit is reset, which means
that following initialization, an interrupt request
input must make a HIGH-to-LOW transition to
generate an interrupt;


— The Interrupt Mask Register (IMR) is cleared;
that is, all interrupt inputs are enabled;

— IRQ7 input of each bank is assigned priority 7
(lowest);

— Special Mask Mode is cleared and Status Read
is set to IRR;


— If no ICW4 is needed, then no Automatic-EOI is
selected.


[Flowchart labels: DISABLE INTERRUPT; PROGRAM VECTOR(S)*; (ICW2 INTERRUPT GENERATED); EXTERNAL CASCADE MODE; NO (IC4 = 0); ENABLE INTERRUPT; (ALLOW SERVICING OF ICW2 INTERRUPT); READY TO ACCEPT INTERRUPT REQUESTS]

*ICW2 vector address must be programmed now. Other vector addresses may be programmed via ICW2 interrupt service routine.

290128-55


Figure 4-6. Initialization Sequence


4.6.2 VECTOR REGISTERS (VR) 


Each interrupt request input has a separate Vector 
Register. These Vector Registers are used to store 
the pre-programmed vector number corresponding 
to their interrupt sources. In order to guarantee prop- 
er interrupt handling, all Vector Registers must be 
programmed with the predefined vector numbers. 
Since an interrupt request will be generated whenever
an ICW2 is written during the initialization
sequence, it is important that the Vector Register of
IRQ1.5 in Bank A be initialized and the interrupt
service routine for this vector be set up before the
ICW’s are written.


4.6.3 OPERATION CONTROL WORDS (OCW) 


After the ICW’s are programmed, the operations of 
each interrupt controller bank can be changed by 
writing into the OCW’s as explained before. There is 
no special programming sequence required for the 
OCW’s. Any OCW may be written at any time in order
to change the mode of or to perform certain
operations on the interrupt banks.


4.6.3.1 Read Status and Poll Commands (OCW3) 


Since the reading of IRR and ISR status as well as
the result of a Poll Command are available on the
same read-only Status Register, a special Read
Status/Poll Command must be issued before the
Poll/Interrupt Request/In-Service Status Register is
read. This command can be specified by writing the
required control word into OCW3. As mentioned
earlier, if both the Poll Command and the Status Read
Command are enabled simultaneously, the Poll
Command will override the Status Read. That is,
after the command execution, the Status Register will
contain the result of the Poll Command.


Note that for reading IRR and ISR, there is no need
to issue a Read Status Command to the OCW3
every time the IRR or ISR is to be read. Once a Read
Status Command is received by the interrupt bank, it
‘remembers’ which register is selected. However,
this is not true when the Poll Command is used.


In the Poll Command, after the OCW3 is written, the
82380 PIC treats the next read to the Status Register
as an interrupt acknowledge. This will set the
appropriate IS bit if there is a request and read the
priority level. Interrupt Request input status remains
unchanged from the Poll Command to the Status
Read.


In addition to the above read commands, the
Interrupt Mask Register (IMR) can also be read. When
read, this register reflects the contents of the
pre-programmed OCW1 which contains information on
which interrupt request(s) is(are) currently disabled.


4.7 Register Bit Definition


INITIALIZATION COMMAND WORD 1 (ICW1)

D7 D6 D5 D4 D3 D2 D1 D0

Trigger mode:  0 = EDGE TRIGGERED
               1 = LEVEL TRIGGERED

Cascade:       0 = EXTERNAL CASCADE (ICW3 NEEDED)
               1 = NO EXTERNAL CASCADE (ICW3 NOT NEEDED)

ICW4:          0 = NO ICW4 NEEDED
               1 = ICW4 NEEDED

290128-56


INITIALIZATION COMMAND WORD 2 (ICW2)

CONTENT IS NOT RELEVANT TO THE ACTUAL
OPERATION OF THE BANK BUT CAN BE READ
BY THE INTERRUPT SERVICE ROUTINE TO
DETERMINE WHERE THE INTERRUPT VECTORS
OF EACH BANK START.

290128-57


INITIALIZATION COMMAND WORD 3 (ICW3)
ICW3 for Bank A:

0 = NO SLAVE CASCADED TO BANK A
1 = THERE IS A SLAVE CASCADED TO TOUT2#/IRQ3# PIN

290128-B4


ICW3 for Bank B:

0 = NO CASCADED REQUEST TO IRQn
1 = THERE IS A CASCADED REQUEST CONNECTED TO IRQn
    (i.e. THE CORRESPONDING INTERRUPT REQUEST INPUTS)

290128-B5


ICW3 for Bank C:

0 = NO CASCADED REQUEST TO IRQn
1 = THERE IS A CASCADED REQUEST CONNECTED TO IRQn

290128-B6


INITIALIZATION COMMAND WORD 4 (ICW4)

0 = NORMAL EOI
1 = AUTOMATIC EOI

0 = NOT SPECIAL FULLY NESTED MODE
1 = SPECIAL FULLY NESTED MODE

290128-58


OPERATION CONTROL WORD 1 (OCW1)

Mi = 1  MASK SET (INTERRUPT DISABLE)
Mi = 0  MASK RESET (INTERRUPT ENABLE)

290128-59


OPERATION CONTROL WORD 2 (OCW2)


INTERRUPT LEVEL TO BE ACTED UPON (L2-L0)

NON-SPECIFIC EOI COMMAND
SPECIFIC EOI COMMAND (L2-L0 USED)
ROTATE ON NON-SPECIFIC EOI
ROTATE ON AUTO-EOI MODE (SET)
ROTATE ON AUTO-EOI MODE (CLEAR)
ROTATE ON SPECIFIC EOI (L2-L0 USED)
SET PRIORITY (L2-L0 USED)
NO OPERATION

290128-60


OPERATION CONTROL WORD 3 (OCW3)

D7    D6    D5    D4    D3    D2    D1    D0
0     ESMM  SMM   0     1     P     RR    RIS

ESMM  SMM
0     0     NO ACTION
0     1     NO ACTION
1     0     RESET SPECIAL MASK
1     1     SET SPECIAL MASK

P
1 = POLL COMMAND
0 = NO POLL COMMAND

RR    RIS
0     0     NO ACTION
0     1     NO ACTION
1     0     READ IR REG. (STATUS)
1     1     READ IS REG. (STATUS)

290128-61

ESMM—Enable Special Mask Mode. When this bit is set to 1, it enables the SMM bit to set or reset the Special Mask
Mode. When this bit is set to 0, SMM bit becomes don’t care.


SMM—Special Mask Mode. If ESMM = 1 and SMM = 1, the interrupt controller bank will enter Special Mask Mode. If
ESMM = 1 and SMM = 0, the bank will revert to normal mask mode. When ESMM = 0, SMM has no effect.


Poll/Interrupt Request/In-Service Status Register


POLL COMMAND STATUS

D7 D6 D5 D4 D3 D2 D1 D0

BINARY CODE OF THE HIGHEST PRIORITY LEVEL REQUESTING SERVICE

0 = NO PENDING INTERRUPT
1 = PENDING INTERRUPT

290128-62


INTERRUPT REQUEST STATUS


D7    D6    D5    D4    D3    D2    D1    D0
IRQ7  IRQ6  IRQ5  IRQ4  IRQ3  IRQ2  IRQ1  IRQ0

IF IRQ BIT IS: 0 = NO REQUEST
               1 = REQUEST PENDING

290128-63


NOTE:

Although all Interrupt Request inputs are active LOW, the internal logic will invert the state of the pins so that when
there is a pending interrupt request at the input, the corresponding IRQ bit will be set to HIGH in the Interrupt Request
Status register.


IN-SERVICE STATUS

IF IS BIT IS: 0 = NOT IN-SERVICE
              1 = REQUEST IS IN-SERVICE

290128-64


VECTOR REGISTER (VR)

D7-D0: 8-BIT VECTOR NUMBER

290128-65


4.8 Register Operational Summary 


For ease of reference, Table 4-4 gives a summary of the different operating modes and commands with their 
corresponding registers. 


Table 4-4. Register Operational Summary

Operational Description               Command Words   Bits
-----------------------------------   -------------   --------------
Fully Nested Mode                     OCW-Default     —
Non-Specific EOI Command              OCW2            EOI
Specific EOI Command                  OCW2            SL, EOI, L0-L2
Automatic EOI Mode                    ICW1, ICW4      IC4, AEOI
Rotate On Non-Specific EOI Command    OCW2            EOI
Rotate On Automatic EOI Mode          OCW2            R, SL, EOI
Set Priority Command                  OCW2            L0-L2
Rotate On Specific EOI Command        OCW2            R, SL, EOI
Interrupt Mask Register               OCW1            M0-M7
Special Mask Mode                     OCW3            ESMM, SMM
Level Triggered Mode                  ICW1            LTIM
Edge Triggered Mode                   ICW1            LTIM
Read Register Command, IRR            OCW3            RR, RIS
Read Register Command, ISR            OCW3            RR, RIS
Read IMR                              IMR             M0-M7
Poll Command                          OCW3            P
Special Fully Nested Mode             ICW1, ICW4      IC4, SFNM


5.0 PROGRAMMABLE INTERVAL
TIMER


5.1 Functional Description 


The 82380 contains four independently programmable
Interval Timers: Timer 0-3. All four timers are
functionally compatible with the Intel 82C54. The first
three timers (Timer 0-2) have specific functions.
The fourth timer, Timer 3, is a general purpose timer.
Table 5-1 depicts the functions of each timer. A brief
description of each timer’s function follows.


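Table 5-1. Programmable
Interval Timer Functions

Timer     Output          Function
-------   -------------   ------------------------------
Timer 0   IRQ8            Event Based IRQ8 Generator
Timer 1   TOUT1/REF#      Gen. Purpose/DRAM Refresh Req.
Timer 2   TOUT2#/IRQ3#    Gen. Purpose/Speaker Out/IRQ3#
Timer 3   TOUT3#          Gen. Purpose/IRQ0 Generator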




TIMER 0—Event Based IRQ8 Generator


Timer 0 is intended to be used as an Event Counter.
The output of this timer will generate an Interrupt
Request 8 (IRQ8) upon a rising edge of the timer
output (TOUT0). Typically, this timer is used to
implement a time-of-day clock or system tick. The
Timer 0 output is not available as an external signal.


TIMER 1— General Purpose/DRAM Refresh 
Request 


The output of Timer 1, TOUT1, can be used as a 
general purpose timer or as a DRAM Refresh Re- 
quest signal. The rising edge of this output creates a 
DRAM refresh request to the 82380 DRAM Refresh 
Controller. Upon reset, the Refresh Request function
is disabled, and the output pin is the Timer 1
output.


TIMER 2—General Purpose/Speaker Out/IRQ3# 


The Timer 2 output, TOUT2#, could be used to support
tone generation to an external speaker. This pin
is a bidirectional signal. When used as an input, a
logic LOW asserted at this pin will generate an
Interrupt Request 3 (IRQ3#) (see Programmable
Interrupt Controller).


[Block diagram: OUT0 feeds an edge detector that generates IRQ8 (internal); OUT1 feeds the refresh controller (REF#) and the TOUT1/REF# pin, selected by REF ENABLE (internal); OUT2 drives the open-collector TOUT2#/IRQ3# pin, which also feeds IRQ3# (internal) to Bank A; OUT3 feeds an edge detector that generates IRQ0 (internal) to Bank A and drives TOUT3#.]

290128-66


Figure 5-1. Block Diagram of Programmable Interval Timer


TIMER 3—General Purpose/Interrupt Request 0
Generator


The output of Timer 3 is fed to an edge detector and
generates an Interrupt Request 0 (IRQ0) in the
82380. The inverted output of this timer (TOUT3#)
is also available as an external signal for general
purpose use.


5.1.1 INTERNAL ARCHITECTURE 


The functional block diagram of the Programmable 
Interval Timer section is shown in Figure 5-1. Follow- 
ing is a description of each block. 


DATA BUFFER & READ/WRITE LOGIC 


This part of the Programmable Interval Timer is used 
to interface the four timers to the 82380 internal bus. 
The Data Buffer is for transferring commands and 
data between the 8-bit internal bus and the timers. 


The Read/Write Logic accepts inputs from the inter- 
nal bus and generates signals to control other func- 
tional blocks within the timer section. 


CONTROL WORD REGISTERS I & II


The Control Word Registers are write-only registers.
They are used to control the operating modes of the
timers. Control Word Register I controls Timers 0, 1
and 2, and Control Word Register II controls Timer
3. A detailed description of the Control Word
Registers is included in the Register Set Overview
section.


COUNTER 0, COUNTER 1, COUNTER 2, COUNTER 3


Counters 0, 1, 2, and 3 are the major parts of Timers 
0, 1, 2, and 3, respectively. These four functional 
blocks are identical in operation, so only a single 
counter will be described. The internal block dia- 
gram of one counter is shown in Figure 5-2. 


[Block diagram: internal bus, Control Word Register, Status Register, Control Logic, and the GATEn, CLKn and OUTn signals.]


Figure 5-2. Internal Block Diagram of a Counter


The four counters share a common clock input
(CLKIN), but otherwise are fully independent. Each
counter is programmable to operate in a different
Mode.


Although the Control Word Register is shown in
Figure 5-2, it is not part of the counter itself. Its
programmed contents are used to control the
operations of the counters.


The Status Register, when latched, contains the
current contents of the Control Word Register and
status of the output and Null Count Flag (see Read
Back Command).


The Counting Element (CE) is the actual counter. It 
is a 16-bit presettable synchronous down counter. 


The Output Latches (OL) contain two 8-bit latches 
(OLM and OLL). Normally, these latches ‘follow’ the 
content of the CE. OLM contains the most signifi- 
cant byte of the counter and OLL contains the least 
significant byte. If the Counter Latch Command is 
sent to the counter, OL will latch the present count 
until read by the 80386 and then return to follow the 
CE. One latch at a time is enabled by the timer’s
Control Logic to drive the internal bus. This is how
the 16-bit Counter communicates over the 8-bit
internal bus. Note that CE cannot be read. Whenever
the count is read, it is one of the OL’s that is being
read.


When a new count is written into the counter, the
value will be stored in the Count Registers (CR), and
transferred to CE. The transferring of the contents
from CR’s to CE is defined as ‘loading’ of the
counter. The Count Register contains two 8-bit registers:
CRM (which contains the most significant byte) and
CRL (which contains the least significant byte).
Similar to the OL’s, the Control Logic allows one register
at a time to be loaded from the 8-bit internal bus.
However, both bytes are transferred from the CR’s
to the CE simultaneously. Both CR’s are cleared
when the Counter is programmed. This way, if the
Counter has been programmed for one byte count
(either the most significant or the least significant
byte only), the other byte will be zero. Note that CE
cannot be written into directly. Whenever a count is
written, it is the CR that is being written.
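
To illustrate how a 16-bit count travels over the 8-bit internal bus as two bytes, here is a small sketch. The struct and function names are hypothetical; the only facts taken from the text are that CRL/OLL hold the least significant byte and CRM/OLM the most significant byte.

```c
#include <stdint.h>

/* The two bytes that would be written to CRL and CRM (or read from OLL and OLM). */
struct count_bytes {
    uint8_t lsb;  /* least significant byte (CRL / OLL) */
    uint8_t msb;  /* most significant byte (CRM / OLM) */
};

/* Split a 16-bit count into the two bytes sent over the 8-bit bus. */
struct count_bytes split_count(uint16_t count)
{
    struct count_bytes b;
    b.lsb = (uint8_t)(count & 0xFFu);
    b.msb = (uint8_t)(count >> 8);
    return b;
}

/* Rebuild a 16-bit count from two bytes read one at a time. */
uint16_t join_count(uint8_t lsb, uint8_t msb)
{
    return (uint16_t)(((uint16_t)msb << 8) | lsb);
}
```

For a one-byte count, the byte that is not written stays zero because both CR's are cleared when the counter is programmed.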


As shown in the diagram, the Control Logic consists 
of three signals: CLKIN, GATE, and OUT. CLKIN 
and GATE will be discussed in detail in the section 
that follows. OUT is the internal output of the coun- 
ter. The external outputs of some timers (TOUT) are 
the inverted version of OUT (see TOUT1, TOUT2#, 
TOUT3#). The state of OUT depends on the mode
of operation of the timer.


5.2 Interface Signals


5.2.1 CLKIN 


CLKIN is an input signal used by all four timers for 
internal timing reference. This signal can be inde- 
pendent of the 82380 system clock, CLK2. In the 
following discussion, each ‘CLK Pulse’ is defined as 
the time period between a rising edge and a falling 
edge, in that order, of CLKIN. 


During the rising edge of CLKIN, the state of GATE 
is sampled. All new counts are loaded and counters 
are decremented on the falling edge of CLKIN. 


Please note that there are restrictions on the CLKIN 
signal during WRITE cycles to the 82380 timer unit. 
Refer to the appendix of this data manual for details 
on this issue. 


5.2.2 TOUT1, TOUT2#, TOUT3# 


TOUT1, TOUT2# and TOUT3# are the external 
output signals of Timer 1, Timer 2 and Timer 3, re- 
spectively. TOUT2# and TOUT3# are the inverted 
signals of their respective counter outputs, OUT. 
There is no external output for Timer 0. 


If Timer 2 is to be used as a tone generator of a 
speaker, external buffering must be used to provide 
sufficient drive capability. 


The outputs of Timer 2 and 3 are dual function pins.
The output pin of Timer 2 (TOUT2#/IRQ3#), which
is a bidirectional open-collector signal, can also be
used as an interrupt request input. When the interrupt
function is enabled (through the Programmable
Interrupt Controller), a LOW on this input will generate
an Interrupt Request 3# to the 82380 Programmable
Interrupt Controller. This pin has a weak internal
pull-up resistor. To use the IRQ3# function, Timer 2
should be programmed so that OUT2 is LOW.
Additionally, OUT3 of Timer 3 is connected to an edge
detector which will generate an Interrupt Request 0
(IRQ0) to the 82380 after the rising edge of OUT3
(see Figure 5-1).


5.2.3 GATE 


GATE is not an externally controllable signal. Rath- 
er, it can be software controlled with the Internal 
Control Port. The state of GATE is always sampled 
on the rising edge of CLKIN. Depending on the 
mode of operation, GATE is used to enable/disable 
counting or trigger the start of an operation. 


For Timer 0 and 1, GATE is always enabled (HIGH).
For Timer 2 and 3, GATE is connected to Bit 0 and
6, respectively, of an Internal Control Port (at
address 61H) of the 82380. After a hardware reset, the
state of GATE of Timer 2 and 3 is disabled (LOW).
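
A minimal sketch of computing the Internal Control Port value that sets the GATE inputs of Timer 2 and Timer 3. The port address (61H) and bit assignments (Bit 0 and Bit 6) come from the text above; the function name is hypothetical, the other bits of the port are simply preserved, and the actual port read and write are left to the caller.

```c
#include <stdint.h>
#include <stdbool.h>

#define INTERNAL_CONTROL_PORT 0x61u  /* port address given in the text */
#define GATE_TIMER2           0x01u  /* Bit 0: GATE of Timer 2 */
#define GATE_TIMER3           0x40u  /* Bit 6: GATE of Timer 3 */

/* Return the new port value with the two GATE bits set as requested;
 * all other bits of `current` are left unchanged. */
uint8_t control_port_with_gates(uint8_t current, bool gate2_high, bool gate3_high)
{
    uint8_t v = (uint8_t)(current & ~(GATE_TIMER2 | GATE_TIMER3));
    if (gate2_high) v = (uint8_t)(v | GATE_TIMER2);
    if (gate3_high) v = (uint8_t)(v | GATE_TIMER3);
    return v;
}
```

Since both GATE signals are LOW after a hardware reset, Timer 2 and Timer 3 will not count in the gated modes until software sets these bits.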


5.3 Modes of Operation 


Each timer can be independently programmed to
operate in one of six different modes. Timers are
programmed by writing a Control Word into the
Control Word Register followed by an Initial Count (see
Programming).


The following are defined for use in describing the 
different modes of operation. 


CLK Pulse—A rising edge, then a falling edge, in
that order, of CLKIN.

Trigger—A rising edge of a timer’s GATE input.

Timer/Counter Loading—The transfer of a count
from Count Register (CR) to Counting Element (CE).


Note that Figures 5-3 through 5-8 show the logical
outputs of the timer units, OUTn. This signal polarity
does not reflect that of the TOUTn signals. See the
first paragraph of Section 5.2.2.


5.3.1 MODE 0—INTERRUPT ON TERMINAL
COUNT


Mode 0 is typically used for event counting. After the 
Control Word is written, OUT is initially LOW, and will 
remain LOW until the counter reaches zero. OUT 
then goes HIGH and remains HIGH until a new 
count or a new Mode 0 Control Word is written into 
the counter. 


In this mode, GATE = HIGH enables counting; 
GATE = LOW disables counting. However, GATE 
has no effect on OUT. 


After the Control Word and initial count are written to
a timer, the initial count will be loaded on the next
CLK pulse. This CLK pulse does not decrement the
count, so for an initial count of N, OUT does not go
HIGH until N + 1 CLK pulses after the initial count is
written.
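
The N + 1 relationship can be checked with a very small software model of Mode 0. This is a sketch, not a description of the hardware implementation: it assumes GATE stays HIGH and N is at least 1, and the function name is hypothetical.

```c
#include <stdint.h>

/* Count the CLK pulses between writing an initial count N (Mode 0,
 * GATE HIGH, N >= 1 assumed) and OUT going HIGH. */
unsigned mode0_pulses_until_out_high(uint16_t n)
{
    unsigned pulses = 1;  /* first CLK pulse loads CE from CR, no decrement */
    uint16_t ce = n;

    while (ce != 0) {     /* each later CLK pulse decrements CE by one */
        ce--;
        pulses++;
    }
    return pulses;        /* OUT goes HIGH when CE reaches zero */
}
```

For example, an initial count of 4 gives one load pulse plus four decrementing pulses, so OUT goes HIGH on the fifth pulse.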


If a new count is written to the timer, it will be loaded 
on the next CLK pulse and counting will continue 
from the new count. If a two-byte count is written, 
the following happens: 


1. Writing the first byte disables counting, OUT is set 
LOW immediately (i.e., no CLK pulse required). 


2. Writing the second byte allows the new count to 
be loaded on the next CLK pulse. 


This allows the counting sequence to be synchroniz- 
ed by software. Again, OUT does not go HIGH until 
N + 1 CLK pulses after the new count of N is writ- 
ten. 


If an initial count is written while GATE is LOW, the 
counter will be loaded on the next CLK pulse. When 
GATE goes HIGH, OUT will go HIGH N CLK pulses 
later; no CLK pulse is needed to load the counter as 
this has already been done. 


5.3.2 MODE 1—GATE RETRIGGERABLE 
ONE-SHOT 


In this mode, OUT will be initially HIGH. OUT will go
LOW on the CLK pulse following a trigger to start the
one-shot operation. The OUT signal will then remain
LOW until the timer reaches zero. At this point, OUT
will go HIGH and stay HIGH until the next trigger comes in.
Because the GATE signals of Timer 0 and 1 are
internally held HIGH, no rising edge of GATE, and
therefore no trigger, can be produced for these two timers.


After writing the Control Word and initial count, the 
timer is considered ‘armed’. A trigger results in load- 
ing the timer and setting OUT LOW on the next CLK 
pulse. Therefore, an initial count of N will result in a 
one-shot pulse width of N CLK cycles. Note that this 
one-shot operation is retriggerable; i.e., OUT will re- 
main LOW for N CLK pulses after every trigger. The 
one-shot operation can be repeated without rewrit- 
ing the same count into the timer. 


If a new count is written to the timer during a one- 
shot operation, the current one-shot pulse width will 
not be affected until the timer is retriggered. This is 
because loading of the new count to CE will occur
only when the one-shot is triggered.


NOTES:
The following conventions apply to all mode timing diagrams.
1. Counters are programmed for binary (not BCD) counting and for reading/writing least significant byte (LSB) only.
2. The counter is always selected (CS always low).
3. CW stands for “Control Word”; CW = 10 means a control word of 10 Hex is written to the counter.
4. LSB stands for “least significant byte” of count.
5. Numbers below diagrams are count values.
   The lower number is the least significant byte.
   The upper number is the most significant byte. Since the counter is programmed to read/write LSB only, the
   most significant byte cannot be read.
N stands for an undefined count.
Vertical lines show transitions between count values.


Figure 5-3. Mode 0


Figure 5-4. Mode 1


5.3.3 MODE 2—RATE GENERATOR 


This mode is a divide-by-N counter. It is typically
used to generate a Real Time Clock interrupt. OUT
will initially be HIGH. When the initial count has
decremented to 1, OUT goes LOW for one CLK pulse,
then OUT goes HIGH again. Then the timer reloads
the initial count and the process is repeated. In other
words, this mode is periodic since the same
sequence repeats indefinitely. For an initial
count of N, the sequence repeats every N CLK
cycles.


Similar to Mode 0, GATE = HIGH enables counting, 
where GATE = LOW disables counting. If GATE 
goes LOW during an output pulse (LOW), OUT is set 
HIGH immediately. A trigger (rising edge on GATE) 
will reload the timer with the initial count on the next 
CLK pulse. Then, OUT will go LOW (for one CLK 
pulse) N CLK pulses after the new trigger. Thus, 
GATE can be used to synchronize the timer.


NOTE:
A GATE transition should not occur one clock prior to terminal count.


Figure 5-5. Mode 2


After writing a Control Word and initial count, the 
timer will be loaded on the next CLK pulse. OUT 
goes LOW (for the CLK pulse) N CLK pulses after 
the initial count is written. This is another way the 
timer may be synchronized by software. 


Writing a new count while counting does not affect
the current counting sequence because the new
count will not be loaded until the end of the current
counting cycle. If a trigger is received after writing a
new count but before the end of the current period,
the timer will be loaded with the new count on the
next CLK pulse after the trigger, and counting will
continue with the new count.


5.3.4 MODE 3—SQUARE WAVE GENERATOR


Mode 3 is typically used for Baud Rate generation.
Functionally, this mode is similar to Mode 2 except
for the duty cycle of OUT. In this mode, OUT will be
initially HIGH. When half of the initial count has
expired, OUT goes LOW for the remainder of the count.
The counting sequence will be repeated, thus this
mode is also periodic. Note that an initial count of N
results in a square wave with a period of N CLK
pulses.


The GATE input can be used to synchronize the tim- 
er. GATE = HIGH enables counting; GATE = LOW 
disables counting. If GATE goes LOW while OUT is 
LOW, OUT is set HIGH immediately (i.e., no CLK 
pulse is required). A trigger reloads the timer with the 
initial count on the next CLK pulse. 


After writing a Control Word and initial count, the 
timer will be loaded on the next CLK pulse. This al- 
lows the timer to be synchronized by software. 


Writing a new count while counting does not affect
the current counting sequence. If a trigger is
received after writing a new count but before the end
of the current half-cycle of the square wave, the
timer will be loaded with the new count on the next CLK
pulse and counting will continue from the new count.
Otherwise, the new count will be loaded at the end
of the current half-cycle.


There is a slight difference in operation depending 
on whether the initial count is EVEN or ODD. The 
following description is to show exactly how this 
mode is implemented. 


EVEN COUNTS: 


OUT is initially HIGH. The initial count is loaded on 
one CLK pulse and is decremented by two on suc- 
ceeding CLK pulses. When the count expires (decre- 
mented to 2), OUT changes to LOW and the timer is 
reloaded with the initial count. The above process is 
repeated indefinitely. 


ODD COUNTS: 


OUT is initially HIGH. The initial count minus one
(which is an even number) is loaded on one CLK
pulse and is decremented by two on succeeding
CLK pulses. One CLK pulse after the count expires
(decremented to 2), OUT goes LOW and the timer is
loaded with the initial count minus one again.
Succeeding CLK pulses decrement the count by two.
When the count expires, OUT goes HIGH immediately
and the timer is reloaded with the initial count
minus one. The above process is repeated indefinitely.
So for ODD counts, OUT will be HIGH for (N + 1)/2
counts and LOW for (N - 1)/2 counts.


A GATE transition should not occur one clock prior to terminal count.


Figure 5-6. Mode 3
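

The HIGH and LOW durations stated for even and odd counts in Mode 3 can be checked with a few lines of arithmetic. The helper below is a hypothetical illustration (the name mode3_duty is not from this data sheet); it only restates the N/2 and (N + 1)/2, (N - 1)/2 rules given in the text.

```python
def mode3_duty(n):
    """Return (high_clks, low_clks) for one Mode 3 period of N CLK pulses."""
    if n % 2 == 0:
        # Even count: OUT is HIGH for half the period and LOW for the other half.
        return n // 2, n // 2
    # Odd count: OUT is HIGH for one CLK pulse longer than it is LOW.
    return (n + 1) // 2, (n - 1) // 2


print(mode3_duty(4))  # (2, 2)
print(mode3_duty(5))  # (3, 2)
```

In both cases the two durations add up to N, so the period of the square wave is N CLK pulses as stated.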


5.3.5 MODE 4—INITIAL COUNT TRIGGERED
STROBE


This mode allows a strobe pulse to be generated by
writing an initial count to the timer. Initially, OUT will
be HIGH. When a new initial count is written into the
timer, the counting sequence will begin. When the
initial count expires (decremented to 1), OUT will go
LOW for one CLK pulse and then go HIGH again.


Again, GATE = HIGH enables counting while GATE
= LOW disables counting. GATE has no effect on
OUT.


After writing the Control Word and initial count, the
timer will be loaded on the next CLK pulse. This CLK
pulse does not decrement the count, so for an initial
count of N, OUT does not strobe LOW until N + 1
CLK pulses after the initial count is written.


If a new count is written during counting, it will be
loaded on the next CLK pulse and counting will
continue from the new count.


Figure 5-7. Mode 4


If a two-byte count is written, the following will occur:
1. Writing the first byte has no effect on counting. 


2. Writing the second byte allows the new count to 
be loaded on the next CLK pulse. 


OUT will strobe LOW N + 1 CLK pulses after the
new count of N is written. Therefore, the time at
which the strobe pulse occurs depends on the
value of the initial count loaded.
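

The N + 1 relationship can be pictured with a simplified model. This is a sketch under stated assumptions: GATE is held HIGH, no further count is written, and CLK pulse 1 is the pulse that loads the count. The function name mode4_out is hypothetical.

```python
def mode4_out(n, clocks):
    """Return the OUT level for each CLK pulse after a count of N is written.

    Simplified model: GATE is HIGH and no new count is written meanwhile.
    """
    levels = []
    for clk in range(1, clocks + 1):
        # CLK pulse 1 loads the count without decrementing it, so the
        # one-pulse LOW strobe falls on CLK pulse N + 1.
        levels.append(0 if clk == n + 1 else 1)
    return levels


print(mode4_out(3, 6))  # [1, 1, 1, 0, 1, 1]
```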


5.3.6 MODE 5—GATE RETRIGGERABLE 
STROBE 


Mode 5 is very similar to Mode 4 except the count
sequence is triggered by the GATE signal instead of
by writing an initial count. Initially, OUT will be HIGH.
Counting is triggered by a rising edge of GATE.
When the initial count has expired (decremented to
1), OUT will go LOW for one CLK pulse and then go
HIGH again.


After loading the Control Word and initial count, the 
Count Element will not be loaded until the CLK pulse 
after a trigger. This CLK pulse does not decrement 
the count. Therefore, for an initial count of N, OUT 
does not strobe LOW until N + 1 CLK pulses after a 
trigger. 


Figure 5-8. Mode 5


The counting sequence is retriggerable. Every trigger
will result in the timer being loaded with the initial
count on the next CLK pulse.


If the new count is written during counting, the current
counting sequence will not be affected. If a trigger
occurs after the new count is written but before
the current count expires, the timer will be loaded
with the new count on the next CLK pulse and a new
count sequence will start from there.


5.3.7 OPERATION COMMON TO ALL MODES


5.3.7.1 GATE


The GATE input is always sampled on the rising
edge of CLKIN. In Modes 0, 2, 3 and 4, the GATE
input is level sensitive. The logic level is sampled on
the rising edge of CLKIN. In Modes 1, 2, 3 and 5, the
GATE input is rising edge sensitive. In these modes,
a rising edge of GATE (trigger) sets an edge sensitive
flip-flop in the timer. The flip-flop is reset immediately
after it is sampled. This way, a trigger will be
detected no matter when it occurs; i.e., a HIGH logic
level does not have to be maintained until the next
rising edge of CLKIN. Note that in Modes 2 and 3,
the GATE input is both edge and level sensitive.


SUMMARY OF GATE OPERATIONS


Mode  GATE LOW or Going LOW   GATE Rising           GATE HIGH

0     Disable Count           No Effect             Enable Count

1     No Effect               1. Initiate Count     No Effect
                              2. Reset Output
                                 After Next Clock

2     1. Disable Count        Initiate Count        Enable Count
      2. Sets Output HIGH
         Immediately

3     1. Disable Count        Initiate Count        Enable Count
      2. Sets Output HIGH
         Immediately

4     Disable Count           No Effect             Enable Count

5     No Effect               Initiate Count        No Effect


5.3.7.2 Counter


New counts are loaded and counters are decremented
on the falling edge of CLKIN. The largest
possible initial count is 0. This is equivalent to 2**16
for binary counting and 10**4 for BCD counting.


Note that the counter does not stop when it reaches
zero. In Modes 0, 1, 4, and 5, the counter 'wraps
around' to the highest count: either FFFF Hex for
binary counting or 9999 for BCD counting, and
continues counting. Modes 2 and 3 are periodic. The
counter reloads itself with the initial count and
continues counting from there.
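

The rule that an initial count of 0 stands for the largest count can be expressed as a short helper. This is an illustrative sketch; the names effective_count and to_bcd are hypothetical, and the BCD value is assumed to be written as four packed decimal digits.

```python
def effective_count(initial, bcd=False):
    """Return the number of CLK pulses represented by an initial count."""
    limit = 10**4 if bcd else 2**16
    if not 0 <= initial < limit:
        raise ValueError("initial count out of range")
    # A written count of 0 is the largest possible count.
    return limit if initial == 0 else initial


def to_bcd(value):
    """Encode a decimal value 0..9999 as four packed BCD digits."""
    if not 0 <= value <= 9999:
        raise ValueError("BCD counts are limited to four decades")
    result = 0
    for shift, digit in enumerate(reversed(f"{value:04d}")):
        result |= int(digit) << (4 * shift)
    return result


print(effective_count(0))            # 65536
print(effective_count(0, bcd=True))  # 10000
print(hex(to_bcd(1234)))             # 0x1234
```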


The minimum and maximum initial count in each 
counter depends on the mode of operation. They 
are summarized below. 


5.4 Register Set Overview 


The Programmable Interval Timer module of the 
82380 contains a set of six registers. The port ad- 
dress map of these registers is shown in Table 5-2. 


Table 5-2. Timer Register Port Address Map 


40H  Counter 0 Register (read/write)
41H  Counter 1 Register (read/write)
42H  Counter 2 Register (read/write)
43H  Control Word Register I (Counter 0, 1 & 2) (write-only)
44H  Counter 3 Register (read/write)
45H  Reserved
46H  Reserved
47H  Control Word Register II (Counter 3) (write-only)


5.4.1 COUNTER 0, 1, 2, 3 REGISTERS


These four 8-bit registers are functionally identical. 
They are used to write the initial count value into the 
respective timer. Also, they can be used to read the 
latched count value of a timer. Since they are 8-bit 
registers, reading and writing of the 16-bit initial 
count must follow the count format specified in the 
Control Word Registers; i.e., least significant byte 
only, most significant byte only, or least significant 
byte then most significant byte (see Programming). 


5.4.2 CONTROL WORD REGISTER I & II


There are two Control Word Registers associated
with the Timer section. One of the two registers
(Control Word Register I) is used to control the
operations of Counters 0, 1, and 2 and the other (Control
Word Register II) is for Counter 3. The major
functions of both Control Word Registers are listed
below:


— Select the timer to be programmed. 


— Define which mode the selected timer is to oper- 
ate in. 


— Define the count sequence; i.e., if the selected 
timer is to count as a Binary Counter or a Binary 
Coded Decimal (BCD) Counter. 


— Select the byte access sequence during timer 
read/write operations; i.e., least significant byte 
only, most significant byte only, or least signifi- 
cant byte first, then most significant byte. 


Also, the Control Word Registers can be pro- 
grammed to perform a Counter Latch Command or a 
Read Back Command which will be described later. 
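

As an illustration of how these functions combine into one byte, the sketch below packs the four fields into a Control Word. The field encodings are those listed under Register Bit Definitions; the bit positions (select in D7-D6, read/write in D5-D4, mode in D3-D1, BCD in D0) are an assumption based on the conventional 8254-style layout and should be checked against the register figure. The function name control_word is hypothetical.

```python
# Read/write field encodings as listed under Register Bit Definitions.
READ_WRITE = {"lsb": 0b01, "msb": 0b10, "lsb_msb": 0b11}


def control_word(select, rw, mode, bcd=False):
    """Pack a Control Word byte (assumed 8254-style bit positions).

    select: 0..2 for Control Word Register I; use 0 for Counter 3 on
            Control Word Register II.
    rw:     "lsb", "msb" or "lsb_msb".
    mode:   0..5.
    """
    if select not in (0, 1, 2):
        raise ValueError("select must be 0, 1 or 2")
    if mode not in range(6):
        raise ValueError("mode must be 0..5")
    return (select << 6) | (READ_WRITE[rw] << 4) | (mode << 1) | int(bcd)


print(hex(control_word(0, "lsb_msb", 2)))  # 0x34
print(hex(control_word(2, "lsb_msb", 3)))  # 0xb6
```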


5.5 Programming 


5.5.1 INITIALIZATION 


Upon power-up or reset, the state of all timers is 
undefined. The mode, count value, and output of all 
timers are random. From this point on, how each 
timer operates is determined solely by how it is pro- 
grammed. Each timer must be programmed before it 
can be used. Since the outputs of some timers can 
generate interrupt signals to the 82380, all timers 
should be initialized to a known state. 


Timers are programmed by writing a Control Word
into their respective Control Word Registers. Then,
an Initial Count can be written into the corresponding
Count Register. In general, the programming
procedure is very flexible. Only two conventions need to
be remembered:


1. For each timer, the Control Word must be written 
before the initial count is written. 


2. The 16-bit initial count must follow the count
format specified in the Control Word (least significant
byte only, most significant byte only, or least
significant byte first, followed by most significant
byte).


Since the two Control Word Registers and the four 
Counter Registers have separate addresses, and 
each timer can be individually selected by the appro- 
priate Control Word Register, no special instruction 
sequence is required. Any programming sequence 
that follows the conventions above is acceptable. 
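

The two conventions can be sketched as an ordered list of port writes. This is a hypothetical illustration, not driver code: it only builds the (port, byte) sequence for a two-byte count using the addresses of Table 5-2, and the Control Word value is passed in by the caller. The example value 0x34 assumes an 8254-style Control Word layout (Counter 0, least significant byte then most significant byte, Mode 2, binary).

```python
COUNTER_PORT = {0: 0x40, 1: 0x41, 2: 0x42, 3: 0x44}  # from Table 5-2


def control_port(timer):
    """Control Word Register I serves Counters 0-2; Register II serves Counter 3."""
    return 0x47 if timer == 3 else 0x43


def program_timer(timer, control, count):
    """Return the ordered (port, byte) writes for a two-byte initial count."""
    port = COUNTER_PORT[timer]
    return [
        (control_port(timer), control),  # 1. Control Word first
        (port, count & 0xFF),            # 2. least significant byte
        (port, (count >> 8) & 0xFF),     # 3. most significant byte
    ]


for port, value in program_timer(0, 0x34, 1000):
    print(f"write {value:02X}H to port {port:02X}H")
```

The two count bytes are kept adjacent in the list because, as noted below, no other routine may write to the same timer between them.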


A new initial count may be written to a timer at any 
time without affecting the timer’s programmed mode 
in any way. Count sequence will be affected as de- 
scribed in the Modes of Operation section. Note that 
the new count must follow the programmed count 
format. 


If a timer was previously programmed to read/write
two-byte counts, the following precaution applies. A
program must not transfer control between writing
the first and second byte to another routine which
also writes into the same timer. Otherwise, the
read/write will result in an incorrect count.


Whenever a Control Word is written to a timer, all 
control logic for that timer(s) is immediately reset 
(i.e., no CLK pulse is required). Also, the corre- 
sponding output pin, TOUT(#), goes to a known ini- 
tial state. 


5.5.2 READ OPERATION 


Three methods are available to read the current
count as well as the status of each timer. They are:
Read Counter Registers, Counter Latch Command
and Read Back Command. Following is a description
of these methods.


READ COUNTER REGISTERS 


The current count of a timer can be read by performing
a read operation on the corresponding Counter
Register. The only restriction of this read operation
is that the CLKIN of the timers must be inhibited by
using external logic. Otherwise, the count may be in
the process of changing when it is read, giving an
undefined result. Note that since all four timers
share the same CLKIN signal, inhibiting CLKIN to
read a timer will unavoidably disable the other timers
also. This may prove to be impractical. Therefore, it
is suggested that either the Counter Latch Command
or the Read Back Command be used to read
the current count of a timer.


Another alternative is to temporarily disable a timer 
before reading its Counter Register by using the 
GATE input. Depending on the mode of operation, 
GATE = LOW will disable the counting operation. 
However, this option is available on Timer 2 and 3 
only, since the GATE signals of the other two timers 
are internally enabled all the time. 


COUNTER LATCH COMMAND 


A Counter Latch Command will be executed when- 
ever a special Control Word is written into a Control 
Word Register. Two bits written into the Control 
Word Register distinguish this command from a ‘reg- 
ular’ Control Word (see Register Bit Definition). Also, 
two other bits in the Control Word will select which 
counter is to be latched. 


Upon execution of this command, the selected
counter's Output Latch (OL) latches the count at the
time the Counter Latch Command is received. This
count is held in the latch until it is read by the 80386,
or until the timer is reprogrammed. The count is then
unlatched automatically and the OL returns to
'following' the Counting Element (CE). This allows
reading the contents of the counters 'on the fly' without
affecting counting in progress. Multiple Counter
Latch Commands may be used to latch more than
one counter. Each latched count is held until it is
read. Counter Latch Commands do not affect the
programmed mode of the timer in any way.


If a counter is latched, and at some time later, it is 
latched again before the prior latched count is read, 
the second Counter Latch Command is ignored. The 
count read will then be the count at the time the first 
command was issued. 


In any event, the latched count must be read
according to the programmed format. Specifically, if
the timer is programmed for two-byte counts, two
bytes must be read. However, the two bytes do not
have to be read one right after the other. Read/write or
programming operations of other timers may be
performed between them.


Another feature of this Counter Latch Command is
that read and write operations of the same timer
may be interleaved. For example, if the timer is
programmed for two-byte counts, the following
sequence is valid.


1. Read least significant byte. 
2. Write new least significant byte. 


3. Read most significant byte. 


4. Write new most significant byte. 


If a timer is programmed to read/write two-byte
counts, the following precaution applies. A program
must not transfer control between reading the first
and second byte to another routine which also reads
from that same timer. Otherwise, an incorrect count
will be read.
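

A Counter Latch Command can be sketched as follows. The helper names are hypothetical, and the bit positions (counter select in D7-D6, read/write field 00 in D5-D4) are an assumption based on the conventional 8254-style layout; the port addresses come from Table 5-2.

```python
def latch_command(timer):
    """Return (port, byte) for a Counter Latch Command on timer 0..3."""
    if timer not in (0, 1, 2, 3):
        raise ValueError("timer must be 0..3")
    port = 0x47 if timer == 3 else 0x43   # Control Word Register II or I
    select = 0 if timer == 3 else timer   # Counter 3 uses select code 00
    # The read/write field is left at 00, which marks a latch command.
    return port, select << 6


def combine_bytes(lsb, msb):
    """Assemble the 16-bit count from the two bytes read back, LSB first."""
    return (msb << 8) | lsb


print(latch_command(2))                 # (67, 128)
print(latch_command(3))                 # (71, 0)
print(hex(combine_bytes(0xE8, 0x03)))   # 0x3e8
```

After the command is written, the latched count is read from the timer's Counter Register in the programmed byte order.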


READ BACK COMMAND 


The Read Back Command is another special Com- 
mand Word operation which allows the user to read 
the current count value and/or the status of the se- 
lected timer(s). Like the Counter Latch Command, 
two bits in the Command Word identify this as a 
Read Back Command (see Register Bit Definition). 


The Read Back Command may be used to latch
multiple counter Output Latches (OL's) by selecting
more than one timer within a Command Word. This
single command is functionally equivalent to several
Counter Latch Commands, one for each counter to
be latched. Each counter's latched count will be
held until it is read by the 80386 or until the timer is
reprogrammed. The counter is automatically
unlatched when read, but other counters remain
latched until they are read. If multiple Read Back
commands are issued to the same timer without
reading the count, all but the first are ignored; i.e.,
the count read will correspond to the very first Read
Back Command issued.


As mentioned previously, the Read Back Command
may also be used to latch status information of the
selected timer(s). When this function is enabled, the
status of a timer can be read from the Counter
Register after the Read Back Command is issued. The
status information of a timer includes the following:


1. Mode of timer:


This allows the user to check the mode of operation
of the timer last programmed.


2. State of TOUT pin of the timer:


This allows the user to monitor the counter's output
pin via software, possibly eliminating some
hardware from a system.


3. Null Count/Count available:


The Null Count Bit in the status byte indicates if
the last count written to the Count Register (CR)
has been loaded into the Counting Element (CE).
The exact time this happens depends on the
mode of the timer and is described in the
Programming section. Until the count is loaded into
the Counting Element (CE), it cannot be read from
the timer. If the count is latched or read before
this occurs, the count value will not reflect the
new count just written.


If multiple status latch operations of the timer(s) are 
performed without reading the status, all but the first 
command are ignored; i.e., the status read in will 
correspond to the first Read Back Command issued. 


Both the current count and status of the selected
timer(s) may be latched simultaneously by enabling
both functions in a single Read Back Command.
This is functionally the same as issuing two separate
Read Back Commands at once. Once again, if multiple
read commands are issued to latch both the
count and status of a timer, all but the first command
will be ignored.


If both count and status of a timer are latched, the
first read operation of that timer will return the
latched status, regardless of which was latched first.
The next one or two (if two count bytes are to be
read) read operations return the latched count. Note
that subsequent read operations on the Counter
Register will return the unlatched count (as in the first
read method discussed).
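

The Read Back Command and the status byte can be sketched in the same way. The names read_back_command and decode_status are hypothetical. The bit positions are assumptions based on the conventional 8254-style layout (command: D7 D6 = 1 1, D5 = COUNT#, D4 = STATUS#, D3-D1 = CNT2, CNT1, CNT0/3; status: D7 = output pin, D6 = null count, D3-D1 = mode) and should be verified against the Register Bit Definitions.

```python
def read_back_command(selectors, latch_count=True, latch_status=False):
    """Build a Read Back Command byte (assumed 8254-style layout).

    selectors: iterable of 0, 1, 2 for the CNT0/3, CNT1 and CNT2 select bits.
    """
    word = 0b1100_0000              # D7 D6 = 1 1 marks a Read Back Command
    if not latch_count:
        word |= 1 << 5              # 1 = do not latch count
    if not latch_status:
        word |= 1 << 4              # 1 = do not latch status
    for sel in selectors:
        if sel not in (0, 1, 2):
            raise ValueError("selector must be 0, 1 or 2")
        word |= 1 << (sel + 1)      # 1 = counter is selected
    return word


def decode_status(status):
    """Split a latched status byte into (out_pin, null_count, mode)."""
    out_pin = (status >> 7) & 1
    null_count = (status >> 6) & 1
    mode = (status >> 1) & 0b111
    if mode >= 6:                   # X10 and X11 also encode Modes 2 and 3
        mode -= 4
    return out_pin, null_count, mode


print(hex(read_back_command([0, 2])))                  # 0xda
print(hex(read_back_command([0], latch_status=True)))  # 0xc2
print(decode_status(0xB4))                             # (1, 0, 2)
```

When both count and status are latched, the first read of the Counter Register returns the byte that decode_status expects, and the following one or two reads return the count.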


5.6 Register Bit Definitions 


COUNTER 0, 1, 2, 3 REGISTER (READ/WRITE) 


40H  Counter 0 Register (read/write)
41H  Counter 1 Register (read/write)
42H  Counter 2 Register (read/write)
44H  Counter 3 Register (read/write)
45H  Reserved
46H  Reserved


D7 D6 D5 D4 D3 D2 D1 D0
D7: MSB OF COUNT BYTE
D0: LSB OF COUNT BYTE


Note that these 8-bit registers are for writing and
reading of one byte of the 16-bit count value, either
the most significant or the least significant byte.


CONTROL WORD REGISTER I & II (WRITE-ONLY)


43H  Control Word Register I (Counter 0, 1, 2) (write-only)
47H  Control Word Register II (Counter 3) (write-only)


CONTROL WORD REGISTER I


SELECT COUNTER:
00 SELECT COUNTER 0
01 SELECT COUNTER 1
10 SELECT COUNTER 2
11 READ BACK COMMAND FOR COUNTER 0-2


READ/WRITE:
00 COUNTER LATCH COMMAND
01 READ/WRITE LSB BYTE ONLY
10 READ/WRITE MSB BYTE ONLY
11 READ/WRITE LSB, THEN MSB BYTE


MODE:
000 MODE 0  Interrupt on Terminal Count
001 MODE 1  Gate Retriggerable One Shot
X10 MODE 2  Rate Generator
X11 MODE 3  Square Wave Generator
100 MODE 4  Initial Count Triggered Strobe
101 MODE 5  Gate Retriggerable Strobe


COUNT SEQUENCE:
0 = 16-BIT BINARY COUNTER
1 = BCD COUNTER (4 DECADES)


[Table not recoverable from the source: GATE trigger
type (edge or level) for Timers 0-3.]
Marked entries = Must use Port 61 to generate the edge.
NA = Not Applicable


CONTROL WORD REGISTER II


SELECT COUNTER:
00 SELECT COUNTER 3
01 RESERVED
10 RESERVED
11 READ BACK COMMAND FOR COUNTER 3


READ/WRITE:
00 COUNTER LATCH COMMAND
01 READ/WRITE LSB BYTE ONLY
10 READ/WRITE MSB BYTE ONLY
11 READ/WRITE LSB, THEN MSB BYTE


MODE:
000 MODE 0
001 MODE 1
X10 MODE 2
X11 MODE 3
100 MODE 4
101 MODE 5


COUNT SEQUENCE:
0 = 16-BIT BINARY COUNTER
1 = BCD COUNTER (4 DECADES)


COUNTER LATCH COMMAND FORMAT
(Write to Control Word Register)


SELECT COUNTER:
00 COUNTER 0 (OR 3)
01 COUNTER 1
10 COUNTER 2
11 READ BACK COMMAND


READ BACK COMMAND FORMAT
(Write to Control Word Register)


D7   D6   D5       D4        D3     D2     D1       D0
1    1    COUNT#   STATUS#   CNT2   CNT1   CNT0/3   0


COUNT#:
0 = LATCH COUNT
1 = DO NOT LATCH COUNT


STATUS#:
0 = LATCH STATUS
1 = DO NOT LATCH STATUS


CNT2, CNT1, CNT0/3:
0 = COUNTER NOT SELECTED
1 = COUNTER IS SELECTED

STATUS FORMAT
(Returned from Read Back Command)


OUTPUT:
0 = OUTPUT PIN = 0
1 = OUTPUT PIN = 1


NULL COUNT:
0 = COUNT AVAILABLE FOR READING
1 = NULL COUNT


COUNTER MODE


6.0 WAIT STATE GENERATOR


6.1 Functional Description


The 82380 contains a programmable Wait State
Generator which can generate a pre-programmed
number of wait states during both CPU and DMA
initiated bus cycles. This Wait State Generator is
capable of generating 1 to 16 wait states in non-pipelined
mode, and 0 to 15 wait states in pipelined
mode. Depending on the bus cycle type and the two
Wait State Control inputs (WSC 0-1), a pre-programmed
number of wait states in the selected Wait
State Register will be generated.


The Wait State Generator can also be disabled to
allow the use of devices capable of generating their
own READY# signals. Figure 6-1 is a block diagram
of the Wait State Generator.


6.2 Interface Signals


The following describes the interface signals which
affect the operation of the Wait State Generator.
The READY#, WSC0 and WSC1 signals are inputs.
READYO# is the ready output signal to the host
processor.


6.2.1 READY#


READY# is an active LOW input signal which indicates
to the 82380 the completion of a bus cycle. In
the Master mode (e.g., 82380 initiated DMA transfer),
this signal is monitored to determine whether a
peripheral or memory needs wait states inserted in
the current bus cycle. In the Slave mode, it is used
(together with the ADS# signal) to trace CPU bus
cycles to determine if the current cycle is pipelined.


6.2.2 READYO#


READYO# (Ready Out#) is an active LOW output
signal and is the output of the Wait State Generator.
The number of wait states generated depends on
the WSC(0-1) inputs. Note that special cases are
handled for access to the 82380 internal registers
and for the Refresh cycles. For 82380 internal register
access, READYO# will be delayed to take into
account the command recovery time of the register.
One or more wait states will be generated in a
pipelined cycle. During refresh, the number of wait states
will be determined by the preprogrammed value in
the Refresh Wait State Register.


In the simplest configuration, READYO# can be
connected to the READY# input of the 82380 and
the 80386 CPU. This is, however, not always the
case. If external circuitry is to control the READY#
inputs as well, additional logic will be required (see
Application Issues).


Figure 6-1. Wait State Generator Block Diagram


6.2.3 WSC(0-1) 
These two Wait State Control inputs select one of
the three pre-programmed 8-bit Wait State Registers
which determines the number of wait states to be
generated. The most significant half of the three
Wait State Registers corresponds to memory accesses,
the least significant half to I/O accesses.
The combination WSC(0-1) = 11 disables the Wait
State Generator.


Figure 6-2. Wait States in Non-Pipelined Cycles


6.3 Bus Function 


6.3.1 WAIT STATES IN NON-PIPELINED CYCLE 


The timing diagram of two typical non-pipelined cycles
with 82380 generated wait states is shown in
Figure 6-2. In this diagram, it is assumed that the
internal registers of the 82380 are not addressed.
During the first T2 state of each bus cycle, the Wait
State Control and the M/IO# inputs are sampled to
determine which Wait State Register (if any) is
selected. If the WSC inputs are active (i.e., not both are
driven HIGH), the pre-programmed number of wait
states corresponding to the selected Wait State
Register will be requested. This is done by driving
the READYO# output HIGH during the end of each
T2 state.


The WSC(0-1) inputs need only be valid during the
very first T2 state of each non-pipelined cycle. As a
general rule, the WSC inputs are sampled on the
rising edge of the next clock (82384 CLK) after the
last state when ADS# (Address Status) is asserted.


The number of wait states generated depends on 
the type of bus cycle, and the number of wait states 
requested. The various combinations are discussed 
below. 


1. Access the 82380 internal registers: 2 to 5 wait 
states, depending upon the specific register ad- 
dressed. Some back-to-back sequences to the In- 
terrupt Controller will require 7 wait states. 


2. Interrupt Acknowledge to the 82380: 5 wait 
states. 


3. Refresh: As programmed in the Refresh Wait
State Register (see Register Set Overview). Note
that if WSC(0-1) = 11, READYO# will stay inactive.


4. Other bus cycles: The WSC(0-1) and M/IO#
inputs select a Wait State Register. The number of
wait states will be equal to the pre-programmed
wait state count in the selected register plus 1.
The Wait State Register selection is defined in
Table 6-1.


Table 6-1. Wait State Register Selection


WAIT REG 0 (I/O half) 
WAIT REG 1 (I/O half) 
WAIT REG 2 (I/O half) 


WAIT REG 0 (MEM half) 
WAIT REG 1 (MEM half) 
WAIT REG 2 (MEM half) 
Wait State Gen. Disabled 


The Wait State Control signals, WSC(0-1), can be 
generated with the address decode and the Read/ 
Write control signals as shown in Figure 6-3. 


Figure 6-3. WSC(0-1) Generation


Note that during HALT and SHUTDOWN, the number
of wait states will depend on the WSC(0-1) inputs,
which will select the memory half of one of the
Wait State Registers (see CPU Reset and Shutdown
Detect).


6.3.2 WAIT STATES IN PIPELINED CYCLE 


The timing diagram of two typical pipelined cycles 
with 82380 generated wait states is shown in Figure 
6-4. Again, in this diagram, it is assumed that the 
82380 internal registers are not addressed. As de- 
fined in the timing of the 80386 processor, the Ad- 
dress (A 2-31), Byte Enable (BE 0-3), and other 
control signals (M/IO#, ADS#) are asserted one 
T state earlier than in a non-pipelined cycle; i.e., they 
are asserted at T2P. Similar to the non-pipelined 
case, the Wait State Control (WSC) inputs are sam- 
pled in the middle of the state after the last state 
when the ADS# signal is asserted. Therefore, the 
WSC inputs should be asserted during the T1P state 
of each pipelined cycle (which is one T state earlier 
than in the non-pipelined cycle). 


Figure 6-4. Wait State in Pipelined Cycles


The number of wait states generated in a pipelined
cycle is selected in a similar manner as in the
non-pipelined case discussed in the previous section.
The only difference here is that the actual number of
wait states generated will be one less than that of
the non-pipelined cycle. This is done automatically
by the Wait State Generator.


6.3.3 EXTENDING AND EARLY TERMINATING 
BUS CYCLE 


The 82380 allows external logic to either add wait
states or cause early termination of a bus cycle by
controlling the READY# input to the 82380 and the
host processor. A possible configuration is shown in
Figure 6-5.


The EXT. RDY# (External Ready) signal of Figure
6-5 allows external devices to cause early termination
of a bus cycle. When this signal is asserted
LOW, the output of the circuit will also go LOW
(even though the READYO# of the 82380 may still
be HIGH). This output is fed to the READY# input of
the 80386 and the 82380 to indicate the completion
of the current bus cycle.


Similarly, the EXT. NOT READY (External Not
Ready) signal is used to delay the READY# input of
the processor and the 82380. As long as this signal
is driven HIGH, the output of the circuit will drive the
READY# input HIGH. This will effectively extend the
duration of a bus cycle. However, it is important to
note that if the two-level logic is not fast enough to
satisfy the READY# setup time, the OR gate should
be eliminated. Instead, the 82380 Wait State Generator
can be disabled by driving both WSC(0-1)
HIGH. In this case, the addressed memory or I/O
device should activate the external READY# input
whenever it is ready to terminate the current bus
cycle.


Figures 6-6 and 6-7 show the timing relationships of
the ready signals for the early termination and extension
of the bus cycles. Section 6.7, Application Issues,
contains a detailed timing analysis of the external
circuit.


Figure 6-6. Early Termination of Bus Cycle By 'READY#'


Figure 6-7. Extending Bus Cycle by 'READY#'


Note that early termination of bus cycles in which
82380 internal registers are accessed is not
recommended because of the following implications:


1. Erroneous data may be read from or written into 
the addressed register. 


2. The 82380 must be allowed to recover either be- 
fore HLDA (Hold Acknowledge) is asserted or be- 
fore another bus cycle into an 82380 internal reg- 
ister is initiated. 


The recovery time, in bus periods, equals the
remaining wait states that were avoided plus 4.


6.4 Register Set Overview 


Altogether, there are four 8-bit internal registers as- 
sociated with the Wait State Generator. The port ad- 
dress map of these registers is shown below in Ta- 
ble 6-2. A detailed description of each follows. 


Table 6-2. Register Address Map


Port Address   Description
72H            Wait State Reg 0 (read/write)
73H            Wait State Reg 1 (read/write)
74H            Wait State Reg 2 (read/write)
75H            Ref. Wait State Reg (read/write)


WAIT STATE REGISTER 0, 1, 2 


These three 8-bit read/write registers are functionally
identical. They are used to store the pre-programmed
wait state count. One half of each register
contains the wait state count for I/O accesses while
the other half contains the count for memory accesses.
The total number of wait states generated
will depend on the type of bus cycle. For a non-pipelined
cycle, the actual number of wait states requested
is equal to the wait state count plus 1. For a
pipelined cycle, the number of wait states will be
equal to the wait state count in the selected register.
Therefore, the Wait State Generator is capable of
generating 1 to 16 wait states in non-pipelined
mode, and 0 to 15 wait states in pipelined mode.


Note that the minimum wait state count in each reg- 
ister is 0. This is equivalent to 0 wait states for a 
pipelined cycle and 1 wait state for a non-pipelined 
cycle. 
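The relationship between the programmed count and the wait states actually generated can be sketched as follows. The function name is hypothetical; the arithmetic is taken from the paragraph above.

```python
def generated_wait_states(count: int, pipelined: bool) -> int:
    """Wait states generated for a 4-bit programmed count (0..15).

    Non-pipelined cycles get count + 1; pipelined cycles get count.
    """
    if not 0 <= count <= 15:
        raise ValueError("wait state count is a 4-bit value (0..15)")
    return count if pipelined else count + 1
```

This reproduces the stated ranges: 1 to 16 wait states in non-pipelined mode and 0 to 15 in pipelined mode.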


REFRESH WAIT STATE REGISTER 


Similar to the Wait State Registers discussed above, this 4-bit register is used to store the number of wait states to be generated during the DRAM refresh cycle. Note that the Refresh Wait State Register is not selected by the WSC inputs. It will automatically be chosen whenever a DRAM refresh cycle occurs. If the Wait State Generator is disabled during the refresh cycle (WSC(0-1) = 11), READYO# will stay inactive and the Refresh Wait State Register is ignored.


6.5 Programming 


Using the Wait State Generator is relatively straightforward. No special programming sequence is required. In order to ensure the expected number of wait states will be generated when a register is selected, the registers to be used must be programmed after power-up by writing the appropriate wait state count into each register. Note that upon hardware reset, all Wait State Registers are initialized with the value FFH, giving the maximum number of wait states possible. Also, each register can be read to check the wait state count previously stored in the register.


6.6 Register Bit Definition

WAIT STATE REGISTER 0, 1, 2

Port Address    Description
72H             Wait State Register 0 (read/write)
73H             Wait State Register 1 (read/write)
74H             Wait State Register 2 (read/write)

[Bit layout, D7 to D0: two 4-bit fields, I/O WAIT STATE COUNT and MEMORY WAIT STATE COUNT]


REFRESH WAIT STATE REGISTER

Port Address: 75H (Read/Write)

[Bit layout, D7 to D0: one 4-bit field, REFRESH WAIT STATE COUNT; the remaining bits MUST BE ZERO]
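As an illustration of how a Wait State Register value might be composed, here is a minimal sketch. It assumes the I/O count occupies the upper four bits (D7-D4) and the memory count the lower four (D3-D0); the text above only says each count takes one half of the register, so check the nibble order against the original figure. The function names are hypothetical.

```python
IO_COUNT_SHIFT = 4  # assumption: I/O wait state count sits in D7-D4


def pack_wait_state_register(io_count: int, memory_count: int) -> int:
    """Combine two 4-bit wait state counts into one 8-bit register value."""
    for value in (io_count, memory_count):
        if not 0 <= value <= 15:
            raise ValueError("each wait state count is a 4-bit value (0..15)")
    return (io_count << IO_COUNT_SHIFT) | memory_count


def unpack_wait_state_register(value: int) -> tuple:
    """Split an 8-bit register value into (io_count, memory_count)."""
    return (value >> IO_COUNT_SHIFT) & 0x0F, value & 0x0F
```

Under this assumption the reset value FFH unpacks to (15, 15), the maximum count in both fields, which matches the reset behavior described in section 6.5.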
6.7 Application Issues


6.7.1 EXTERNAL 'READY' CONTROL LOGIC


As mentioned in section 6.3.3, wait state cycles generated by the 82380 can be terminated early or extended longer by means of additional external logic (see Figure 6-5). In order to ensure that the READY# input timing requirement of the 80386 and the 82380 is satisfied, special care must be taken when designing this external control logic. This section addresses the design requirements.
A simplified block diagram of the external logic along with the READY# timing diagram is shown in Figure 6-8. The purpose is to determine the maximum delay time allowed in the external control logic in order to satisfy the READY# setup time.


First, it will be assumed that the 80386 is running at 16 MHz (i.e., CLK2 at 32 MHz). Therefore, one bus state (two CLK2 periods) will be equivalent to 62.5 nsec. According to the A.C. specifications of the 82380, the maximum delay time for a valid READYO# signal is 31 ns after the rising edge of CLK2 in the beginning of T2 (for a non-pipelined cycle) or T2P (for a pipelined cycle). Also, the minimum READY# setup time of the 80386 and the 82380 should be 21 ns before the rising edge of CLK2 at the beginning of the next bus state. This limits the total delay time for the external READY# control logic to about 11 ns (62.5 - 31 - 21 = 10.5 ns) in order to meet the READY# setup timing requirement.


[Block diagram labels: EXT. READY#, EXT. NOT READY, 80386-16, 82380, READYO#, READY#]

A = PHI1 + PHI2 = 62.5 ns
B = Maximum READYO# Valid Delay = 31 ns
C = READY# Set-up Time = 21 ns
D = Maximum Ready Control Logic Delay = A - B - C = 11 ns

Figure 6-8. 'READY' Timing Consideration
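The delay budget in Figure 6-8 can be recomputed for other clock rates or timing values. A minimal sketch, using only the relationship A - B - C from the figure (the function and parameter names are hypothetical):

```python
def max_ready_logic_delay_ns(clk2_mhz: float,
                             readyo_valid_delay_ns: float,
                             ready_setup_ns: float) -> float:
    """Delay budget left for the external READY# control logic.

    One bus state is two CLK2 periods. The budget is the bus state
    minus the READYO# valid delay and the READY# setup time.
    """
    bus_state_ns = 2 * 1000.0 / clk2_mhz
    return bus_state_ns - readyo_valid_delay_ns - ready_setup_ns
```

With CLK2 at 32 MHz, a 31 ns READYO# valid delay and a 21 ns setup time, the function returns 10.5 ns, which the figure quotes as 11 ns.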
7.0 DRAM REFRESH CONTROLLER


7.1 Functional Description 


The 82380 DRAM Refresh Controller consists of a 
24-bit Refresh Address Counter and Refresh Re- 
quest logic for DRAM refresh operations (see Figure 
7-1). TIMER 1 can be used as a trigger signal to the 
DRAM Refresh Request logic. The Refresh Bus Size 
can be programmed to be 8-, 16-, or 32-bit wide. 
Depending on the Refresh Bus Size, the Refresh 
Address Counter will be incremented with the appro- 
priate value after every refresh cycle. The internal 
logic of the 82380 will give the Refresh operation the 
highest priority in the bus control arbitration process. 
Bus control is not released and re-requested if the 
82380 is already a bus master. 



7.2 Interface Signals 


7.2.1 TOUT1/REF#


The dual function output pin of TIMER 1 (TOUT1/REF#) can be programmed to generate the DRAM Refresh signal. If this feature is enabled, the rising edge of the TIMER 1 output (TOUT1) will trigger the DRAM Refresh Request logic. After some delay for gaining access to the bus, the 82380 DRAM Controller will generate a DRAM Refresh signal by driving the REF# output LOW. This signal is cleared after the refresh cycle has taken place, or by a hardware reset.


If the DRAM Refresh feature is disabled, the TOUT1/REF# output pin is simply the TIMER 1 output. Detailed information on how TIMER 1 operates is given in the Programmable Interval Timer section, and will not be repeated here.

[Block diagram labels: TOUT1, REFRESH CONTROLLER, INTERNAL DMA, DRAM, 82380]

Figure 7-1. DRAM Refresh Controller
7.3 Bus Function


7.3.1 ARBITRATION 


In order to ensure data integrity of the DRAMs, the 
82380 gives the DRAM Refresh signal the highest 
priority in the arbitration logic. It allows DRAM Re- 
fresh to interrupt a DMA in progress in order to per- 
form the DRAM Refresh cycle. The DMA service will 
be resumed after the refresh is done. 


In case of a DRAM Refresh during a DMA process, the cascaded device will be requested to get off the bus. This is done by deasserting the EDACK signal. Once DREQn goes inactive, the 82380 will perform the refresh operation. Note that the DMA controller does not completely relinquish the system bus during refresh. The Refresh Generator simply 'steals' a bus cycle between DMA accesses.


Figure 7-2 shows the timing diagram of a Refresh Cycle. Upon expiration of TIMER 1, the 82380 will try to take control of the system bus by asserting HOLD. As soon as the 82380 sees HLDA go active, the DRAM Refresh Cycle will be carried out by activating the REF# signal as well as the refresh address and control signals on the system bus (note that REF# will not be active until two CLK periods after HLDA is asserted). The address bus will contain the 24-bit address currently in the Refresh Address Counter. The control signals are driven the same way as in a Memory Read cycle. This 'read' operation is complete when the READY# signal is driven LOW. Then, the 82380 will relinquish the bus by de-asserting HOLD. Typically, a Refresh Cycle without wait states will take five bus states to execute. If 'n' wait states are added, the Refresh Cycle will last for five plus 'n' bus states.


How often the Refresh Generator will initiate a refresh cycle depends on the frequency of CLKIN as well as TIMER 1's programmed mode of operation. For this specific application, TIMER 1 should be programmed to operate in Mode 2 or 3 to generate a constant clock rate. See the Programmable Interval Timer section for more information on programming the timer. One DRAM Refresh Cycle will be generated each time TIMER 1 expires (when TOUT1 changes from LOW to HIGH).


The Wait State Generator can be used to insert wait states during a refresh cycle. The 82380 will automatically insert the desired number of wait states as programmed in the Refresh Wait State Register (see Wait State Generator).


[Timing diagram signals: CLK2, CLK, HOLD, HLDA, A(2-31)*, M/IO#, D/C#, BE(0-3)#, W/R#, TOUT1, REF#, READY#, ADS#]

*NOTE: A24-A31 = 1 during Refresh cycle.

Figure 7-2. 82380 Refresh Cycle
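The refresh cycle length described above can be estimated as follows. This is a sketch only: the five-bus-state base figure is the "typical" value from the text, and the function names are hypothetical.

```python
def refresh_cycle_bus_states(wait_states: int = 0) -> int:
    """Typical refresh cycle length: five bus states plus n wait states."""
    if wait_states < 0:
        raise ValueError("wait_states must be non-negative")
    return 5 + wait_states


def refresh_cycle_ns(clk2_mhz: float, wait_states: int = 0) -> float:
    """Refresh cycle duration in ns; one bus state is two CLK2 periods."""
    return refresh_cycle_bus_states(wait_states) * 2 * 1000.0 / clk2_mhz
```

With CLK2 at 32 MHz and no wait states, a refresh cycle occupies about 312.5 ns of bus time under this model.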


7.4 Modes of Operation 


7.4.1 WORD SIZE AND REFRESH ADDRESS COUNTER


The 82380 supports 8-, 16- and 32-bit refresh cycles. The bus width during a refresh cycle is programmed via the Refresh Control Register (see Programming and Register Set Overview). If the DRAM bus size is 8, 16, or 32 bits, the Refresh Address Counter will be incremented by 1, 2, or 4, respectively.


The Refresh Address Counter is cleared by a hardware reset.
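The counter behavior can be modeled in a few lines. This sketch assumes the 24-bit counter wraps to zero on overflow, which the text does not state explicitly; the names are hypothetical.

```python
REFRESH_INCREMENT = {8: 1, 16: 2, 32: 4}  # bus size in bits -> address step
COUNTER_MASK = 0xFFFFFF                   # 24-bit Refresh Address Counter


def next_refresh_address(address: int, bus_size_bits: int) -> int:
    """Address held in the counter after one refresh cycle.

    Assumption: the counter wraps around at 24 bits.
    """
    return (address + REFRESH_INCREMENT[bus_size_bits]) & COUNTER_MASK
```

Starting from the reset value 0 with a 32-bit bus, successive refresh addresses would be 0, 4, 8, and so on.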


7.5 Register Set Overview 


The Refresh Generator has two internal registers to control its operation. They are the Refresh Control Register and the Refresh Wait State Register. Their port address map is shown in Table 7-1 below.


Table 7-1. Register Address Map

Port Address    Description
1CH             Refresh Control Reg. (read/write)
75H             Ref. Wait State Reg. (read/write)


The Refresh Wait State Register is not part of the Refresh Generator. It is only used to program the number of wait states to be inserted during a refresh cycle. This register is discussed in detail in section 6 (Wait State Generator) and will not be repeated here.


REFRESH CONTROL REGISTER


This 2-bit register serves two functions. First, it is 
used to enable/disable the DRAM Refresh function 
output. If disabled, the output of TIMER 1 is simply 
used as a general purpose timer. The second func- 
tion of this register is to program the DRAM bus size 
for the refresh operation. The programmed bus size 
also determines how the Refresh Address Counter 
will be incremented after each refresh operation. 


7.6 Programming 


Upon hardware reset, the DRAM Refresh function is disabled (the Refresh Control Register is cleared). The following programming steps are needed before the Refresh Generator can be used. Since the rate of refresh cycles depends on how TIMER 1 is programmed, this timer must be initialized with the desired mode of operation as well as the correct refresh interval (see Programmable Interval Timer).


Whether or not wait states are to be generated during a refresh cycle, the Refresh Wait State Register must also be programmed with the appropriate value. Then, the DRAM Refresh feature must be enabled and the DRAM bus width should be defined. These can be done in one step by writing the appropriate control word into the Refresh Control Register (see Register Bit Definition). After these steps are done, the refresh operation will automatically be invoked by the Refresh Generator upon expiration of Timer 1.


In addition to the above programming steps, it 
should be noted that after reset, although the 
TOUT1/REF# becomes the Timer 1 output, the 
state of this pin is undefined. This is because the 
Timer module has not been initialized yet. Therefore, 
if this output is used as a DRAM Refresh signal, this 
pin should be disqualified by external logic until the 
Refresh function is enabled. One simple solution is 
to logically AND this output with HLDA, since HLDA 
should not be active after reset. 


7.7 Register Bit Definition


REFRESH CONTROL REGISTER
Port Address: 1CH (Read/Write)

2-bit control field:
00  REF. DISABLE
01  BUS SIZE = 32
10  BUS SIZE = 16
11  BUS SIZE = 8

All remaining bits MUST BE ZERO.
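The programming order from section 7.6 and the encoding above can be sketched together. Several things here are assumptions: the 2-bit field is taken to occupy the two least significant bits of the register, `outb` is a hypothetical caller-supplied port-write function, and TIMER 1 is presumed to have been programmed beforehand.

```python
REFRESH_CONTROL_PORT = 0x1C
REFRESH_WAIT_STATE_PORT = 0x75

# Encoding from the Refresh Control Register definition.
BUS_SIZE_CODE = {32: 0b01, 16: 0b10, 8: 0b11}
REFRESH_DISABLE = 0b00


def refresh_control_word(enabled: bool, bus_size_bits: int = 32) -> int:
    """Control word; assumes the 2-bit field sits in the low two bits."""
    if not enabled:
        return REFRESH_DISABLE
    return BUS_SIZE_CODE[bus_size_bits]


def enable_refresh(outb, refresh_wait_count: int, bus_size_bits: int) -> None:
    """Write the refresh wait state count, then enable refresh.

    `outb(port, value)` is a hypothetical port-write callback.
    TIMER 1 must already be running in Mode 2 or 3.
    """
    if not 0 <= refresh_wait_count <= 15:
        raise ValueError("refresh wait state count is a 4-bit value")
    outb(REFRESH_WAIT_STATE_PORT, refresh_wait_count)
    outb(REFRESH_CONTROL_PORT, refresh_control_word(True, bus_size_bits))
```

Writing the Refresh Control Register last keeps refresh disabled until the wait state count is in place.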


8.0 RELOCATION REGISTER AND 
ADDRESS DECODE 


8.1 Relocation Register 


All the integrated peripheral devices in the 82380 are controlled by a set of internal registers. These registers span a total of 256 consecutive address locations (although not all the 256 locations are used). The 82380 provides a Relocation Register which allows the user to map this set of internal registers into either the memory or I/O address space. The function of the Relocation Register is to define the base address of the internal register set of the 82380 as well as whether the registers are to be memory- or I/O-mapped. The format of the Relocation Register is depicted in Figure 8-1.

Figure 8-1. Relocation Register


Note that the Relocation Register is part of the internal register set of the 82380. It has a port address of 7FH. Therefore, any time the content of the Relocation Register is changed, the physical location of this register will also be moved. Upon reset of the 82380, the content of the Relocation Register will be cleared. This implies that the 82380 will respond to its I/O addresses in the range of 0000H to 00FFH.


8.1.1 I/O-MAPPED 82380


As shown in the figure, Bit 0 of the Relocation Register determines whether the 82380 registers are to be memory-mapped or I/O-mapped. When Bit 0 is set to '0', the 82380 will respond to I/O addresses. Address signals BE0#-BE3#, A2-A7 will be used to select one of the internal registers to be accessed. Bit 1 to Bit 7 of the Relocation Register will correspond to A9 to A15 of the Address bus, respectively. Together with A8 implied to be '0', A15 to A8 will be fully decoded by the 82380. The following shows how the 82380 is mapped into the I/O address space.


Example


Relocation Register = 11001110 (0CEH)


82380 will respond to I/O address range from 0CE00H to 0CEFFH.


Therefore, this I/O mapping mechanism allows the 82380 internal registers to be located on any even, contiguous, 256 byte boundary of the system I/O space.


Port Address: 7FH (Read/Write) 


8.1.2 MEMORY-MAPPED 82380 


When Bit 0 of the Relocation Register is set to '1', the 82380 will respond to memory addresses. Again, Address signals BE0#-BE3#, A2-A7 will be used to select one of the internal registers to be accessed. Bit 1 to Bit 7 of the Relocation Register will correspond to A25-A31, respectively. A24 is assumed to be '0', and A8-A23 are ignored. Consider the following example.


Example


Relocation Register = 10100111 (0A7H)

The 82380 will respond to memory addresses in the range of 0A6XXXX00H to 0A6XXXXFFH (where 'X' is don't care).


This scheme implies that the internal registers can be located in any even, contiguous, 2**24 byte page of the memory space.
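Both mapping modes can be summarized in one decode routine. This is a sketch built from the two examples above; the function name is hypothetical, and for the memory-mapped case the returned base simply has the ignored bits A8-A23 set to zero.

```python
def decode_relocation(register_value: int) -> tuple:
    """Return (address space, base address) for a Relocation Register value.

    Bit 0 selects the space: 0 = I/O, 1 = memory.
    Bits 1-7 supply A9-A15 (I/O) or A25-A31 (memory); the bit below
    them (A8 or A24) is taken as 0.
    """
    if not 0 <= register_value <= 0xFF:
        raise ValueError("Relocation Register is 8 bits wide")
    upper = register_value & 0xFE          # bit 0 position becomes A8/A24 = 0
    if register_value & 0x01:
        return ("memory", upper << 24)     # A8-A23 are don't care
    return ("io", upper << 8)
```

The two worked examples in the text map to `("io", 0xCE00)` for 0CEH and `("memory", 0xA6000000)` for 0A7H, and the reset value 00H maps to I/O base 0000H.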


8.2 Address Decoding 


As mentioned previously, the 82380 internal registers do not occupy the entire contiguous 256 address locations. Some of the locations are 'unoccupied'. The 82380 always decodes the lower 8 address bits (A0-A7) to determine if any one of its registers is being accessed. If the address does not correspond to any of its registers, the 82380 will not respond. This allows external devices to be located within the 'holes' in the 82380 address space. Note that there are several unused addresses reserved for future Intel peripheral devices.


9.0 CPU RESET AND SHUTDOWN DETECT


The 82380 will activate the CPURST signal to reset the host processor when one of the following conditions occurs:

- 82380 RESET is active;

- 82380 detects an 80386 Shutdown cycle (this feature can be disabled);

- CPURST software command is issued to 80386.


Whenever the CPURST signal is activated, the 82380 will reset its own internal Slave-Bus state machine.


9.1 Hardware Reset 


Following a hardware reset, the 82380 will assert its 
CPURST output to reset the host processor. This 
output will stay active for as long as the RESET input 
is active. During a hardware reset, the 82380 internal 
registers will be initialized as defined in the corre- 
sponding functional descriptions. 


9.2 Software Reset 


CPURST can be generated by writing the following 
bit pattern into 82380 register location 64H. 


X = Don't Care

The Write operation into this port is considered as an 82380 access and the internal Wait State Generator will automatically determine the required number of wait states. The CPURST will be active following the completion of the Write cycle to this port. This signal will last for 62 CLK2 periods. The 82380 should not be accessed until the CPURST is deactivated.


This internal port is Write-Only and the 82380 will not respond to a Read operation to this location. Also, during a CPU software reset command, the 82380 will reset its Slave-Bus state machine. However, its internal registers remain unchanged. This allows the operating system to distinguish a 'warm' reset by reading any 82380 internal register previously programmed for a non-default value. The Diagnostic registers can be used for this purpose (see Internal Control and Diagnostic Ports).


9.3 Shutdown Detect 


The 82380 constantly monitors the Bus Cycle Definition signals (M/IO#, D/C#, W/R#) and is able to detect when the 80386 executes a Shutdown bus cycle. Upon detection of a processor shutdown, the 82380 will activate the CPURST output for 62 CLK2 periods to reset the host processor. This signal is generated after the Shutdown cycle is terminated by the READY# signal.


Although the 82380 Wait State Generator will not automatically respond to a Shutdown (or Halt) cycle, the Wait State Control inputs (WSC0, WSC1) can be used to determine the number of wait states in the same manner as for other non-82380 bus cycles.


This Shutdown Detect feature can be enabled or disabled by writing a control bit in the Internal Control Port at address 61H (see Internal Control and Diagnostic Ports). This feature is disabled upon a hardware reset of the 82380. As in the case of Software Reset, the 82380 will reset its Slave-Bus state machine but will not change any of its internal register contents.

[Internal Control Port, Port Address: 61H (Write Only). SHUTDOWN DETECT: 0 = DISABLE, 1 = ENABLE. COUNTER 3 GATE: 0 = DISABLE, 1 = ENABLE.]


10.0 INTERNAL CONTROL AND 
DIAGNOSTIC PORTS 


10.1 Internal Control Port 


The format of the Internal Control Port of the 82380 is shown in Figure 10-1. This Control Port is used to enable or disable the Processor Shutdown Detect mechanism and to control the Gate inputs of Timers 2 and 3. Note that this is a Write-Only port. Therefore, the 82380 will not respond to a read operation to this port. Upon hardware reset, this port will be cleared; i.e., the Shutdown Detect feature and the Gate inputs of Timers 2 and 3 are disabled.


10.2 Diagnostic Ports 


Two 8-bit read/write Diagnostic Ports are provided in the 82380. These are two storage registers and have no effect on the operation of the 82380. They can be used to store checkpoint data or error codes in the power-on sequence and in the diagnostic service routines. As mentioned in the CPU RESET AND SHUTDOWN DETECT section, these Diagnostic Ports can be used to distinguish between 'cold' and 'warm' reset. Upon hardware reset, both Diagnostic Ports are cleared. The address map of these Diagnostic Ports is shown in Figure 10-2.


Port                              Address
Diagnostic Port 1 (Read/Write)    80H
Diagnostic Port 2 (Read/Write)    88H

Figure 10-2. Address Map of Diagnostic Ports


[Internal Control Port bit fields: COUNTER 3 GATE INPUT ENABLE/DISABLE, COUNTER 2 GATE INPUT ENABLE/DISABLE (0 = DISABLE, 1 = ENABLE); other bits NOT USED]

Figure 10-1. Internal Control Port
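The cold/warm reset distinction described in sections 9.2 and 10.2 could be implemented along these lines. The signature value and the `inb`/`outb` port-access callbacks are hypothetical; only the port address and the cleared-on-hardware-reset behavior come from the text.

```python
DIAGNOSTIC_PORT_1 = 0x80
WARM_SIGNATURE = 0x5A  # hypothetical non-zero marker chosen by the firmware


def classify_reset(inb, outb) -> str:
    """Return 'warm' if the marker survived the reset, else 'cold'.

    A hardware reset clears the Diagnostic Ports, while a software or
    shutdown-triggered CPURST leaves 82380 registers unchanged.
    """
    if inb(DIAGNOSTIC_PORT_1) == WARM_SIGNATURE:
        return "warm"
    outb(DIAGNOSTIC_PORT_1, WARM_SIGNATURE)  # arm the marker for next time
    return "cold"
```

Start-up code would call `classify_reset` once, early in the reset path, before any other use of Diagnostic Port 1.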
11.0 INTEL RESERVED I/O PORTS


There are eleven I/O ports in the 82380 address space which are reserved for Intel future peripheral device use only. Their address locations are: 2AH, 3DH, 3EH, 45H, 46H, 76H, 77H, 7DH, 7EH, CCH and CDH. These addresses should not be used in the system since the 82380 may respond to read/write operations to these locations and bus contention may occur if any peripheral is assigned to the same address location.
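A system address map can be checked against the reserved list with a small helper. This is a sketch; the function name is hypothetical and the offsets are relative to the 82380 register base.

```python
# Reserved offsets within the 82380 register block, from section 11.0.
RESERVED_PORTS = frozenset(
    {0x2A, 0x3D, 0x3E, 0x45, 0x46, 0x76, 0x77, 0x7D, 0x7E, 0xCC, 0xCD}
)


def reserved_conflicts(planned_offsets) -> list:
    """Return the planned peripheral offsets that collide with reserved ports."""
    return sorted(offset for offset in planned_offsets
                  if offset in RESERVED_PORTS)
```

An empty result means none of the planned external peripherals sits on a reserved location.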


12.0 MECHANICAL DATA 
12.1 Introduction 


In this section, the physical package and its connec- 
tions are described in detail. 


Figure 12.1. 82380 PGA Pinout, View from TOP Side

12.2 Pin Assignment


The 82380 pinout as viewed from the top side of the 
component is shown in Figure 12.1. Its pinout as 
viewed from the pin side of the component is shown 
in Figure 12.2. 


Vcc and GND connections must be made to multiple Vcc and Vss (GND) pins. Each Vcc and Vss pin MUST be connected to the appropriate voltage level. The circuit board should include Vcc and GND planes for power distribution and all Vcc pins must be connected to the appropriate plane.

Figure 12.2. 82380 PGA Pinout, View from PIN Side
Table 12-1. 82380 PGA Pinout, Functional Grouping

Signals listed:

D31-D0
Vcc (multiple pins), Vss (multiple pins)
CLK2, D/C#, W/R#, M/IO#, ADS#, NA#, HOLD, HLDA
DREQ0, DREQ1, DREQ2, DREQ3, DREQ4/IRQ9#, DREQ5, DREQ6, DREQ7
EOP#, EDACK0, EDACK1, EDACK2, RESET, CPURST
IRQ23#-IRQ11#, INT
CLKIN, TOUT1/REF#, TOUT2#/IRQ3#, TOUT3#
READY#, READYO#, WSC0, WSC1
12.3 Package Dimensions and Mounting


The 82380 package is a 132-pin ceramic Pin Grid Array (PGA). The pins are arranged 0.100 inch (2.54 mm) center-to-center, in a 14 x 14 matrix, three rows around.


A wide variety of available sockets allow low insertion force or zero insertion force mountings, and a choice of terminals such as soldertail, surface mount, or wire wrap. Several applicable sockets are listed in Figure 12-4.


[Drawing dimensions, inches (mm): .150 (3.807), .250 (6.345), .350 (8.883), .450 (11.421), .550 (13.959), .650 (16.497), 1.450 (36.802); .020 (0.508) MIN TYP; .070 (1.777) DIA TYP BRAZE PAD]

Figure 12.3. 132-Pin Ceramic PGA Package Dimensions
Amp Incorporated (Harrisburg, PA 17105 U.S.A. Phone 717-564-0100)

- Low insertion force (LIF) soldertail: 55274-1
- Amp tests indicate 50% reduction in insertion force compared to machined sockets

Other socket options:

- Zero insertion force (ZIF) soldertail: 55583-1
- Zero insertion force (ZIF) Burn-in version: 55573-2

Cam handle locks in low profile position when substrate is installed (handle UP for open and DOWN for closed positions).

courtesy Amp Incorporated


Advanced Interconnections (5 Division Street, Warwick, RI 02818 U.S.A. Phone 401-885-0485)

Peel-A-Way Mylar and Kapton Socket Terminal Carriers:

- Low insertion force surface mount: CS132-37TG
- Low insertion force soldertail: CS132-01TG
- Low insertion force wire-wrap: CS132-02TG (two-level), CS132-03TG (three-level)
- Low insertion force press-fit: CS132-05TG

Peel-A-Way Carrier No. 132: Kapton Carrier is KS132; Mylar Carrier is MS132. Molded Plastic Body KS132 is shown in the figure (.100 TYP, 14 x 14 x 3 ROWS).

courtesy Advanced Interconnections (Peel-A-Way Terminal Carriers, U.S. Patent No. 4442938)


Figure 12-4. Several Socket Options for 132-pin PGA
Textool Products, Electronic Products Division/3M (1410 West Pioneer Drive, Irving, Texas 75601 U.S.A. Phone 214-259-2676)

- Low insertion force socket soldertail (for production use): 2XX-6576-00-3308 (new style), 2XX-6003-00-3302 (older style)
- Zero insertion force soldertail (for test and burn-in use): 2XX-6568-00-3302

courtesy Textool Products/3M


Figure 12-4. Several Socket Options for 132-pin PGA (Continued)


12.4 Package Thermal Specification


The 82380 is specified for operation when case temperature is within the range of 0°C to 85°C. The case temperature may be measured in any environment, to determine whether the 82380 is within the specified operating range.

The PGA case temperature should be measured at the center of the top surface opposite the pins, as in Figure 12.5.


[Drawing: 132-pin PGA; measure PGA case temperature at center of top surface]

Figure 12.5. Measuring 82380 PGA Case Temperature
Thermal Resistance, °C/Watt

Columns: Airflow, ft/min (m/sec). Legible column headings include 200 (1.01), 400 (2.03) and 600 (3.04).

Parameters:
θ Junction-to-Case (case measured as Fig. 6.4)
θ Case-to-Ambient (with omnidirectional heatsink)
θ Case-to-Ambient (with unidirectional heatsink)


NOTES:

1. Table 12-6 applies to 82380 PGA plugged into socket or soldered directly into board.
2. θJA = θJC + θCA.
3. θJ-CAP = 4°C/W (approx.)
   θJ-PIN = 4°C/W (inner pins) (approx.)
   θJ-PIN = 8°C/W (outer pins) (approx.)


Figure 12-6. 82380 PGA Package Typical Thermal Characteristics
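Note 2 above can be turned into a quick estimate. This sketch combines θJA = θJC + θCA with the standard relation between junction temperature, ambient temperature and dissipated power; that second relation, the function names and all numeric inputs in the usage note are illustrative assumptions, not values from this table.

```python
def theta_ja(theta_jc: float, theta_ca: float) -> float:
    """Junction-to-ambient thermal resistance in degrees C per watt (Note 2)."""
    return theta_jc + theta_ca


def junction_temperature(ambient_c: float, power_w: float,
                         theta_jc: float, theta_ca: float) -> float:
    """Estimated junction temperature: ambient + power * theta_ja.

    This is the usual steady-state thermal model, not a formula
    quoted from the datasheet.
    """
    return ambient_c + power_w * theta_ja(theta_jc, theta_ca)
```

With purely illustrative inputs of 25°C ambient, 2 W dissipation, θJC = 4°C/W and θCA = 20°C/W, the estimate is 73°C.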


13.0 ELECTRICAL DATA 


13.1 Power and Grounding 


The large number of output buffers (address, data and control) can cause power surges as multiple output buffers drive new signal levels simultaneously. The 22 Vcc and Vss pins of the 82380 each feed separate functional units to minimize switching induced noise effects. All Vcc pins of the 82380 must be connected on the circuit board.


13.2 Power Decoupling 


Liberal decoupling capacitance should be placed 
close to the 82380. The 82380 driving its 32-bit par- 
allel address and data buses at high frequencies can 
cause transient power surges when driving large ca- 
pacitive loads. Low inductance capacitors and inter- 


connects are recommended for the best reliability at 
high frequencies. Low inductance capacitors are 
available specifically for Pin Grid Array packages. 


13.3 Unused Pin Recommendations 


For reliable operation, ALWAYS connect unused inputs to a valid logic level. As is the case with most other CMOS processes, a floating input will increase the current consumption of the component and give an indeterminate state to the component.


13.4 ICE-386 Support 


The 82380 specifications provide sufficient drive capability to support the ICE386. On the pins that are generally shared between the 80386 and the 82380, the additional loading represented by the ICE386 was allowed for in the design of the 82380.
13.5 Maximum Ratings

Storage Temperature .......................... -65°C to +150°C
Case Temperature Under Bias .................. -65°C to +110°C
Supply Voltage with Respect to Vss ........... -0.5V to +6.5V
Voltage on any other Pin ..................... -0.5V to Vcc + 0.5V

NOTE:
Stress above those listed above may cause permanent damage to the device. This is a stress rating only and functional operation at these or any other conditions above those listed in the operational sections of this specification is not implied.

Exposure to absolute maximum rating conditions for extended periods may affect device reliability. Although the 82380 contains protective circuitry to resist damage from static electric discharges, always take precautions against high static voltages or electric fields.

13.6 D.C. Specifications
TCASE = 0°C to 85°C; Vcc = 5V ±5%; Vss = 0V.

Table 13-1.

Parameter                                   Legible limits and test conditions
CLK2 Input Low Voltage                      Min -0.3 V (Note 1)
VOL Output Low Voltage
    IOL = 4 mA: A2-A31, D0-D31              Max 0.45 V
    IOL = 5 mA: All Others                  Max 0.45 V
VOH Output High Voltage
    IOH = -1 mA: A2-A31, D0-D31
    IOH = -0.9 mA: All Others
Input Leakage Current, all inputs except    0V < VIN < Vcc
    IRQ11#-IRQ23#, TOUT2#/IRQ3#,
    EOP#, DREQ4
Input Leakage Current for pins              0V < VIN < Vcc (Note 3)
    IRQ11#-IRQ23#, TOUT2#/IRQ3#,
    EOP#, DREQ4
Output Leakage Current                      0.45V < VOUT < Vcc
Supply Current                              CLK2 = 32 MHz; CLK2 = 40 MHz (Note 4)
Capacitance (Input/IO)                      fc = 1 MHz (Note 2)
CLK2 Capacitance                            fc = 1 MHz (Note 2)

Other legible values (row association not recoverable): -300; 300 mA; 325 mA.


NOTES:

1. Minimum value is not 100% tested.
2. Sampled only.
3. These pins have internal pullups on them.
4. Icc is specified with inputs driven to CMOS levels. Icc may be higher if driven to TTL levels.
13.6 D.C. Specifications (Continued)
TCASE = 0°C to 85°C; Vcc = 5V ±5%; Vss = 0V.


Table 13-2. 82380-25 D.C. Specifications

Parameter                                   Legible limits and test conditions
Input Low Voltage
Input High Voltage
CLK2 Input Low Voltage
CLK2 Input High Voltage
Output Low Voltage
    IOL = 4 mA: A2-A31, D0-D31              Max 0.45 V
    IOL = 5 mA: All Others                  Max 0.45 V
Output High Voltage
    IOH = -1 mA: A2-A31, D0-D31             Min 2.4 V
    IOH = -0.9 mA: All Others               Min 2.4 V
Input Leakage Current, all inputs except
    IRQ11#-IRQ23#, EOP#, TOUT2/IRQ3#,
    DREQ4
Input Leakage Current, inputs               0 < VIN < Vcc (Note 3)
    IRQ11#-IRQ23#, EOP#, TOUT2/IRQ3#,
    DREQ4
Output Leakage Current
Supply Current (CLK2 = 50 MHz)
Input Capacitance
CLK2 Input Capacitance


NOTES:

1. Minimum value is not 100% tested.
2. fc = 1 MHz; Sampled only.
3. These pins have weak internal pullups. They should not be left floating.
4. Icc is specified with inputs driven to CMOS levels, and outputs driving CMOS loads. Icc may be higher if inputs are driven to TTL levels, or if outputs are driving TTL loads.
13.7 A.C. Specifications


The A.C. specifications given in the following tables consist of output delays and input setup requirements. The A.C. diagram's purpose is to illustrate the clock edges from which the timing parameters are measured. The reader should not infer any other timing relationships from them. For specific information on timing relationships between signals, refer to the appropriate functional section.


A.C. spec measurement is defined in Figure 13-1. Inputs must be driven to the levels shown when A.C. specifications are measured. 82380 output delays are specified with minimum and maximum limits, which are measured as shown. The minimum 82380 output delay times are hold times for external circuitry. 82380 input setup and hold times are specified as minimums and define the smallest acceptable sampling window. Within the sampling window, a synchronous input signal must be stable for correct 82380 operation.


[Waveforms shown, measured at the 1.5V level: OUTPUTS (A2-A31, D/C#, and others including RDYO#, LOCK#, HOLD); OUTPUTS (D0-D31); INPUTS (NA#); INPUTS (READY#, HLDA, A2-A31, D0-D31, IRQx#, ADS#)]

LEGEND:
A - maximum output delay spec
B - minimum output delay spec
C - minimum input setup spec
D - minimum input hold spec

NOTES:
1. Input waveforms have tr < 2.0 ns from 0.8V to 2.0V.
2. Under rated loading (120 pF) 82380 output tr, tf is typically < 4.0 ns from 0.8V to 2.0V.


Figure 13-1. Drive Levels and Measurement Points for A.C. Specification
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A.C. SPECIFICATION TABLES 
Functional Operating Range: Vcc = 5V ±5%; Tcase = 0°C to +85°C
Table 13-3. 82380 A.C. Characteristics


Operating Frequency 4MHz | 16MHz | 4MHz | 20MHz | Half CLK2 Frequency
CLK2 Period 31 ns ans | 125ns | 
at 2.0V 


— 

ic 

rae | cuKarign tine | 

ra J attvec-0ay 
=e 

re 

rs 


ee 
Eee 
eel 
Lees! 
Pe 
8 


A (2-31), BE (0-3) #, 
EDACK (0-2) . 
Valid Delay 

Float Delay 


CL = 120 pF 
(Note 1) 


, A (2-31), BE (0-3) # 
t8 Setup Time 
to Hold Time 
W/R#, M/IO#, D/C#, | 
t10 Valid Delay ‘CL = 75 pF 
t11 Float Delay - (Note 1) 
t12 Setup Time 
t13 Hold Time 
t14 ADS # Valid Delay 
t15 Float Delay 
t16 Setup Time 
t17 Hold Time 


Slave Mode— 
D(O-31) Read 

Valid Delay 
Float Delay 


CL = 120 pF 
(Note 1) 


QO 
pie | es 
pe fee 


Slave Mode— 

D(O-—31) Write 
- t20 Setup Time 31 
t21 Hold Time 26 
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A.C. SPECIFICATION TABLES (Continued) 
Functional Operating Range: Vcc = 5V ±5%; Tcase = 0°C to +85°C.


Table 13-3. 82380 A.C. Characteristics (Continued) 


Master Mode— 
D(0-31) Write
t22 Valid Delay
t23 Float Delay
Master Mode— 
D(0-31) Read 
t24 Setup Time
t25 Hold Time 6 


t26 READY # Setup Time 

t27 Hold Time 4 
t28 WSC (0-1) Setup 6 
t29 Hold 

t31 RESET Setup Time 

t30 Hold Time 4 


t32 READYO# Valid Delay


Nh @ 


4 CL = 120 pF 


(Note 1) 


- 
BAN 


11 


ah, 


3 


4 
2 1 


16 CL = 50 pF 
CL = 100 pF | 


Synch. EOP 


Asynch. EOP 


5 30 CL = 100 pF (‘1’->‘0’) 


Synchronous DREQ , 


Asynchronous DREQ 


From IRQ Input
CL = 75 pF


13 
CPU Reset From CLK2 
HOLD Valid Delay 

3 


6 


eck G 
a - 
On © 


t36 Hold Time 6 
2 
: 
t37 EOP# Setup Time


2 
3 
4 

1 

EOP # Valid Delay 5 
0 
4 


4 
11 


amok, 
—_h, 


ah oh — _— N —h 


a 
oh 


| 
8 
38 
40 


—_ 
i<e) 


t41a DREQ Setup Time
t42a Hold Time 4
t41b DREQ Setup Time
t42b Hold Time
INT Valid Delay
t44 NA# Setup Time
t45 Hold Time


4 


3 
EOP # Float Delay 5 
4 


500 500 


ah eh —h oh ye) 
ao — —_— oh — 


a Sa % a a 9 
oO + 
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A.C. SPECIFICATION TABLES (Continued) 
Functional Operating Range: Vcc = 5V +5%; Tcase = 0°C to + 85°C. 


Table 13-3. 82380 A.C. Characteristics (Continued) 


[as _| GLKINFrequoncy | OMHe | tome | OMMe | ioMHe[ 
ro f 
[ae | CLKINLowTime [so || so | [atv 
: 
ae oa 


eae 
CCLKIN atime | _ 
= 

[3 


TOUT2# Valid Delay 


TOUT2# Float Delay 
| t55 | TOUT3# Valid Delay 


NOTE:
1. Float condition occurs when the maximum output current becomes less than ILO in magnitude. Float delay is not tested. 
For testing purposes, the float condition occurs when the dynamic output driven voltage changes with current loads. 


From CLKIN, CL = 120 pF 
(Falling Edge Only) 


Min 

a 

ae ee 
ae ae 

| 3 | 3 | 2 
he! 

| 40 | 3 

| 93 | 8 


Functional Operating Range: Vcc = 5V +5%; Tcase = 0°C to + 85°C. 
A.C. timings are tested at 1.5V thresholds; except as noted. 


Table 13-4. 82380-25 A.C. Characteristics


Operating Frequency 1/(t1a x 2)
t1 CLK2 Period


CLK2 High Time at 2.0V
CLK2 High Time at 3.7V
CLK2 Low Time at 2.0V
CLK2 Low Time at 0.8V
CLK2 Fall Time 3.7V to 0.8V
CLK2 Rise Time 0.8V to 3.7V


A2-A31, BEO#-—-BE3# | | 7 : 50 pF Load 
EDACKO-EDACKS3 Valid Delay . | 
A2-A31, BEO#-BE3# 3 2 50 pF Load 
EDACKO-EDACKS3 Float Delay | 


W/R#, M/IO#, D/C# Valid Delay 
W/R#, M/IO#, D/C# Float Delay
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A.C. SPECIFICATION TABLES (Continued) 


Functional Operating Range: Vcc = 5V ±5%; Tcase = 0°C to +85°C.
A.C. timings are tested at 1.5V thresholds; except as noted. 


Table 13-4. 82380-25 A.C. Characteristics (Continued) 


Symbol Parameter
t12 W/R#, M/IO#, D/C# Setup Time
t13 W/R#, M/IO#, D/C# Hold Time
t14 ADS# Valid Delay
t15 ADS# Float Delay
t16 ADS# Setup Time
t17 ADS# Hold Time
t18 Slave Mode D0-D31 Read Valid
t19 Slave Mode D0-D31 Read Float
t20 Slave Mode D0-D31 Write Setup
t21 Slave Mode D0-D31 Write Hold
t22 Master Mode D0-D31 Write Valid
t23 Master Mode D0-D31 Write Float
t24 Master Mode D0-D31 Read Setup
t25 Master Mode D0-D31 Read Hold
t26 READY# Setup Time
t27 READY# Hold Time
t28 WSC0-WSC1 Setup Time
t29 WSC0-WSC1 Hold Time
t30 RESET Hold Time
t31 RESET Setup Time
t32 READYO# Valid Delay
t33 CPURST Valid Delay
t34 HOLD Valid Delay
t35 HLDA Setup Time
t36 HLDA Hold Time
t37a EOP# Setup (Synchronous)
t38a EOP# Hold (Synchronous) 4
t37b EOP# Setup (Asynchronous) 10
t38b EOP# Hold (Asynchronous)
t39 EOP# Valid Delay 4
t40 EOP# Float Delay 4
t41a DREQ Setup (Synchronous)
t42a DREQ Hold (Synchronous) 4
t41b DREQ Setup (Asynchronous) 10
t42b DREQ Hold (Asynchronous)


INT Valid Delay from IRQn 


50 pF Load 
50 pF Load 
50 pF Load 
50 pF Load 
50 pF Load 
50 pF Load 


G Nh — 
—_ Oo oO 


— NO PO 
On — 


ns 
ns 


= NO ND “oak 


LP PVP OOH] EY@®)FO!;AN 


_ 25 pF Load 


50 pF Load 
50 pF Load 


50 pF Load 
50 pF Load 


50 pF Load 


MN 
- 


aw) 
N 


—h —h 
nalos 


—. _ 
a | © 


= oh = 


500 


mk 
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A.C. SPECIFICATION TABLES (Continued) 


Functional Operating Range: Vcc = 5V ±5%; Tcase = 0°C to +85°C.
A.C. timings are tested at 1.5V thresholds; except as noted.


Table 13-4. 82380-25 A.C. Characteristics (Continued)


NA# Setup Time 
NA# Hold Time 


CLKIN Frequency 
CLKIN High Time 
CLKIN Low Time
CLKIN Rise Time 0.8V to 3.7V
CLKIN Fall Time 3.7V to 0.8V


TOUT1/REF # Valid Delay 
from CLK2 (Refresh) 50 pF Load 
from CLKIN (Timer) 50 pF Load 


TOUT2# Valid Delay 50 pF Load 
(Falling Edge Only) 
TOUT2# Float Delay 50 pF Load 


TOUT3 # Valid Delay 50 pF Load 


82380 
OUTPUT 


290128-—-A4 


Figure 13-2. A.C. Test Load 


290128-A5 


Figure 13-3. CLK2 Timing 
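
Table 13-4 gives the operating frequency as 1/(t1a x 2), that is, half the CLK2 frequency. A minimal sketch of that arithmetic follows; the function name is illustrative, and the period is assumed to be supplied in nanoseconds.

```python
def operating_frequency_mhz(clk2_period_ns):
    """Operating frequency in MHz for a given CLK2 period in ns.

    Implements 1 / (CLK2 period x 2); 1000 converts 1/ns to MHz.
    """
    if clk2_period_ns <= 0:
        raise ValueError("CLK2 period must be positive")
    return 1000.0 / (2.0 * clk2_period_ns)
```

A 25 ns CLK2 period corresponds to a 20 MHz operating frequency, and the 31 ns period listed in Table 13-3 corresponds to roughly 16 MHz.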
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INPUT SET- UP AND HOLD TIMING (CONT.) 


[Figure 13-4 waveforms (290128-A6): CLK2 (PHI2) with setup and hold
windows for WSC(0-1), A(2-31), BE(0-3)#, W/R#, M/IO#, D/C#, READY#,
HLDA, D(0-31) (DMA Read), D(0-31) (CPU Write) and DREQ(0-7).]


Figure 13-4. Input Setup and Hold Timing 


[Figure 13-5 waveforms: CPURST valid delay (T33 min, T33 max)
measured from CLK2.]


Figure 13-5. Reset Timing 


[Figure 13-6 and 13-7 waveforms, referenced to CLK2 (PHI 1, PHI 2):
A(2-31), BE(0-3)# and EDACK(0-2) valid and float delays; ADS# valid
and float delays (T14 max); HOLD; D(0-31) (CPU Read) and D(0-31)
(DMA Write) output delays.]


Figure 13-6. Address Output Delays 
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Figure 13-7. Data Bus Output Delays 
[Control output delay waveforms (290128-B0): W/R#, M/IO#, D/C# valid
delay (T10 min, T10 max) and float delay (T11 max); READYO#; T40 max.]

[Figure 13-9 waveforms (290128-B1): TOUT2# and TOUT3# output delays
(T55 max).]


Figure 13-9. Timer Output Delays 
APPENDIX A

Ports Listed by Address

Port Address (HEX) Description
00 Read/Write DMA Channel 0 Target Address, A0-A15
01 Read/Write DMA Channel 0 Byte Count, B0-B15
02 Read/Write DMA Channel 1 Target Address, A0-A15
03 Read/Write DMA Channel 1 Byte Count, B0-B15
04 Read/Write DMA Channel 2 Target Address, A0-A15
05 Read/Write DMA Channel 2 Byte Count, B0-B15
06 Read/Write DMA Channel 3 Target Address, A0-A15
07 Read/Write DMA Channel 3 Byte Count, B0-B15
08 Read/Write DMA Channel 0-3 Status/Command I Register
09 Read/Write DMA Channel 0-3 Software Request Register
0A Write DMA Channel 0-3 Set-Reset Mask Register
0B Write DMA Channel 0-3 Mode Register I
0C Write Clear Byte-Pointer FF
0D Write DMA Master-Clear
0E Write DMA Channel 0-3 Clear Mask Register
0F Read/Write DMA Channel 0-3 Mask Register
10 Read/Write DMA Channel 0 Target Address, A24-A31
11 Read/Write DMA Channel 0 Byte Count, B16-B23
12 Read/Write DMA Channel 1 Target Address, A24-A31
13 Read/Write DMA Channel 1 Byte Count, B16-B23
14 Read/Write DMA Channel 2 Target Address, A24-A31
15 Read/Write DMA Channel 2 Byte Count, B16-B23
16 Read/Write DMA Channel 3 Target Address, A24-A31
17 Read/Write DMA Channel 3 Byte Count, B16-B23
18 Write DMA Channel 0-3 Bus Size Register
19 Read/Write DMA Channel 0-3 Chaining Register
1A Write DMA Channel 0-3 Command Register II
1B Write DMA Channel 0-3 Mode Register II
1C Read/Write Refresh Control Register
-- Reset Software Request Interrupt
20 Write Bank B ICW1, OCW2, or OCW3
   Read Bank B Poll, Interrupt Request or In-Service
   Status Register
21 Write Bank B ICW2, ICW3, ICW4 or OCW1
   Read Bank B Interrupt Mask Register
22 Read Bank B ICW2
28 Read/Write IRQ8 Vector Register
29 Read/Write IRQ9 Vector Register
2A Reserved
2B Read/Write IRQ11 Vector Register
2C Read/Write IRQ12 Vector Register
2D Read/Write IRQ13 Vector Register
2E Read/Write IRQ14 Vector Register
2F Read/Write IRQ15 Vector Register
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APPENDIX A—Ports Listed by Address (Continued) 


Port Address (HEX) Description
30 Write Bank A ICW1, OCW2 or OCW3
   Read Bank A Poll, Interrupt Request or In-Service
   Status Register
31 Write Bank A ICW2, ICW3, ICW4 or OCW1
   Read Bank A Interrupt Mask Register
32 Read Bank A ICW2
38 Read/Write IRQ0 Vector Register
39 Read/Write IRQ1 Vector Register
3A Read/Write IRQ1.5 Vector Register
3B Read/Write IRQ3 Vector Register
3C Read/Write IRQ4 Vector Register
3D Reserved
3E Reserved
3F Read/Write IRQ7 Vector Register
40 Read/Write Counter 0 Register
41 Read/Write Counter 1 Register
42 Read/Write Counter 2 Register
43 Write Control Word Register I - Counter 0, 1, 2
44 Read/Write Counter 3 Register
45 Reserved
46 Reserved
47 Write Word Register II - Counter 3
61 Write Internal Control Port
64 Write CPU Reset Register (Data-1111XXX0H)
72 Read/Write Wait State Register 0
73 Read/Write Wait State Register 1
74 Read/Write Wait State Register 2
75 Read/Write Refresh Wait State Register
76 Reserved
77 Reserved
7D Reserved
7E Reserved
7F Read/Write Relocation Register
80 Read/Write Internal Diagnostic Port 0
81 Read/Write DMA Channel 2 Target Address, A16-A23
82 Read/Write DMA Channel 3 Target Address, A16-A23
83 Read/Write DMA Channel 1 Target Address, A16-A23
87 Read/Write DMA Channel 0 Target Address, A16-A23
88 Read/Write Internal Diagnostic Port 1
89 Read/Write DMA Channel 6 Target Address, A16-A23
8A Read/Write DMA Channel 7 Target Address, A16-A23
8B Read/Write DMA Channel 5 Target Address, A16-A23
8F Read/Write DMA Channel 4 Target Address, A16-A23
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APPENDIX A—Ports Listed by Address (Continued) | 


Port Address (HEX) Description
90 Read/Write DMA Channel 0 Requester Address, A0-A15
91 Read/Write DMA Channel 0 Requester Address, A16-A31
92 Read/Write DMA Channel 1 Requester Address, A0-A15
93 Read/Write DMA Channel 1 Requester Address, A16-A31
94 Read/Write DMA Channel 2 Requester Address, A0-A15
95 Read/Write DMA Channel 2 Requester Address, A16-A31
96 Read/Write DMA Channel 3 Requester Address, A0-A15
97 Read/Write DMA Channel 3 Requester Address, A16-A31
98 Read/Write DMA Channel 4 Requester Address, A0-A15
99 Read/Write DMA Channel 4 Requester Address, A16-A31
9A Read/Write DMA Channel 5 Requester Address, A0-A15
9B Read/Write DMA Channel 5 Requester Address, A16-A31
9C Read/Write DMA Channel 6 Requester Address, A0-A15
9D Read/Write DMA Channel 6 Requester Address, A16-A31
9E Read/Write DMA Channel 7 Requester Address, A0-A15
9F Read/Write DMA Channel 7 Requester Address, A16-A31
A0 Write Bank C ICW1, OCW2 or OCW3
   Read Bank C Poll, Interrupt Request or In-Service
   Status Register
A1 Write Bank C ICW2, ICW3, ICW4 or OCW1
   Read Bank C Interrupt Mask Register
A2 Read Bank C ICW2
A8 Read/Write IRQ16 Vector Register
A9 Read/Write IRQ17 Vector Register
AA Read/Write IRQ18 Vector Register
AB Read/Write IRQ19 Vector Register
AC Read/Write IRQ20 Vector Register
AD Read/Write IRQ21 Vector Register
AE Read/Write IRQ22 Vector Register
AF Read/Write IRQ23 Vector Register
C0 Read/Write DMA Channel 4 Target Address, A0-A15
C1 Read/Write DMA Channel 4 Byte Count, B0-B15
C2 Read/Write DMA Channel 5 Target Address, A0-A15
C3 Read/Write DMA Channel 5 Byte Count, B0-B15
C4 Read/Write DMA Channel 6 Target Address, A0-A15
C5 Read/Write DMA Channel 6 Byte Count, B0-B15
C6 Read/Write DMA Channel 7 Target Address, A0-A15
C7 Read/Write DMA Channel 7 Byte Count, B0-B15
C8 Read DMA Channel 4-7 Status/Command I Register
C9 Read/Write DMA Channel 4-7 Software Request Register
CA Write DMA Channel 4-7 Set-Reset Mask Register
CB Write DMA Channel 4-7 Mode Register I
CC Reserved
CD Reserved
CE Write DMA Channel 4-7 Clear Mask Register
CF Read/Write DMA Channel 4-7 Mask Register
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APPENDIX A—Ports Listed by Address (Continued) 


Port Address (HEX) Description
D0 Read/Write DMA Channel 4 Target Address, A24-A31
D1 Read/Write DMA Channel 4 Byte Count, B16-B23
D2 Read/Write DMA Channel 5 Target Address, A24-A31
D3 Read/Write DMA Channel 5 Byte Count, B16-B23
D4 Read/Write DMA Channel 6 Target Address, A24-A31
D5 Read/Write DMA Channel 6 Byte Count, B16-B23
D6 Read/Write DMA Channel 7 Target Address, A24-A31
D7 Read/Write DMA Channel 7 Byte Count, B16-B23
D8 Write DMA Channel 4-7 Bus Size Register
D9 Read/Write DMA Channel 4-7 Chaining Register
DA Write DMA Channel 4-7 Command Register II
DB Write DMA Channel 4-7 Mode Register II
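
The per-channel DMA address ports in Appendix A follow two patterns that a driver might encode as a lookup. The sketch below is illustrative only: the names are hypothetical, the A16-A23 ports are copied from the 81-8F entries, and the requester-address formula assumes the 90-9F ports are assigned to channels 0-7 in ascending pairs, as the listing order suggests.

```python
# Target Address A16-A23 ports, copied from the Appendix A listing.
TARGET_A16_A23_PORT = {
    0: 0x87, 1: 0x83, 2: 0x81, 3: 0x82,
    4: 0x8F, 5: 0x8B, 6: 0x89, 7: 0x8A,
}


def requester_ports(channel):
    """Return (A0-A15 port, A16-A31 port) of a channel's Requester Address.

    Assumption: ports 90H-9FH are laid out as consecutive pairs per channel.
    """
    if not 0 <= channel <= 7:
        raise ValueError("DMA channel must be 0-7")
    base = 0x90 + 2 * channel
    return base, base + 1
```

For instance, `requester_ports(5)` yields ports 9AH and 9BH, matching the Channel 5 Requester Address entries.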
APPENDIX B

Ports Listed by Function

Port Address (HEX) Description
DMA CONTROLLER
0D Write DMA Master-Clear
0C Write DMA Clear Byte-Pointer FF

08 Read/Write DMA Channel 0-3 Status/Command I Register
C8 Read/Write DMA Channel 4-7 Status/Command I Register

1A Write DMA Channel 0-3 Command Register II
DA Write DMA Channel 4-7 Command Register II

0B Write DMA Channel 0-3 Mode Register I
CB Write DMA Channel 4-7 Mode Register I
1B Write DMA Channel 0-3 Mode Register II
DB Write DMA Channel 4-7 Mode Register II

09 Read/Write DMA Channel 0-3 Software Request Register
C9 Read/Write DMA Channel 4-7 Software Request Register
-- Reset Software Request Interrupt

0E Write DMA Channel 0-3 Clear Mask Register
CE Write DMA Channel 4-7 Clear Mask Register
0F Read/Write DMA Channel 0-3 Mask Register
CF Read/Write DMA Channel 4-7 Mask Register
0A Write DMA Channel 0-3 Set-Reset Mask Register
CA Write DMA Channel 4-7 Set-Reset Mask Register

18 Write DMA Channel 0-3 Bus Size Register
D8 Write DMA Channel 4-7 Bus Size Register

19 Read/Write DMA Channel 0-3 Chaining Register
D9 Read/Write DMA Channel 4-7 Chaining Register

00 Read/Write DMA Channel 0 Target Address, A0-A15
87 Read/Write DMA Channel 0 Target Address, A16-A23
10 Read/Write DMA Channel 0 Target Address, A24-A31
01 Read/Write DMA Channel 0 Byte Count, B0-B15
11 Read/Write DMA Channel 0 Byte Count, B16-B23
90 Read/Write DMA Channel 0 Requester Address, A0-A15
91 Read/Write DMA Channel 0 Requester Address, A16-A31

02 Read/Write DMA Channel 1 Target Address, A0-A15
83 Read/Write DMA Channel 1 Target Address, A16-A23
12 Read/Write DMA Channel 1 Target Address, A24-A31
03 Read/Write DMA Channel 1 Byte Count, B0-B15
13 Read/Write DMA Channel 1 Byte Count, B16-B23
92 Read/Write DMA Channel 1 Requester Address, A0-A15
93 Read/Write DMA Channel 1 Requester Address, A16-A31
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APPENDIX B—Ports Listed by Function (Continued) 


Port Address (HEX) Description
DMA CONTROLLER
04 Read/Write DMA Channel 2 Target Address, A0-A15
81 Read/Write DMA Channel 2 Target Address, A16-A23
14 Read/Write DMA Channel 2 Target Address, A24-A31
05 Read/Write DMA Channel 2 Byte Count, B0-B15
15 Read/Write DMA Channel 2 Byte Count, B16-B23
94 Read/Write DMA Channel 2 Requester Address, A0-A15
95 Read/Write DMA Channel 2 Requester Address, A16-A31

06 Read/Write DMA Channel 3 Target Address, A0-A15
82 Read/Write DMA Channel 3 Target Address, A16-A23
16 Read/Write DMA Channel 3 Target Address, A24-A31
07 Read/Write DMA Channel 3 Byte Count, B0-B15
17 Read/Write DMA Channel 3 Byte Count, B16-B23
96 Read/Write DMA Channel 3 Requester Address, A0-A15
97 Read/Write DMA Channel 3 Requester Address, A16-A31

C0 Read/Write DMA Channel 4 Target Address, A0-A15
8F Read/Write DMA Channel 4 Target Address, A16-A23
D0 Read/Write DMA Channel 4 Target Address, A24-A31
C1 Read/Write DMA Channel 4 Byte Count, B0-B15
D1 Read/Write DMA Channel 4 Byte Count, B16-B23
98 Read/Write DMA Channel 4 Requester Address, A0-A15
99 Read/Write DMA Channel 4 Requester Address, A16-A31

C2 Read/Write DMA Channel 5 Target Address, A0-A15
8B Read/Write DMA Channel 5 Target Address, A16-A23
D2 Read/Write DMA Channel 5 Target Address, A24-A31
C3 Read/Write DMA Channel 5 Byte Count, B0-B15
D3 Read/Write DMA Channel 5 Byte Count, B16-B23
9A Read/Write DMA Channel 5 Requester Address, A0-A15
9B Read/Write DMA Channel 5 Requester Address, A16-A31

C4 Read/Write DMA Channel 6 Target Address, A0-A15
89 Read/Write DMA Channel 6 Target Address, A16-A23
D4 Read/Write DMA Channel 6 Target Address, A24-A31
C5 Read/Write DMA Channel 6 Byte Count, B0-B15
D5 Read/Write DMA Channel 6 Byte Count, B16-B23
9C Read/Write DMA Channel 6 Requester Address, A0-A15
9D Read/Write DMA Channel 6 Requester Address, A16-A31

C6 Read/Write DMA Channel 7 Target Address, A0-A15
8A Read/Write DMA Channel 7 Target Address, A16-A23
D6 Read/Write DMA Channel 7 Target Address, A24-A31
C7 Read/Write DMA Channel 7 Byte Count, B0-B15
D7 Read/Write DMA Channel 7 Byte Count, B16-B23
9E Read/Write DMA Channel 7 Requester Address, A0-A15
9F Read/Write DMA Channel 7 Requester Address, A16-A31
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APPENDIX B—Ports Listed by Function (Continued) 


Port Address (HEX) Description
INTERRUPT CONTROLLER
20 Write Bank B ICW1, OCW2, or OCW3
   Read Bank B Poll, Interrupt Request or In-Service
   Status Register
21 Write Bank B ICW2, ICW3, ICW4 or OCW1
   Read Bank B Interrupt Mask Register
22 Read Bank B ICW2
28 Read/Write IRQ8 Vector Register
29 Read/Write IRQ9 Vector Register
2A Reserved
2B Read/Write IRQ11 Vector Register
2C Read/Write IRQ12 Vector Register
2D Read/Write IRQ13 Vector Register
2E Read/Write IRQ14 Vector Register
2F Read/Write IRQ15 Vector Register
A0 Write Bank C ICW1, OCW2 or OCW3
   Read Bank C Poll, Interrupt Request or In-Service
   Status Register
A1 Write Bank C ICW2, ICW3, ICW4 or OCW1
   Read Bank C Interrupt Mask Register
A2 Read Bank C ICW2
A8 Read/Write IRQ16 Vector Register
A9 Read/Write IRQ17 Vector Register
AA Read/Write IRQ18 Vector Register
AB Read/Write IRQ19 Vector Register
AC Read/Write IRQ20 Vector Register
AD Read/Write IRQ21 Vector Register
AE Read/Write IRQ22 Vector Register
AF Read/Write IRQ23 Vector Register
30 Write Bank A ICW1, OCW2 or OCW3
   Read Bank A Poll, Interrupt Request or In-Service
   Status Register
31 Write Bank A ICW2, ICW3, ICW4 or OCW1
   Read Bank A Interrupt Mask Register
32 Read Bank A ICW2
38 Read/Write IRQ0 Vector Register
39 Read/Write IRQ1 Vector Register
3A Read/Write IRQ1.5 Vector Register
3B Read/Write IRQ3 Vector Register
3C Read/Write IRQ4 Vector Register
3D Reserved
3E Reserved
3F Read/Write IRQ7 Vector Register
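
The vector register ports listed above sit at a fixed offset within each bank, which can be sketched as a simple computation. The helper below is hypothetical and only restates the listing: Bank A starts at 38H, Bank B at 28H, Bank C at A8H, and ports 2A, 3D and 3E are reserved. Note that the slot for request 2 (port 3AH) is listed as the IRQ1.5 vector register.

```python
# Base port of the vector registers for each bank of eight requests.
BANK_BASE = {0: 0x38, 1: 0x28, 2: 0xA8}   # Bank A, Bank B, Bank C
RESERVED_PORTS = {0x2A, 0x3D, 0x3E}       # listed as Reserved


def vector_port(irq):
    """Return the vector register port for request 0-23, or None if reserved.

    Request 2 maps to port 3AH, which the listing names IRQ1.5.
    """
    if not 0 <= irq <= 23:
        raise ValueError("request number must be 0-23")
    port = BANK_BASE[irq // 8] + irq % 8
    return None if port in RESERVED_PORTS else port
```

Under these assumptions `vector_port(8)` gives 28H and `vector_port(10)` gives `None`, since port 2AH is reserved.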
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APPENDIX B—Ports Listed by Function (Continued) 
Port Address (HEX) Description

PROGRAMMABLE INTERVAL TIMER
40 Read/Write Counter 0 Register
41 Read/Write Counter 1 Register
42 Read/Write Counter 2 Register
43 Write Control Word Register I - Counter 0, 1, 2
44 Read/Write Counter 3 Register
47 Write Word Register II - Counter 3

CPU RESET
64 Write CPU Reset Register (Data-1111XXX0H)

WAIT STATE GENERATOR
72 Read/Write Wait State Register 0
73 Read/Write Wait State Register 1
74 Read/Write Wait State Register 2
75 Read/Write Refresh Wait State Register

DRAM REFRESH CONTROLLER
1C Read/Write Refresh Control Register

INTERNAL CONTROL AND DIAGNOSTIC PORTS
61 Write Internal Control Port
80 Read/Write Internal Diagnostic Port 0
88 Read/Write Internal Diagnostic Port 1

RELOCATION REGISTER
7F Read/Write Relocation Register

INTEL RESERVED PORTS
2A Reserved
3D Reserved
3E Reserved
45 Reserved
46 Reserved
76 Reserved
77 Reserved
7D Reserved
7E Reserved
CC Reserved
CD Reserved
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APPENDIX C 
Pin Descriptions 


The 82380 provides all of the signals necessary to
interface it to an 80386 processor. It has separate
32-bit address and data buses. It also has a set of
control signals to support operation as a bus master
or a bus slave. Several special function signals exist
on the 82380 for interfacing the system support
peripherals to their respective system counterparts.
Following are the definitions of the individual pins of
the 82380. These brief descriptions are provided as
a reference. Each signal is further defined within the
sections which describe the associated 82380 function.


A2-A31 I/O ADDRESS BUS

This is the 32-bit address bus. The addresses are
doubleword memory and I/O addresses. These are
three-state signals which are active only during Master
mode. The address lines should be connected
directly to the 80386's local bus.
BE0# I/O BYTE-ENABLE 0

BE0# active indicates that data bits D0-D7 are being
accessed or are valid. It is connected directly to
the 80386's BE0#. The byte enable signals are active
outputs when the 82380 is in the Master mode.

BE1# I/O BYTE-ENABLE 1

BE1# active indicates that data bits D8-D15 are
being accessed or are valid. It is connected directly
to the 80386's BE1#. The byte enable signals are
active only when the 82380 is in the Master mode.

BE2# I/O BYTE-ENABLE 2

BE2# active indicates that data bits D16-D23 are
being accessed or are valid. It is connected directly
to the 80386's BE2#. The byte enable signals are
active only when the 82380 is in the Master mode.

BE3# I/O BYTE-ENABLE 3

BE3# active indicates that data bits D24-D31 are
being accessed or are valid. The byte enable signals
are active only when the 82380 is in the Master
mode. This pin should be connected directly to the
80386's BE3#. This pin is used for factory testing
and must be low during reset. The 80386 drives
BE3# low during reset.


D0-D31 I/O DATA BUS

This is the 32-bit data bus. These pins are active
outputs during interrupt acknowledges, during Slave
accesses, and when the 82380 is in the Master
mode.

CLK2 I PROCESSOR CLOCK

This pin must be connected to CLK2. The 82380
monitors the phase of this clock in order to remain
synchronized with the 80386. This clock drives all of
the internal synchronous circuitry.

D/C# I/O DATA/CONTROL

D/C# is used to distinguish between 80386 control
cycles and DMA or 80386 data access cycles. It is
active as an output only in the Master mode.

W/R# I/O WRITE/READ

W/R# is used to distinguish between write and read
cycles. It is active as an output only in the Master
mode.

M/IO# I/O MEMORY/IO

M/IO# is used to distinguish between memory and
I/O accesses. It is active as an output only in the
Master mode.
ADS# I/O ADDRESS STATUS

This signal indicates presence of a valid address on
the address bus. It is active as an output only in the
Master mode. ADS# is active during the first T-state
where addresses and control signals are valid.

NA# I NEXT ADDRESS

Asserted by a peripheral or memory to begin a pipelined
address cycle. This pin is monitored only while
the 82380 is in the Master mode. In the Slave mode,
pipelining is determined by the current and past
status of the ADS# and READY# signals.
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HOLD O HOLD REQUEST 

This is an active-high signal to the 80386 to request 
control of the system bus. When control is granted, 
the 80386 activates the hold acknowledge signal 
(HLDA). 
HLDA I HOLD ACKNOWLEDGE

This input signal tells the DMA controller that the
80386 has relinquished control of the system bus to
the DMA controller.

DREQ (0-3, 5-7) I DMA REQUEST

The DMA Request inputs monitor requests from peripherals
requiring DMA service. Each of the eight
DMA channels has one DREQ input. These active-high
inputs are internally synchronized and prioritized.
Upon reset, channel 0 has the highest priority
and channel 7 the lowest.
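
The reset-default ordering described above (channel 0 highest, channel 7 lowest) behaves like a fixed-priority arbiter. The sketch below illustrates only that default ordering; the function name and the bit-mask representation of pending requests (bit n set means DREQn is pending) are assumptions made for the example.

```python
def highest_priority_request(dreq_mask):
    """Return the pending channel with the highest reset-default priority.

    dreq_mask: 8-bit integer, bit n set when channel n has a pending request.
    Returns None when no request is pending.
    """
    for channel in range(8):          # channel 0 is checked first
        if dreq_mask & (1 << channel):
            return channel
    return None
```

With channels 5 and 7 pending (`0b10100000`), the sketch selects channel 5.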


DREQ4/IRQ9# I DMA/INTERRUPT REQUEST

This is the DMA request input for channel 4. It is also
connected to the interrupt controller via interrupt
request 9. This internal connection is available for
DMA channel 4 only. The interrupt input is active low
and can be programmed as either edge or level triggered.
Either function can be masked by the appropriate
mask register. Priorities of the DMA channel
and the interrupt request are not related but follow
the rules of the individual controllers.

Note that this pin has a weak internal pull-up. This
causes the interrupt request to be inactive, but the
DMA request will be active if there is no external
connection made. Most applications will require that
either one or the other of these functions be used,
but not both. For this reason, it is advised that DMA
channel 4 be used for transfers where a software
request is more appropriate (such as memory-to-memory
transfers). In such an application, DREQ4
can be masked by software, freeing IRQ9# for other
purposes.

EOP# I/O END OF PROCESS

As an output, this signal indicates that the current
Requester access is the last access of the currently
operating DMA channel. It is activated when Terminal
Count is reached. As an input, it signals the DMA
channel to terminate the current buffer and proceed
to the next buffer, if one is available. This signal may
be programmed as an asynchronous or synchronous
input.

EOP# must be connected to a pull-up resistor. This
will prevent erroneous external requests for termination
of a DMA process.

EDACK (0-2) O ENCODED DMA ACKNOWLEDGE

These signals contain the encoded acknowledgement
of a request for DMA service by a peripheral.
The binary code formed by the three signals indicates
which channel is active. Channel 4 does not
have a DMA acknowledge. The inactive state is indicated
by the code 100. During a Requester access,
EDACK presents the code for the active DMA channel.
During a Target access, EDACK presents the
inactive code 100.
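
External logic that consumes EDACK might decode it as sketched below. Two points are assumptions rather than statements from the text: that EDACK2 is the most significant bit, and that the binary code equals the channel number. The code 100 is treated as inactive, which is consistent with channel 4 having no DMA acknowledge.

```python
INACTIVE_CODE = 0b100  # inactive state, per the pin description


def decode_edack(edack2, edack1, edack0):
    """Return the acknowledged DMA channel, or None when EDACK is inactive.

    Assumption: EDACK2 is the MSB and the code is the channel number.
    """
    code = (edack2 << 2) | (edack1 << 1) | edack0
    return None if code == INACTIVE_CODE else code
```

For example, `decode_edack(0, 1, 1)` reports channel 3, and `decode_edack(1, 0, 0)` reports no active channel.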


IRQ (11-23)# I INTERRUPT REQUEST

These are active low interrupt request inputs. The
inputs can be programmed to be edge or level sensitive.
Interrupt priorities are programmable as either
fixed or rotating. These inputs have weak internal
pull-up resistors. Unused interrupt request inputs
should be tied inactive externally.

INT O INTERRUPT OUT

INT signals the 80386 that an interrupt request is
pending.

CLKIN I TIMER CLOCK INPUT

This is the clock input signal to all of the 82380's
programmable timers. It is independent of the system
clock input (CLK2).
TOUT1/REF# O TIMER 1 OUTPUT/REFRESH

This pin is software programmable as either the direct
output of Timer 1, or as the indicator of a refresh
cycle in progress. As REF#, this signal is active during
the memory read cycle which occurs during refresh.

TOUT2#/IRQ3# I/O TIMER 2 OUTPUT/INTERRUPT REQUEST 3

This is the inverted output of Timer 2. It is also connected
directly to interrupt request 3. External hardware
can use IRQ3# if Timer 2 is programmed as
OUT = 0 (TOUT2# = 1).

TOUT3# O TIMER 3 OUTPUT

This is the inverted output of Timer 3.




READY# I READY INPUT

This active-low input indicates to the 82380 that the
current bus cycle is complete. READY# is sampled by
the 82380 both while it is in the Master mode, and
while it is in the Slave mode.

WSC (0-1) I WAIT STATE CONTROL

WSC0 and WSC1 are inputs used by the Wait-State
Generator to determine the number of wait states
required by the currently accessed memory or I/O.
The binary code on these pins, combined with the
M/IO# signal, selects an internal register in which a
wait-state count is stored. The combination WSC =
11 disables the wait-state generator.
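
The register selection described above can be sketched as follows. This is an illustration under stated assumptions, not the documented mapping: it assumes WSC codes 00, 01 and 10 select Wait State Registers 0, 1 and 2 (ports 72H, 73H and 74H in the port listings), and that M/IO# high selects the memory wait-state count while M/IO# low selects the I/O count. The function name and return format are hypothetical.

```python
WAIT_STATE_REGISTER_BASE = 0x72  # Wait State Register 0 in the port listings


def select_wait_state_register(wsc1, wsc0, m_io):
    """Return (register port, 'memory' or 'io'), or None when disabled.

    Assumptions: WSC code n selects Wait State Register n; M/IO# = 1
    selects the memory count, M/IO# = 0 the I/O count.
    """
    code = (wsc1 << 1) | wsc0
    if code == 0b11:
        return None  # WSC = 11 disables the wait-state generator
    field = "memory" if m_io else "io"
    return WAIT_STATE_REGISTER_BASE + code, field
```

Under these assumptions, WSC = 10 during an I/O access selects the I/O count held in the register at port 74H.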
READYO# O READY OUTPUT

This is the synchronized output of the wait-state
generator. It is also valid during 80386 accesses to
the 82380 in the Slave Mode when the 82380 requires
wait states. READYO# should feed directly
the 80386's READY# input.


RESET I RESET

This synchronous input serves to initialize the state
of the 82380 and provides the basis for the CPURST
output. RESET must be held active for at least 15
CLK2 cycles in order to guarantee the state of the
82380. After Reset, the 82380 is in the Slave mode
with all outputs except timers and interrupts in their
inactive states. The state of the timers and interrupt
controller must be initialized through software. This
input must be active for the entire time required by
the 80386 to guarantee proper reset.

CPURST O CPU RESET

CPURST provides a synchronized reset signal for
the CPU. It is activated in the event of a software
reset command, an 80386 shut-down detect, or a
hardware reset via the RESET pin. The 82380 holds
CPURST active for 62 clocks in response to either a
software reset command or a shut-down detection.
Otherwise CPURST reflects the RESET input.
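
Two details from the reset description lend themselves to a short sketch: the software reset data pattern 1111XXX0 given for the CPU Reset Register in the port listings, and the 15-cycle minimum for RESET. The helper names below are hypothetical, and the CLK2 period passed to the second function is whatever the system actually uses.

```python
def is_cpu_reset_command(data):
    """True if an 8-bit value matches the pattern 1111XXX0.

    Bits 7-4 must be 1, bit 0 must be 0, bits 3-1 are don't-care.
    """
    return (data & 0xF1) == 0xF0


def min_reset_time_ns(clk2_period_ns, cycles=15):
    """Minimum RESET active time: at least 15 CLK2 cycles."""
    return cycles * clk2_period_ns
```

For example, both F0H and FEH match the reset pattern while F1H does not, and with a 25 ns CLK2 period RESET must be held for at least 375 ns.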


Vcc +5V input power
Vss Ground


Table C-1. Wait-State Select Inputs 


Port : 
Address D4 
~ Memory 0 


Memory 1 
. Memory 2 


Wait-State Registers 


Select Inputs 
WSC1 


D3 — DOd- WSCO 


DISABLED 
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APPENDIX D 
82380 System Notes 


82380 TIMER UNIT SYSTEM NOTES 
The 82380 DMA controller with Integrated System
Peripherals is functionally inconsistent with the data
sheet. This document explains the behavior of the
82380 Timer Unit and outlines subsequent limitations
of the timer unit. This document also provides
recommended workarounds.


1.0 WRITE CYCLES TO THE 82380
TIMER UNIT


This erratum applies only to SLAVE WRITE cycles to
the 82380 timer unit. During these cycles, the data
being written into the 82380 timer unit may be corrupted
if CLKIN is not inhibited during a certain "window"
of the write cycle.


1.1 Description 


Please refer to Figure 1. 


During write cycles to the 82380 timer unit, the 
82380 translates the 386DX interface signals such 
as ADS#, W/R#, M/IO#, and D/C# into several 
internal signals that control the operation of the in- 
ternal sub-blocks (e.g., Timer Unit). 


The 82380 timer unit is controlled by such internal 
signals. These internal signals are generated and 
sampled with respect to two separate clock signals: 
CLK2 (the system clock) and CLKIN (the 82380 tim- 
er unit clock). 


Since the CLKIN and CLK2 clock signals are used 
internally to generate control signals for the inter- 
face to the timer unit, some timing parameters must 
be met in order for the interface logic to function 
properly. 


Those timing parameters are met by inhibiting the
CLKIN signal for a specific window during Write Cycles
to the 82380 Timer Unit.


The CLKIN signal must be inhibited using external 
logic, as the GATE function of the 82380 timer unit is 
not guaranteed to totally inhibit CLKIN. 


1.2 Consequences 


This CLKIN inhibit circuitry guarantees proper write 
cycles to the 82380 timer unit. 


Without this solution, write cycles to the 82380 timer 
unit could place corrupted data into the timer unit 
registers. This, in turn, could yield inaccurate results 
and improper timer operation. 


The proposed solution would involve a hardware 
modification for existing systems. 


1.3 Solution 


A timing waveform (Figure 2) shows the specific win- 
dow during which CLKIN must be inhibited. Please 
note that CLKIN must only be inhibited during the 
window shown in Figure 2. This window is defined by 
two AC timing parameters:


tg = 9ns 
= 28ns 


The proposed solution provides a certain amount of
system “guardband” to make sure that this window
is avoided.


PAL equations for a suggested workaround are also
included. Please refer to the comments in the PAL
codes for stated assumptions of this particular
workaround. A state diagram (Figure 3) is provided to
help clarify how this PAL is designed.


Figure 4 shows how this PAL would fit into a system
workaround. In order to show the effect of this
workaround on the CLKIN signal, Figure 5 shows how
CLKIN is inhibited. Note that you must still meet the
CLKIN AC timing parameters (e.g., t47 (min), t48
(min)) in order for the timer unit to function properly.


Please note that this workaround has not been tested.
It is provided as a suggested solution. Actual
solutions will vary from system to system.


1.4 Long Term Plans 


Intel has no plans to fix this behavior in the 82380
timer unit.




module Timer_82380_Fix
flag '-r2','-q2','-f1','-t4','-w1,3,6,5,4,16,7,12,17,18,15,14'
title '82380 Timer Unit CLKIN
INHIBIT signal PAL Solution '
Timer_Unit_Fix device 'P16R6';

"This PAL inhibits the CLKIN signal (that comes from an oscillator)
"during Slave Writes to the 82380 Timer unit.
"
"ASSUMPTION: This PAL assumes that an external system address
"            decoder provides a signal to indicate that an 82380
"            Timer Unit access is taking place. This input
"            signal is called TMR in this PAL. This PAL also
"            assumes that this TMR signal occurs during a
"            specific T-State. Please see Figure 3 of this
"            document to see when this signal is expected to
"            be active by this PAL.
"
"NOTE:       This PAL does not support pipelined 82380 SLAVE
"            cycles.
"
"(c) Intel Corporation 1989. This PAL is provided as a proposed
"method of solving a certain 82380 Timer Unit problem. This PAL
"has not been tested or validated. Please validate this solution
"for your system and application.

"Input Pins"

CLK2       pin ;   "System Clock
RESET      pin ;   "Microprocessor RESET signal
TMR        pin ;   "Input from Address Decoder, indicating
                   "an access to the timer unit of the
                   "82380.
!RDY       pin ;   "End of Cycle indicator
!ADS       pin ;   "Address and control strobe
CLK        pin ;   "PHI2 clock
W_R        pin ;   "Write/Read Signal"
nc1        pin ;   "No Connect 0"
nc3        pin ;   "No Connect 1"
GNDa       pin ;   "Tied to ground, documentation only
GNDb       pin ;   "Output enable, documentation only
CLKIN_IN   pin ;   "Input-CLKIN directly from oscillator

"Output Pins"

Q_0        pin ;   "Internal signal only, fed back to
                   "PAL logic"
CLKIN_OUT  pin ;   "CLKIN signal fed to 82380 Timer Unit
INHIBIT    pin ;   "CLKIN Inhibit signal
S0         pin ;   "Unused State Indicator Pin
S1         pin ;   "Unused State Indicator Pin

"Declarations"

Valid_ADS = ADS & CLK ;   "ADS# sampled in PHI1 of 386DX T-State
Valid_RDY = RDY & CLK ;   "RDY# sampled in PHI1 of 386DX T-State
Timer_Acc = TMR & CLK ;   "Timer Unit Access, as provided by
                          "external Address Decoder "

State_Diagram [INHIBIT, S1, S0]

State 000:  if RESET then 000
            else if Valid_ADS & W_R then 001
            else 000;

State 001:  if RESET then 000
            else if Timer_Acc then 010
            else if !Timer_Acc then 000
            else 001;

State 010:  if RESET then 000
            else if CLK then 110
            else 010;

State 110:  if RESET then 000
            else if CLK then 111
            else 110;

State 111:  if RESET then 000
            else if CLK then 011
            else 111;

State 011:  if RESET then 000
            else if Valid_RDY then 000
            else 011;

State 100:  if RESET then 000
            else 000;

State 101:  if RESET then 000
            else 000;

EQUATIONS

Q_0 := CLKIN_IN ;   "Latched incoming clock. This signal is used
                    "internally to feed into the MUX-ing logic"

CLKIN_OUT := (INHIBIT & CLKIN_OUT & !RESET)
           + (!INHIBIT & Q_0 & !RESET) ;
                    "Equation for CLKIN_OUT. This
                    "feeds directly to the 82380 Timer Unit."

END
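
The state diagram in the PAL listing can be checked in software before committing to hardware. The following is a minimal behavioral sketch in Python, not a validated model of the device. It assumes that the ads and rdy arguments are True when ADS# and RDY# are asserted, and that the logic is evaluated once per step with CLK high. The names next_state, next_clkin_out and trace_timer_write are hypothetical helpers introduced only for this illustration.

```python
# Behavioral sketch of State_Diagram [INHIBIT, S1, S0] from the PAL listing.
# Assumption: ads and rdy are True when ADS# / RDY# are asserted.

IDLE = (0, 0, 0)


def next_state(state, reset, clk, ads, rdy, tmr, w_r):
    """Return the next (INHIBIT, S1, S0) tuple for one evaluation of the PAL."""
    valid_ads = ads and clk   # Valid_ADS = ADS & CLK
    valid_rdy = rdy and clk   # Valid_RDY = RDY & CLK
    timer_acc = tmr and clk   # Timer_Acc = TMR & CLK
    if reset:
        return IDLE
    if state == (0, 0, 0):
        return (0, 0, 1) if (valid_ads and w_r) else (0, 0, 0)
    if state == (0, 0, 1):
        return (0, 1, 0) if timer_acc else (0, 0, 0)
    if state == (0, 1, 0):
        return (1, 1, 0) if clk else (0, 1, 0)
    if state == (1, 1, 0):
        return (1, 1, 1) if clk else (1, 1, 0)
    if state == (1, 1, 1):
        return (0, 1, 1) if clk else (1, 1, 1)
    if state == (0, 1, 1):
        return (0, 0, 0) if valid_rdy else (0, 1, 1)
    return IDLE  # unused states 100 and 101 fall back to 000


def next_clkin_out(inhibit, clkin_out, q_0, reset):
    """CLKIN_OUT := (INHIBIT & CLKIN_OUT & !RESET) + (!INHIBIT & Q_0 & !RESET)."""
    return bool((inhibit and clkin_out and not reset) or
                (not inhibit and q_0 and not reset))


def trace_timer_write():
    """Walk one non-pipelined timer write with CLK high at every evaluation."""
    steps = [
        dict(ads=True, rdy=False, tmr=False, w_r=True),   # write cycle starts
        dict(ads=False, rdy=False, tmr=True, w_r=True),   # decoder flags timer
        dict(ads=False, rdy=False, tmr=True, w_r=True),   # INHIBIT goes high
        dict(ads=False, rdy=False, tmr=True, w_r=True),   # INHIBIT stays high
        dict(ads=False, rdy=False, tmr=True, w_r=True),   # INHIBIT released
        dict(ads=False, rdy=True, tmr=True, w_r=True),    # RDY# ends the cycle
    ]
    state, trace = IDLE, []
    for step in steps:
        state = next_state(state, reset=False, clk=True, **step)
        trace.append(state)
    return trace
```

In this sketch INHIBIT is high only in states 110 and 111. While INHIBIT is high, next_clkin_out holds the previous CLKIN_OUT level instead of following the latched oscillator input Q_0.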


Page 1 


ABEL(tm) 3.10 - Document Generator 30-June 89 03:17 
PM 
82380 Timer Unit CLKIN 
INHIBIT signal PAL Solution 
Equations for Module Timer_82380_Fix 


Device Timer_Unit_Fix 
Reduced Equations:

    !INHIBIT := (!CLK & !INHIBIT # CLK & S0 # RESET # !S1);

    !S1 := (RESET
          # INHIBIT & !S1
          # CLK & !INHIBIT & !~RDY & S0 & S1
          # !CLK & !S1
          # !S1 & !TMR
          # !S0 & !S1);

    !S0 := (RESET
          # INHIBIT & !S1
          # CLK & !INHIBIT & !~RDY & S1
          # !CLK & !S0
          # !INHIBIT & !S0 & S1
          # S0 & !S1
          # !S1 & !W_R
          # ~ADS & !S1);

    !Q_0 := (!CLKIN_IN);

    !CLKIN_OUT := (RESET # !CLKIN_OUT & INHIBIT # !INHIBIT &


Chip diagram for Module Timer_82380_Fix 


Device Timer_Unit_Fix 


290128-B7 


end of module Timer_82380_Fix 


[Figure 1 labels: CS, WR, RD and other internal signals; Internal Data Bus.]


Figure 1. Translation of 386DX Signals to Internal 82380 Timer Unit Signals
Figure 2. 82380 Timer Unit Write Cycle
[Figure 3 labels: state vector [INHIBIT, S1, S0]; transitions qualified by RDY.CLK and TMR.CLK; output (INHIBIT).]


Figure 3. State Diagram for Inhibit Signal


[Figure 4 labels: CLK2/CLK circuit (CLK2, CLK); timer PAL; 82380 (CLK2, CLKIN); address decoder (TMR); CLKIN oscillator.]


NOTE:
This solution does not support pipelined 82380 SLAVE Cycles.


Figure 4. System with 82380 Timer Unit “Inhibit” Circuitry
Figure 5(a). Inhibited CLKIN in an 82380 Timer Unit and CLKIN Minimum HIGH Time
Figure 5(b). Inhibited CLKIN in an 82380 Timer Unit and CLKIN Minimum LOW Time
82380 DATA SHEET REVISION HISTORY


Changes in this revision:

Figure 4-1: Added details about IRQ3# and IRQ2#/IRQ1.5#.

Section 5.2.1: Added note referring reader to Appendix D (System Notes).

Table 13-2: Changed Vic MIN to Vcc - 0.8V.

Figure 13-1: Changed signal names to reflect accurate drive levels and measurement points for those signals.

Appendix D: Added this appendix to explain the restrictions on the CLKIN signal of the 82380 Timer Unit.
376™ HIGH PERFORMANCE
32-BIT EMBEDDED PROCESSOR


■ Full 32-Bit Internal Architecture
  — 8-, 16-, 32-Bit Data Types
  — 8 General Purpose 32-Bit Registers
  — Extensive 32-Bit Instruction Set
■ High Performance 16-Bit Data Bus
  — 16 or 20 MHz CPU Clock
  — Two-Clock Bus Cycles
  — 16 Mbytes/Sec Bus Bandwidth
■ 16 Mbyte Physical Memory Size
■ High Speed Numerics Support with the 80387SX
■ Low System Cost with the 82370 Integrated System Peripheral
■ On-Chip Debugging Support Including Break Point Registers
■ Complete Intel Development Support
  — C, PL/M, Assembler
  — ICE™-376, In-Circuit Emulator
  — iRMK™ Real Time Kernel
  — iSDM™ Debug Monitor
  — DOS Based Debug
■ Extensive Third-Party Support:
  — Languages: C, Pascal, FORTRAN, BASIC and ADA*
  — Hosts: VMS*, UNIX*, MS-DOS*, and Others
  — Real-Time Kernels
■ High Speed CHMOS IV Technology
■ Available in 100 Pin Plastic Quad Flat-Pack Package and 88-Pin Pin Grid Array
  (See Packaging Outlines and Dimensions #231369)


INTRODUCTION 


The 376 32-bit embedded processor is designed for high performance embedded systems. It provides the 
performance benefits of a highly pipelined 32-bit internal architecture with the low system cost associated with 
16-bit hardware systems. The 80376 processor is based on the 80386 and offers a high degree of compatibil- 
ity with the 80386. All 80386 32-bit programs not dependent on paging can be executed on the 80376 and all 
80376 programs can be executed on the 80386. All 32-bit 80386 language translators can be used for 
software development. With proper support software, any 80386-based computer can be used to develop and 
test 80376 programs. In addition, any 80386-based PC-AT* compatible computer can be used for hardware 
prototyping for designs based on the 80376 and its companion product, the 82370.


[Block diagram labels: Execution Unit (32-Bit Registers, 64-Bit Barrel Shifter, Multiply/Divide); 32-Bit Data Path; Bus Interface Unit; Prefetch Unit (Prefetch Queue); Instruction Queue; MMU (Protection, Segment Registers, Segment Translator).]


80376 Microarchitecture


Intel, iRMK, ICE, 376, 386, Intel386, iSDM, Intel376 are trademarks of Intel Corp.
*UNIX is a registered trademark of AT&T. 

ADA is a registered trademark of the U.S. Government, Ada Joint Program Office. 
PC-AT is a registered trademark of IBM Corporation. 

VMS is a trademark of Digital Equipment Corporation. 

MS-DOS is a trademark of MicroSoft Corporation. 
1.0 PIN DESCRIPTION


TOP VIEW 


 240182-52 


Figure 1.1. 80376 100-Pin Quad Flat-Pack Pin Out (Top View) 


Table 1.1. 100-Pin Plastic Quad Flat-Pack Pin Assignments 
[Figure 1.2 pin-out diagram not reproduced: Top View (Component Side) and Bottom View (Pin Side).]


Figure 1.2. 80376 88-Pin Grid Array Pin Out


Table 1.2. 88-Pin Grid Array Pin Assignments
The following table lists a brief description of each pin on the 80376. The following definitions are used in
these descriptions:
#     The named signal is active LOW.
I     Input signal.
O     Output signal.
I/O   Input and Output signal.
—     No electrical connection.

Name and Function 


CLK2 provides the fundamental timing for the 80376. For additional 
information see Clock in Section 4.1. 


RESET suspends any operation in progress and places the 80376 in a
known reset state. See Interrupt Signals in Section 4.1 for additional
information.


DATA BUS inputs data during memory, I/O and interrupt acknowledge
read cycles and outputs data during memory and I/O write cycles. See
Data Bus in Section 4.1 for additional information.


ADDRESS BUS outputs physical memory or port I/O addresses. See 
Address Bus in Section 4.1 for additional information.


WRITE/READ is a bus cycle definition pin that distinguishes write 
cycles from read cycles. See Bus Cycle Definition Signals in Section 
4.1 for additional information. 


DATA/CONTROL is a bus cycle definition pin that distinguishes data
cycles, either memory or I/O, from control cycles which are: interrupt
acknowledge, halt, and instruction fetching. See Bus Cycle Definition
Signals in Section 4.1 for additional information.


MEMORY/IO is a bus cycle definition pin that distinguishes memory
cycles from input/output cycles. See Bus Cycle Definition Signals in
Section 4.1 for additional information.


BUS LOCK is a bus cycle definition pin that indicates that other 
system bus masters are denied access to the system bus while it is 
active. See Bus Cycle Definition Signals in Section 4.1 for additional 
information. 


ADDRESS STATUS indicates that a valid bus cycle definition and
address (W/R, D/C, M/IO, BHE, BLE and A23-A1) are being driven at
the 80376 pins. See Bus Control Signals in Section 4.1 for additional
information.


NEXT ADDRESS is used to request address pipelining. See Bus
Control Signals in Section 4.1 for additional information.


BUS READY terminates the bus cycle. See Bus Control Signals in
Section 4.1 for additional information.


BYTE ENABLES indicate which data bytes of the data bus take part in 
a bus cycle. See Address Bus in Section 4.1 for additional 
information. 


BUS HOLD REQUEST input allows another bus master to request 
control of the local bus. See Bus Arbitration Signals in Section 4.1 
for additional information. 
BUS HOLD ACKNOWLEDGE output indicates that the 80376 has
surrendered control of its local bus to another bus master. See Bus 
Arbitration Signals in Section 4.1 for additional information. 


INTERRUPT REQUEST is a maskable input that signals the 80376 to 
suspend execution of the current program and execute an interrupt 
acknowledge function. See Interrupt Signals in Section 4.1 for 
additional information. 


NON-MASKABLE INTERRUPT REQUEST is a non-maskable input
that signals the 80376 to suspend execution of the current program
and execute an interrupt acknowledge function. See Interrupt Signals
in Section 4.1 for additional information.


BUSY signals a busy condition from a processor extension. See 
Coprocessor Interface Signals in Section 4.1 for additional 
information. 


ERROR signals an error condition from a processor extension. See 
Coprocessor Interface Signals in Section 4.1 for additional 
information. 


PROCESSOR EXTENSION REQUEST indicates that the processor
extension has data to be transferred by the 80376. See Coprocessor
Interface Signals in Section 4.1 for additional information.


FLOAT, when active, forces all bidirectional and output signals, 
including HLDA, to the float condition. FLOAT is not available on the 
PGA package. See Float for additional information. 


NO CONNECT should always remain unconnected. Connection of a
N/C pin may cause the processor to malfunction or be incompatible
with future steppings of the 80376.


SYSTEM POWER provides the +5V nominal D.C. supply input.
SYSTEM GROUND provides 0V connection from which all inputs and
outputs are measured.


2.0 ARCHITECTURE OVERVIEW


The 80376 supports the protection mechanisms
needed by sophisticated multitasking embedded
systems and real-time operating systems. The use
of these protection mechanisms is completely optional.
For embedded applications not needing protection,
the 80376 can easily be configured to provide
a 16 Mbyte physical address space.


Instruction pipelining, high bus bandwidth, and a
very high performance ALU ensure short average
instruction execution times and high system
throughput. The 80376 is capable of execution at
sustained rates of 2.5-3.0 million instructions per
second.


The 80376 offers on-chip testability and debugging
features. Four break point registers allow conditional
or unconditional break point traps on code execution
or data accesses for powerful debugging of even
ROM based systems. Other testability features include
self-test and tri-stating of output buffers during
RESET.


The Intel 80376 embedded processor consists of a
central processing unit, a memory management unit
and a bus interface. The central processing unit consists
of the execution unit and instruction unit. The
execution unit contains the eight 32-bit general registers,
which are used for both address calculation
and data operations, and a 64-bit barrel shifter used
to speed shift, rotate, multiply, and divide operations.
The instruction unit decodes the instruction opcodes
and stores them in the decoded instruction queue
for immediate use by the execution unit.


The Memory Management Unit (MMU) consists of a
segmentation and protection unit. Segmentation allows
the managing of the logical address space by
providing an extra addressing component, one that
allows easy code and data relocatability, and efficient
sharing.


The protection unit provides four levels of protection
for isolating and protecting applications and the operating
system from each other. The hardware enforced
protection allows the design of systems with
a high degree of integrity and simplifies debugging.


Finally, to facilitate high performance system hard- 
ware designs, the 80376 bus interface offers ad- 
dress pipelining and direct Byte Enable signals for 
each byte of the data bus. 
2.1 Register Set


The 80376 has twenty-nine registers as shown in Figure 2.1. These registers are grouped into the following six 
categories: 


GENERAL PURPOSE REGISTERS (32 bits):
    EAX, EBX, ECX, EDX, ESI, EDI, EBP, ESP

SEGMENT REGISTERS (bits 15-0):
    CS, SS, DS, ES, FS, GS

FLAGS AND INSTRUCTION POINTER (bits 31-0):
    EFLAGS, EIP

CONTROL REGISTER (bits 31-0):
    CR0

Remaining bits: INTEL RESERVED, DO NOT USE.


Figure 2.1. 80376 Base Architecture Registers 
General Registers: The eight 32-bit general purpose
registers are used to contain arithmetic and
logical operands. Four of these (EAX, EBX, ECX and
EDX) can be used either in their entirety as 32-bit
registers, as 16-bit registers, or split into pairs of
separate 8-bit registers.


Segment Registers: Six 16-bit special purpose reg- 
isters select, at any given time, the segments of 
memory that are immediately addressable for code, 
stack, and data. 


Flags and Instruction Pointer Registers: These 
two 32-bit special purpose registers in Figure 2.1 
record or control certain aspects of the 80376 proc- 
essor state. The EFLAGS register includes status 
and control bits that are used to reflect the outcome
of many instructions and modify the semantics of 
some instructions. The Instruction Pointer, called 
EIP, is 32 bits wide. The Instruction Pointer controls 
instruction fetching and the processor automatically 
increments it after executing an instruction. 


Control Register: The 32-bit control register, CR0,
is used to control Coprocessor Emulation.


System Address Registers: These four special
registers reference the tables or segments supported
by the 80376/80386 protection model. These tables
or segments are:


GDTR (Global Descriptor Table Register),
IDTR (Interrupt Descriptor Table Register),
LDTR (Local Descriptor Table Register),
TR (Task State Segment Register).


Debug Registers: The six programmer accessible 
debug registers provide on-chip support for debug- 
ging. The use of the debug registers is described in 
Section 2.11 Debugging Support. 
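
The General Registers description above says that EAX, EBX, ECX and EDX can be used as 32-bit registers, as 16-bit registers, or as pairs of 8-bit registers. The sketch below shows that overlap for a single register value. It assumes the usual 80386 convention that the 16-bit register is the low half of the 32-bit register (for example AX within EAX) and that the 8-bit pair is the high and low byte of that half (AH and AL). The function name split_general_register is hypothetical.

```python
def split_general_register(value32):
    """Return the 16-bit and 8-bit views of a 32-bit general register value."""
    value32 &= 0xFFFFFFFF
    low16 = value32 & 0xFFFF       # assumed 16-bit view, e.g. AX within EAX
    high8 = (low16 >> 8) & 0xFF    # assumed upper 8-bit register, e.g. AH
    low8 = low16 & 0xFF            # assumed lower 8-bit register, e.g. AL
    return low16, high8, low8
```

For example, with EAX holding 12345678H, this sketch yields 5678H for the 16-bit view and 56H and 78H for the 8-bit pair.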


EFLAGS REGISTER 


The flag register is a 32-bit register named
EFLAGS. The defined bits and bit fields within 
EFLAGS, shown in Figure 2.2, control certain opera- 
tions and indicate the status of the 80376 processor. 
The function of the flag bits is given in Table 2.1. 


[Figure 2.2 labels. EFLAGS status flags: OVERFLOW, SIGN, ZERO, AUX CARRY, PARITY, CARRY. EFLAGS control flags: TRAP, INTERRUPT, DIRECTION, RESUME. EFLAGS special fields: I/O PRIVILEGE LEVEL, NESTED TASK. CR0: MONITOR COPROCESSOR, EMULATE COPROCESSOR, TASK SWITCHED. Remaining bits: INTEL RESERVED, DO NOT USE.]


Figure 2.2. Status and Control Register Bit Functions 
Table 2.1. Flag Definitions


Bit Position | Name | Function

0            | CF   | Carry Flag—Set on high-order bit carry or borrow; cleared otherwise.

2            | PF   | Parity Flag—Set if low-order 8 bits of result contain an even number
             |      | of 1-bits; cleared otherwise.

4            | AF   | Auxiliary Carry Flag—Set on carry from or borrow to the low order
             |      | four bits of AL; cleared otherwise.

6            | ZF   | Zero Flag—Set if result is zero; cleared otherwise.

7            | SF   | Sign Flag—Set equal to high-order bit of result (0 if positive, 1 if
             |      | negative).

8            | TF   | Single Step Flag—Once set, a single step interrupt occurs after the
             |      | next instruction executes. TF is cleared by the single step interrupt.

9            | IF   | Interrupt-Enable Flag—When set, external interrupts signaled on the
             |      | INTR pin will cause the CPU to transfer control to an interrupt vector
             |      | specified location.

10           | DF   | Direction Flag—Causes string instructions to auto-increment (default)
             |      | the appropriate index registers when cleared. Setting DF causes
             |      | auto-decrement.

11           | OF   | Overflow Flag—Set if the operation resulted in a carry/borrow into
             |      | the sign bit (high-order bit) of the result but did not result in a
             |      | carry/borrow out of the high-order bit or vice-versa.

12,13        | IOPL | I/O Privilege Level—Indicates the maximum CPL permitted to
             |      | execute I/O instructions without generating an exception 13 fault or
             |      | consulting the I/O permission bit map. It also indicates the maximum
             |      | CPL value allowing alteration of the IF bit.

14           | NT   | Nested Task—Indicates that the execution of the current task is
             |      | nested within another task (see Task Switching).

16           | RF   | Resume Flag—Used in conjunction with debug register breakpoints. It
             |      | is checked at instruction boundaries before breakpoint processing. If
             |      | set, any debug fault is ignored on the next instruction. It is reset at the
             |      | successful completion of any instruction except IRET, POPF, and
             |      | those instructions causing task switches.
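
The flag definitions in Table 2.1 can be exercised with a small decoder. The sketch below is illustrative only. It assumes the standard 80386 EFLAGS bit positions (CF 0, PF 2, AF 4, ZF 6, SF 7, TF 8, IF 9, DF 10, OF 11, IOPL 12-13, NT 14, RF 16). The names EFLAGS_BITS and decode_eflags are hypothetical.

```python
# Assumed single-bit flag positions, following the standard 80386 layout.
EFLAGS_BITS = {
    "CF": 0, "PF": 2, "AF": 4, "ZF": 6, "SF": 7, "TF": 8,
    "IF": 9, "DF": 10, "OF": 11, "NT": 14, "RF": 16,
}


def decode_eflags(value):
    """Return a dict of flag name -> bool, plus the 2-bit IOPL field as an int."""
    flags = {name: bool((value >> bit) & 1) for name, bit in EFLAGS_BITS.items()}
    flags["IOPL"] = (value >> 12) & 0b11   # bits 12 and 13
    return flags
```

For example, decoding 3241H with this sketch reports CF, ZF and IF set and an IOPL of 3.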


CONTROL REGISTER 


The 80376 has a 32-bit control register called CR0 that is used to control coprocessor emulation. This register
is shown in Figures 2.1 and 2.2. The defined CR0 bits are described in Table 2.2. Bits 0, 4 and 31 of CR0 have
fixed values in the 80376. These values cannot be changed. Programs that load CR0 should always load bits
0, 4 and 31 with values previously there to be compatible with the 80386.


Table 2.2. CR0 Definitions


Bit Position | Name | Function

1            | MP   | Monitor Coprocessor Extension—Allows WAIT instructions to cause
             |      | a processor extension not present exception (number 7).

2            | EM   | Emulate Processor Extension—When set, this bit causes a
             |      | processor extension not present exception (number 7) on ESC
             |      | instructions to allow processor extension emulation.

3            | TS   | Task Switched—When set, this bit indicates the next instruction using
             |      | a processor extension will cause exception 7, allowing software to test
             |      | whether the current processor extension context belongs to the
             |      | current task (see Task Switching).
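
The CONTROL REGISTER text asks programs that load CR0 to reload bits 0, 4 and 31 with the values already there. One way to express that rule is a masked merge, sketched below. The bit positions for MP, EM and TS follow the standard 80386 CR0 layout (bits 1, 2 and 3), and the names CR0_FIXED and compose_cr0 are hypothetical. Reading and writing the real register is done with processor instructions and is outside the scope of this sketch.

```python
CR0_MP = 1 << 1   # Monitor Coprocessor Extension
CR0_EM = 1 << 2   # Emulate Processor Extension
CR0_TS = 1 << 3   # Task Switched
CR0_FIXED = (1 << 0) | (1 << 4) | (1 << 31)   # bits with fixed values


def compose_cr0(current, wanted):
    """Build a CR0 value that keeps bits 0, 4 and 31 as they are in `current`."""
    preserved = current & CR0_FIXED
    requested = wanted & ~CR0_FIXED
    return (preserved | requested) & 0xFFFFFFFF
```

A caller would pass the value last read from CR0 as current and the desired MP, EM and TS settings as wanted.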
2.2 Instruction Set


The instruction set is divided into nine categories of
operations:


Data Transfer 

Arithmetic 

Shift/Rotate 

String Manipulation 

Bit Manipulation 

Control Transfer 

High Level Language Support 
Operating System Support 
Processor Control 


These 80376 processor instructions are listed in Ta- 
ble 8.1 80376 Instruction Set and Clock Count 
Summary. 


All 80376 processor instructions operate on either 0,
1, 2 or 3 operands; an operand resides in a register,
in the instruction itself, or in memory. Most zero operand
instructions (e.g. CLI, STI) take only one byte.
One operand instructions generally are two bytes
long. The average instruction is 3.2 bytes long.
Since the 80376 has a 16-byte prefetch instruction
queue, an average of 5 instructions can be prefetched.
The use of two operands permits the following
types of common instructions:


Register to Register 
Memory to Register 
Immediate to Register 
Memory to Memory 
Register to Memory 
Immediate to Memory


The operands can be 8, 16 or 32 bits long.
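
The prefetch figure quoted in this section follows directly from the two numbers given there: a 16-byte queue divided by an average instruction length of 3.2 bytes. The sketch below reproduces that arithmetic with exact fractions. The function name instructions_in_queue is hypothetical.

```python
from fractions import Fraction


def instructions_in_queue(queue_bytes=16, avg_instruction_bytes="3.2"):
    """Average number of instructions that fit in the prefetch queue."""
    return Fraction(queue_bytes) / Fraction(avg_instruction_bytes)
```

With the default arguments the result is exactly 5, matching the figure in the text.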
2.3 Memory Organization


Memory on the 80376 is divided into 8-bit quantities 
(bytes), 16-bit quantities (words), and 32-bit quanti- 
ties (dwords). Words are stored in two consecutive 
bytes in memory with the low-order byte at the low- 
est address. Dwords are stored in four consecutive 
bytes in memory with the low-order byte at the low- 
est address. The address of a word or Dword is the 
byte address of the low-order byte. For maximum
performance word and dword values should be at
even physical addresses.
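
The byte ordering described above (low-order byte at the lowest address) can be illustrated with Python's struct module, whose "<" prefix selects little-endian packing. This is a sketch of the memory layout only. The helper names store_word, store_dword and is_even_aligned are hypothetical.

```python
import struct


def store_word(value):
    """Return the two bytes of a 16-bit word in ascending address order."""
    return struct.pack("<H", value)


def store_dword(value):
    """Return the four bytes of a 32-bit dword in ascending address order."""
    return struct.pack("<I", value)


def is_even_aligned(address):
    """True if the address is even, as recommended for words and dwords."""
    return address % 2 == 0
```

For the word 1234H the byte at the lowest address is 34H, followed by 12H.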


In addition to these basic data types the 80376 proc- 
essor supports segments. Memory can be divided 
up into one or more variable length segments, which 
can be shared between programs. 


ADDRESS SPACES

The 80376 has three types of address spaces:
logical, linear, and physical. A logical address
(also known as a virtual address) consists of a selector
and an offset. A selector is the contents of a
segment register. An offset is formed by summing all
of the addressing components (BASE, INDEX, and
DISPLACEMENT), discussed in Section 2.4
Addressing Modes, into an effective address.


Every selector has a logical base address associated
with it that can be up to 32 bits in length. This 32-bit
logical base address is added to either a 32-bit
offset address or a 16-bit offset address (by using
the address length prefix) to form a final 32-bit
linear address. This final linear address is then truncated
so that only the lower 24 bits of this address
are used to address the 16 Mbytes physical memory
address space. The logical base address is stored
in one of two operating system tables (i.e. the Local
Descriptor Table or Global Descriptor Table).
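
The translation described above is an addition followed by a truncation to 24 bits. The sketch below shows those two steps. It assumes that the 32-bit sum wraps modulo 2^32, which the text does not state explicitly, and it leaves out the limit and access checks performed by the protection unit. The function names linear_address and physical_address are hypothetical.

```python
def linear_address(segment_base, offset):
    """Add the segment base and the offset to form a 32-bit linear address.

    Assumption: the sum wraps at 32 bits.
    """
    return (segment_base + offset) & 0xFFFFFFFF


def physical_address(linear):
    """Keep only the lower 24 bits, giving a 16 Mbyte physical address."""
    return linear & 0x00FFFFFF
```

For example, a base of 01200000H and an offset of 00345678H give the linear address 01545678H, which this sketch truncates to the physical address 545678H.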


Figure 2.3 shows the relationship between the various
address spaces.
[Figure 2.3 labels: top of physical memory at 0FFFFFFH; Descriptor Table (GDT or LDT); Limit/Access; physical address outputs BHE#, BLE#, A23-A1.]


Figure 2.3. Address Translation


SEGMENT REGISTER USAGE 


The main data structure used to organize memory is 
the segment. On the 80376, segments are variable 
sized blocks of linear addresses which have certain 
attributes associated with them. There are two main 
types of segments, code and data. The simplest use
of segments is to have one code segment and one
data segment, each 16 Mbytes in size and overlapping
the other. This allows code and data to be directly
addressed by the same offset.


In order to provide compact instruction encoding
and increase processor performance, instructions
do not need to explicitly specify which segment register
is used. The segment register is automatically
chosen according to the rules of Table 2.3 (Segment 
Register Selection Rules). In general, data refer- 
ences use the selector contained in the DS register, 
stack references use the SS register and instruction 
fetches use the CS register. The contents of the In- 
struction Pointer provide the offset. Special segment 
override prefixes allow the explicit use of a given 
segment register, and override the implicit rules list- 
ed in Table 2.3. The override prefixes also allow the 
use of the ES, FS and GS segment registers. 


There are no restrictions regarding the overlapping 
of the base addresses of any segments. Thus, all 6 
segments could have the base address set to zero. 
Further details of segmentation are discussed in 
Section 3.0 Architecture. 
Destination of STOS, MOVS, REP STOS,
REP MOVS Instructions
(DI is Base Register)

Other Data References,          Segment Override
with Effective Address          Prefixes Possible
Using Base Register of:

[EAX]                           CS, SS, ES, FS, GS
[EBX]                           CS, SS, ES, FS, GS
[ECX]                           CS, SS, ES, FS, GS
[EDX]                           CS, SS, ES, FS, GS
[ESI]                           CS, SS, ES, FS, GS
[EDI]                           CS, SS, ES, FS, GS
[EBP]                           CS, SS, ES, FS, GS
[ESP]                           CS, SS, ES, FS, GS


2.4 Addressing Modes


The 80376 provides a total of 8 addressing modes
for instructions to specify operands. The addressing
modes are optimized to allow the efficient execution
of high level languages such as C and FORTRAN,
and they cover the vast majority of data references
needed by high-level languages.


Two of the addressing modes provide for instructions
that operate on register or immediate operands:


Register Operand Mode: The operand is located in
one of the 8-, 16- or 32-bit general registers.


Immediate Operand Mode: The operand is included
in the instruction as part of the opcode.


The remaining 6 modes provide a mechanism for
specifying the effective address of an operand. The
linear address consists of two components: the segment
base address and an effective address. The
effective address is calculated by summing any
combination of the following three address elements
(see Figure 2.3):


DISPLACEMENT: an 8-, 16- or 32-bit immediate value
following the instruction.


BASE: The contents of any general purpose regis- 
ter. The base registers are generally used by compil- 
ers to point to the start of the local variable area. 
Note that if the Address Length Prefix is used, only 
BX and BP can be used as a BASE register. 


INDEX: The contents of any general purpose regis- 
ter except for ESP. The index registers are used to 
access the elements of an array, or a string of char- 
acters. The index register’s value can be multiplied 
by a scale factor, either 1, 2, 4 or 8. The scaled index 
is especially useful for accessing arrays or struc- 
tures. Note that if the Address Length Prefix is 
used, no Scaling is available and only the registers 
SI and DI can be used to INDEX. 
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Combinations of these 3 components make up the 6 
additional addressing modes. There is no perform- 
ance penalty for using any of these addressing com- 
binations, since the effective address calculation is 
pipelined with the execution of other instructions. 
The one exception is the simultaneous use of BASE 
and INDEX components which requires one addi- 
tional clock. 


As shown in Figure 2.4, the effective address (EA) of
an operand is calculated according to the following
formula:


EA = BASE Register + (INDEX Register × SCALING) + DISPLACEMENT


1. Direct Mode: The operand’s offset is contained
   as part of the instruction as an 8-, 16- or 32-bit
   DISPLACEMENT.


2. Register Indirect Mode: A BASE register contains
   the address of the operand.


3. Based Mode: A BASE register’s contents are added
   to a DISPLACEMENT to form the operand’s
   offset.


4. Scaled Index Mode: An INDEX register’s contents
   are multiplied by a SCALING factor, and the result is
   added to a DISPLACEMENT to form the operand’s
   offset.


5. Based Scaled Index Mode: The contents of an
   INDEX register are multiplied by a SCALING factor
   and the result is added to the contents of a BASE
   register to obtain the operand’s offset.


6. Based Scaled Index Mode with Displacement:
   The contents of an INDEX register are multiplied
   by a SCALING factor, and the result is added to
   the contents of a BASE register and a DISPLACEMENT
   to form the operand’s offset.


[Figure 2.4 labels: SEGMENT REGISTER (SELECTOR); BASE REGISTER; INDEX REGISTER; SCALE 1, 2, 4, OR 8; DISPLACEMENT (IN INSTRUCTION); EFFECTIVE ADDRESS; DESCRIPTOR REGISTERS; SEGMENT BASE ADDRESS; SEGMENT LIMIT; LINEAR ADDRESS; TARGET ADDRESS; SELECTED SEGMENT.]


Figure 2.4. Addressing Mode Calculations
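
The six memory addressing modes are all special cases of one sum of BASE, scaled INDEX and DISPLACEMENT. The sketch below computes that sum for 32-bit addressing. It assumes the result wraps at 32 bits, and the function name effective_address and its keyword arguments are hypothetical. Leaving an argument at its default of zero (or a scale of 1) selects the simpler modes.

```python
def effective_address(base=0, index=0, scale=1, displacement=0):
    """Return BASE + (INDEX * SCALE) + DISPLACEMENT as a 32-bit offset.

    Assumption: the sum wraps modulo 2**32.
    """
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4 or 8")
    return (base + index * scale + displacement) & 0xFFFFFFFF
```

For example, Based Scaled Index Mode with Displacement using a base of 1000H, an index of 3, a scale of 4 and a displacement of 10H gives an offset of 101CH, while passing only displacement corresponds to Direct Mode.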
GENERATING 16-BIT ADDRESSES


The 80376 executes code with a default length for
operands and addresses of 32 bits. The 80376 is
also able to use operands and addresses of 16
bits. This is specified through the use of override
prefixes. Two prefixes, the Operand Length Prefix
and the Address Length Prefix, override the default
32-bit length on an individual instruction basis.
These prefixes are automatically added by assemblers.
The Operand Length and Address Length Prefixes
can be applied separately or in combination to
any instruction.


The 80376 normally executes 32-bit code and uses
either 8- or 32-bit displacements, and any register
can be used as a base or index register. When executing
16-bit code (by prefix overrides), the displacements
are either 8 or 16 bits, and the base and index
registers conform to the 16-bit model. Table 2.4
illustrates the differences.


_ Table 2.4. BASE and INDEX Registers for 16- and 32-Bit Addresses 


| ss} 16-Bit Addressing | _32-Bit Addressing 
_| BASE REGISTER | BX, BP Any 32-Bit GP Register 


| INDEX REGISTER | SI, DI | Any 32-Bit GP Register 
| bo, | | except ESP 
| SCALEFACTOR | None  ——|_ 1,2, 4,8 | 
DISPLACMENT _| 0,8, 16 Bits 


0, 8, 32 Bits 
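
The effective address calculation and the limits in Table 2.4 can be sketched in a few lines of Python. This is only an illustrative model, not Intel code: the function name, the keyword arguments and the choice to wrap the result to the address size are assumptions made for the example.

```python
# Model of EA = BASE + (INDEX * SCALE) + DISPLACEMENT.
# All names here are illustrative; they are not Intel-defined identifiers.

def effective_address(base=0, index=0, scale=1, displacement=0, address_bits=32):
    """Return the effective address (offset within the segment)."""
    if address_bits == 32:
        if scale not in (1, 2, 4, 8):
            raise ValueError("32-bit addressing allows scale factors 1, 2, 4 or 8")
    elif address_bits == 16:
        if scale != 1:
            raise ValueError("16-bit addressing has no scale factor")
    else:
        raise ValueError("address_bits must be 16 or 32")
    # Assumption: the offset wraps to the selected address size.
    mask = (1 << address_bits) - 1
    return (base + index * scale + displacement) & mask


# Based Scaled Index Mode with Displacement: element 5 of an array of
# 4-byte entries that starts at 1000H, plus a field offset of 8.
print(hex(effective_address(base=0x1000, index=5, scale=4, displacement=8)))  # prints 0x101c
```

Leaving out arguments gives the simpler modes: only `displacement` is Direct Mode, only `base` is Register Indirect Mode, and so on.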


2.5 Data Types

The 80376 supports all of the data types commonly used in high level languages:

Bit:                    A single bit quantity.
Bit Field:              A group of up to 32 contiguous bits, which spans a maximum of four
                        bytes.
Bit String:             A set of contiguous bits; on the 80376 bit strings can be up to
                        16 Mbits long.
Byte:                   A signed 8-bit quantity.
Unsigned Byte:          An unsigned 8-bit quantity.
Integer (Word):         A signed 16-bit quantity.
Long Integer
(Double Word):          A signed 32-bit quantity. All operations assume a 2’s complement
                        representation.
Unsigned Integer
(Word):                 An unsigned 16-bit quantity.
Unsigned Long Integer
(Double Word):          An unsigned 32-bit quantity.
Signed Quad Word:       A signed 64-bit quantity.
Unsigned Quad Word:     An unsigned 64-bit quantity.
Pointer:                A 16- or 32-bit offset only quantity which indirectly references
                        another memory location.
Long Pointer:           A full pointer which consists of a 16-bit segment selector and
                        either a 16- or 32-bit offset.
Char:                   A byte representation of an ASCII alphanumeric or control character.
String:                 A contiguous sequence of bytes, words or dwords. A string may
                        contain between 1 byte and 16 Mbytes.
BCD:                    A byte (unpacked) representation of decimal digits 0-9.
Packed BCD:             A byte (packed) representation of two decimal digits 0-9, storing
                        one digit in each nibble.

When the 80376 is coupled with a numerics coprocessor such as the 80387SX, the following
common floating point types are supported:

Floating Point:         A signed 32-, 64- or 80-bit real number representation. Floating
                        point numbers are supported by the 80387SX numerics coprocessor.

Figure 2.5 illustrates the data types supported by the 80376 processor and the 80387SX coprocessor.
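
As a small illustration of the Packed BCD type described above, the following Python sketch packs two decimal digits into one byte, one digit per nibble. The function names are hypothetical, and placing the more significant digit in the upper nibble is the usual convention, assumed here.

```python
# Packed BCD: two decimal digits per byte, one per nibble.
# Function names are illustrative only.

def to_packed_bcd(value):
    """Encode an integer 0..99 as a single packed-BCD byte."""
    if not 0 <= value <= 99:
        raise ValueError("a packed-BCD byte holds exactly two decimal digits")
    tens, ones = divmod(value, 10)
    return (tens << 4) | ones  # high nibble = tens digit, low nibble = ones digit


def from_packed_bcd(byte):
    """Decode a packed-BCD byte back to an integer 0..99."""
    high, low = (byte >> 4) & 0x0F, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError("nibble is not a decimal digit")
    return high * 10 + low


print(hex(to_packed_bcd(47)))  # prints 0x47
```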


[Figure 2.5 artwork not reproduced. It shows the bit and byte layouts
of the signed and unsigned byte, word, double word and quad word (sign
bit, MSB and magnitude fields), binary coded decimal (BCD) and packed
BCD digits, ASCII characters, byte strings, bit strings, floating
point numbers (supported by the 80387SX), the short pointer (32-bit
offset) and the long pointer (selector plus offset).]

Figure 2.5. 80376 Supported Data Types

2.6 I/O Space


The 80376 has two distinct physical address
spaces: physical memory and I/O. Generally,
peripherals are placed in I/O space, although the
80376 also supports memory-mapped peripherals.
The I/O space consists of 64 Kbytes which can be
divided into 64K 8-bit ports, 32K 16-bit ports, or any
combination of ports which add to no more than 64
Kbytes. The M/IO pin acts as an additional address
line, thus allowing the system designer to easily
determine which address space the processor is
accessing. Note that the I/O address refers to a
physical address.

The I/O ports are accessed by the IN and OUT
instructions, with the port address supplied as an
immediate 8-bit constant in the instruction or in the DX
register. All 8-bit and 16-bit port addresses are zero
extended on the upper address lines. The I/O
instructions cause the M/IO pin to be driven LOW. I/O
port addresses 00F8H through 00FFH are reserved
for use by Intel.
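
The port rules above can be checked mechanically when laying out a board's I/O map. The sketch below is a hypothetical helper, not part of any Intel tool: it tests whether an 8- or 16-bit port fits in the 64-Kbyte I/O space and stays clear of the reserved range 00F8H through 00FFH, and whether its address can be given as an 8-bit immediate.

```python
# I/O space rules from this section; helper names are illustrative.
IO_SPACE_BYTES = 64 * 1024
RESERVED_FIRST, RESERVED_LAST = 0x00F8, 0x00FF


def port_is_usable(port, width_bytes=1):
    """True if the port lies inside I/O space and avoids the Intel-reserved range."""
    if width_bytes not in (1, 2):
        raise ValueError("this sketch only models 8-bit and 16-bit ports")
    last = port + width_bytes - 1
    if port < 0 or last >= IO_SPACE_BYTES:
        return False
    return last < RESERVED_FIRST or port > RESERVED_LAST


def needs_dx(port):
    """Ports above 0FFH do not fit an 8-bit immediate, so DX must hold the address."""
    return port > 0xFF


print(port_is_usable(0x00F7, 2), needs_dx(0x3F8))  # prints False True
```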


2.7 Interrupts and Exceptions 


Interrupts and exceptions alter the normal program
flow in order to handle external events, report errors
or exceptional conditions. The difference between
interrupts and exceptions is that interrupts are used to
handle asynchronous external events while
exceptions handle instruction faults. Although a program
can generate a software interrupt via an INT n
instruction, the processor treats software interrupts as
exceptions.

Hardware interrupts occur as the result of an
external event and are classified into two types: maskable
or non-maskable. Interrupts are serviced after the
execution of the current instruction. After the
interrupt handler is finished servicing the interrupt,
execution proceeds with the instruction immediately
after the interrupted instruction.

Exceptions are classified as faults, traps, or aborts
depending on the way they are reported, and whether
or not restart of the instruction causing the exception
is supported. Faults are exceptions that are
detected and serviced before the execution of the
faulting instruction. Traps are exceptions that are
reported immediately after the execution of the
instruction which caused the problem. Aborts are
exceptions which do not permit the precise location of
the instruction causing the exception to be
determined. Thus, when an interrupt service routine has
been completed, execution proceeds from the
instruction immediately following the interrupted
instruction. On the other hand, the return address from
an exception/fault routine will always point at the
instruction causing the exception and include any
leading instruction prefixes. Table 2.5 summarizes
the possible interrupts for the 80376 and shows
where the return address points to.

The 80376 has the ability to handle up to 256
different interrupts/exceptions. In order to service the
interrupts, a table with up to 256 interrupt vectors
must be defined. The interrupt vectors are simply
pointers to the appropriate interrupt service routine.
The interrupt vectors are 8-byte quantities, which are
put in an Interrupt Descriptor Table. Of the 256
possible interrupts, 32 are reserved for use by Intel and
the remaining 224 are free to be used by the system
designer.
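
The vector arithmetic implied above is simple enough to sketch in Python. The names are illustrative; the one fact added from general architecture knowledge is that a descriptor table limit is the table size in bytes minus one.

```python
# Interrupt Descriptor Table arithmetic; names are illustrative only.
VECTOR_SIZE_BYTES = 8
MAX_VECTORS = 256
INTEL_RESERVED_VECTORS = 32


def idt_entry_offset(vector):
    """Byte offset of a vector's 8-byte entry from the start of the IDT."""
    if not 0 <= vector < MAX_VECTORS:
        raise ValueError("vector must be in the range 0..255")
    return vector * VECTOR_SIZE_BYTES


def idt_limit(num_vectors):
    """Limit value (size in bytes minus one) for a table holding num_vectors entries."""
    if not 1 <= num_vectors <= MAX_VECTORS:
        raise ValueError("an IDT holds between 1 and 256 vectors")
    return num_vectors * VECTOR_SIZE_BYTES - 1


def is_user_vector(vector):
    """True for the 224 vectors that are not reserved for use by Intel."""
    return INTEL_RESERVED_VECTORS <= vector < MAX_VECTORS


print(idt_entry_offset(2), idt_limit(256))  # prints 16 2047
```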


INTERRUPT PROCESSING 


When an interrupt occurs the following actions hap- 
pen. First, the current program address and the 
Flags are saved on the stack to allow resumption of 
the interrupted program. Next, an 8-bit vector is sup- 
plied to the 80376 which identifies the appropriate 
entry in the interrupt table. The table contains either 
an Interrupt Gate, a Trap Gate or a Task Gate that 
will point to an interrupt procedure or task. The user 
supplied interrupt service routine is executed. Final- 
ly, when an IRET instruction is executed the old 
processor state is restored and program execution 
resumes at the appropriate instruction. 


The 8-bit interrupt vector is supplied to the 80376 in
several different ways: exceptions supply the
interrupt vector internally; software INT instructions
contain or imply the vector; maskable hardware
interrupts supply the 8-bit vector via the interrupt
acknowledge bus sequence. Non-Maskable hardware
interrupts are assigned to interrupt vector 2.


Maskable Interrupt

Maskable interrupts are the most common way to
respond to asynchronous external hardware events.
A hardware interrupt occurs when the INTR pin is pulled
HIGH and the Interrupt Flag bit (IF) is enabled. The
processor only responds to interrupts between
instructions (string instructions have an “interrupt
window” between memory moves which allows
interrupts during long string moves). When an interrupt
occurs the processor reads an 8-bit vector supplied
by the hardware which identifies the source of the
interrupt (one of 224 user defined interrupts).

Table 2.5. Interrupt Vector Assignments

  Function                       Interrupt Number
  Debug Exception*               1
  NMI Interrupt                  2
  One-Byte Interrupt             3
  Interrupt on Overflow          4
  Array Bounds Check             5
  Invalid OP-Code                6
  Device Not Available           7
  Double Fault                   8
  Coprocessor Segment Overrun    9
  Invalid TSS                    10
  Segment Not Present            11
  Stack Fault                    12

[The remaining columns of Table 2.5, "Instruction Which Can Cause
Exception" and "Return Address Points to Faulting Instruction", are
not legible in this copy.]

*Some debug exceptions may report both traps on the previous instruction, and faults on the next instruction.


Interrupts through Interrupt Gates automatically re- 
set IF, disabling INTR requests. Interrupts through 
Trap Gates leave the state of the IF bit unchanged. 
Interrupts through a Task Gate change the IF bit ac- 
cording to the image of the EFLAGs register in the 
task’s Task State Segment (TSS). When an IRET 
instruction is executed, the original state of the IF bit 
is restored. 
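
The effect of each gate type on the Interrupt Flag can be summarized in a few lines of Python. This is a behavioral sketch with hypothetical names, not a description of the processor's microcode.

```python
# IF handling on interrupt entry, by gate type; names are illustrative.

def if_after_entry(gate_type, if_before, tss_if=None):
    """Return the IF bit seen by the handler once the gate has been taken."""
    if gate_type == "interrupt":
        return False          # Interrupt Gates reset IF, disabling INTR requests
    if gate_type == "trap":
        return if_before      # Trap Gates leave IF unchanged
    if gate_type == "task":
        if tss_if is None:
            raise ValueError("a Task Gate takes IF from the EFLAGS image in the TSS")
        return tss_if
    raise ValueError("unknown gate type: " + repr(gate_type))


def if_after_iret(if_saved_on_entry):
    """IRET restores the original state of the IF bit."""
    return if_saved_on_entry


print(if_after_entry("interrupt", True), if_after_entry("trap", True))  # prints False True
```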


Non-Maskable Interrupt 


Non-maskable interrupts provide a method of servic- 
ing very high priority interrupts. When the NMI input 
is pulled HIGH it causes an interrupt with an internal- 
ly supplied vector value of 2. Unlike a normal hard- 
ware interrupt no interrupt acknowledgement se- 
quence is performed for an NMI. 


While executing the NMI servicing procedure, the
80376 will not service any further NMI request, or
INT requests, until an interrupt return (IRET)
instruction is executed or the processor is reset. If NMI
occurs while currently servicing an NMI, its presence
will be saved for servicing after executing the first
IRET instruction. The disabling of INTR requests
depends on the gate in IDT location 2.


Software Interrupts 


A third type of interrupt/exception for the 80376 is
the software interrupt. An INT n instruction causes
the processor to execute the interrupt service
routine pointed to by the nth vector in the interrupt table.

A special case of the two byte software interrupt
INT n is the one byte INT 3, or breakpoint interrupt.
By inserting this one byte instruction in a program,
the user can set breakpoints in the program as a
debugging tool.

A final type of software interrupt is the single step
interrupt. It is discussed under Single-Step Trap.

INTERRUPT AND EXCEPTION PRIORITIES 


Interrupts are externally-generated events. Maska- 
ble Interrupts (on the INTR input) and Non-Maskable 
Interrupts (on the NMI input) are recognized at in- 
struction boundaries. When NMI and maskable 
INTR are both recognized at the same instruction 
boundary, the 80376 invokes the NMI service rou- 
tine first. If, after the NMI service routine has been 
invoked, maskable interrupts are still enabled, then 
the 80376 will invoke the appropriate interrupt serv- 
ice routine. 


As the 80376 executes instructions, it follows a
consistent cycle in checking for exceptions, as shown in
Table 2.6. This cycle is repeated as each instruction
is executed, and occurs in parallel with instruction
decoding and execution.

INSTRUCTION RESTART

The 80376 fully supports restarting all instructions
after faults. If an exception is detected in the
instruction to be executed (exception categories 4 through
9 in Table 2.6), the 80376 device invokes the
appropriate exception service routine. The 80376 is in a
state that permits restart of the instruction.

DOUBLE FAULT

A Double fault (exception 8) results when the
processor attempts to invoke an exception service
routine for the segment exceptions (10, 11, 12 or 13),
but in the process of doing so, detects an exception.


2.8 Reset and Initialization 


When the processor is Reset the registers have the
values shown in Table 2.7. The 80376 will then start
executing instructions near the top of physical
memory, at location 0FFFFF0H. A short JMP should be
executed within the segment defined for power-up
(see Table 2.7). The GDT should then be initialized
for a start-up data and code segment followed by a
far JMP that will load the segment descriptor cache
with the new descriptor values. The IDT table, after
reset, is located at physical address 0H, with a limit
of 256 entries.

RESET forces the 80376 to terminate all execution
and local bus activity. No instruction execution or
bus activity will occur as long as Reset is active.
Between 350 and 450 CLK2 periods after Reset
becomes inactive, the 80376 will start executing
instructions at the top of physical memory.


Table 2.6. Sequence of Exception Checking

Consider the case of the 80376 having just completed an instruction. It then performs the following checks
before reaching the point where the next instruction is completed:

1. Check for Exception 1 Traps from the instruction just completed (single-step via Trap Flag, or Data
   Breakpoints set in the Debug Registers).

2. Check for external NMI and INTR.

3. Check for Exception 1 Faults in the next instruction (Instruction Execution Breakpoint set in the
   Debug Registers for the next instruction).

4. Check for Segmentation Faults that prevented fetching the entire next instruction (exceptions 11 or
   13).

5. Check for Faults decoding the next instruction (exception 6 if illegal opcode; or exception 13 if
   instruction is longer than 15 bytes, or privilege violation (i.e. not at IOPL or at CPL = 0)).

6. If WAIT opcode, check if TS = 1 and MP = 1 (exception 7 if both are 1).

7. If ESCape opcode for numeric coprocessor, check if EM = 1 or TS = 1 (exception 7 if either is 1).

8. If WAIT opcode or ESCape opcode for numeric coprocessor, check ERROR input signal (exception
   16 if ERROR input is asserted).

9. Check for Segmentation Faults that prevent transferring the entire memory quantity (exceptions 11,
   12, 13).

Table 2.7. Register Values after Reset

[The register rows of Table 2.7 are not legible in this copy. Bits
marked Undefined are covered by Note 7.]


NOTES:
1. EFLAG Register. The upper 14 bits of the EFLAGS register are undefined; all defined
   flag bits are zero.
2. CR0: The defined 4 bits in CR0 are equal to 1H.
3. The Code Segment Register (CS) will have its Base Address set to 0FFFF0000H and
   Limit set to 0FFFFH.
4. The Data and Extra Segment Registers (DS and ES) will have their Base Address set
   to 00000000H and Limit set to 0FFFFH.
5. If self-test is selected, the EAX should contain a 0 value. If a value of 0 is not found
   the self-test has detected a flaw in the part.
6. EDX register always holds component and stepping identifier.
7. All unidentified bits are Intel Reserved and should not be used.
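
The reset values in Note 3 can be tied to the reset entry point at 0FFFFF0H with a little arithmetic. The Python sketch below is illustrative only; the instruction pointer value it derives (0FFF0H) is inferred from the two published numbers and is not quoted from Table 2.7.

```python
# Relating the CS base after reset to the first fetch address.
CS_BASE_AFTER_RESET = 0xFFFF0000   # Note 3
CS_LIMIT_AFTER_RESET = 0xFFFF      # Note 3
RESET_ENTRY = 0xFFFFF0             # first instruction fetch, from section 2.8
PHYSICAL_ADDRESS_BITS = 24         # 16-Mbyte physical address space


def first_fetch_address(segment_base, offset):
    """Linear address (32 bits) truncated to the 24 physical address bits."""
    linear = (segment_base + offset) & 0xFFFFFFFF
    return linear & ((1 << PHYSICAL_ADDRESS_BITS) - 1)


# Inferred offset: the low 16 bits that make CS:offset land on RESET_ENTRY.
implied_offset = (RESET_ENTRY - CS_BASE_AFTER_RESET) & 0xFFFF

print(hex(implied_offset))                                            # prints 0xfff0
print(hex(first_fetch_address(CS_BASE_AFTER_RESET, implied_offset)))  # prints 0xfffff0
```

The entry point sits 16 bytes below the top of the 16-Mbyte space, which is why the first instruction there is normally a short JMP.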


2.9 Initialization 


Because the 80376 processor starts executing in protected mode, certain precautions need to be taken during
initialization. Before any far jumps can take place the GDT and/or LDT tables need to be set up and their
respective registers loaded. Before interrupts can be initialized the IDT table must be set up and the IDTR must
be loaded. The example code is shown below:

;********************************************************************
; This is an example of startup code to put either an 80376,
; 80386SX or 80386 into flat mode. All of memory is treated as
; simple linear RAM. There are no interrupt routines. The
; Builder creates the GDT-alias and IDT-alias and places them,
; by default, in GDT[1] and GDT[2]. Other entries in the GDT
; are specified in the Build file. After initialization it jumps
; to a C startup routine. To use this template, change this jmp
; address to that of your code, or make the label of your code
; "c_startup".
;
; This code was assembled and built using version 1.2 of the
; Intel RLL utilities and Intel 386ASM assembler.
;
; *** This code was tested ***
;********************************************************************

NAME FLAT                           ; name of the object module
EXTRN c_startup:near                ; this is the label jmped to after init
pe_flag     equ 1
data_selc   equ 20h                 ; assume code is GDT[3], data GDT[4]

INIT_CODE SEGMENT ER PUBLIC USE32   ; Segment base at 0ffffff80h

PUBLIC GDT_DESC
gdt_desc    dq ?

PUBLIC START
start:
    cld                             ; clear direction flag
    smsw bx                         ; check for processor (80376) at reset
    test bl,1                       ; use SMSW rather than MOV for speed
    jnz  pestart

realstart:                          ; is an 80386 and in real mode
    db   66h                        ; force the next operand into 32-bit mode.
    mov  eax,offset gdt_desc        ; move address of the GDT descriptor into eax
    xor  ebx,ebx                    ; clear ebx
    mov  bh,ah                      ; load 8 bits of address into bh
    mov  bl,al                      ; load 8 bits of address into bl
    db   67h
    db   66h                        ; use the 32-bit form of LGDT to load
    lgdt cs:[ebx]                   ; the 32-bits of address into the GDTR
    smsw ax                         ; go into protected mode (set PE bit)
    or   al,pe_flag
    lmsw ax
    jmp  next                       ; flush prefetch queue

pestart:
    mov  ebx,offset gdt_desc
    xor  eax,eax
    mov  ax,bx                      ; lower portion of address only
    lgdt cs:[eax]
    xor  ebx,ebx                    ; initialize data selectors
    mov  bl,data_selc               ; GDT[4] (selector 20h)
    mov  ds,bx
    mov  ss,bx
    mov  es,bx
    mov  fs,bx
    mov  gs,bx
    jmp  pejump

next:
    xor  ebx,ebx                    ; initialize data selectors
    mov  bl,data_selc               ; GDT[4] (selector 20h)
    mov  ds,bx
    mov  ss,bx
    mov  es,bx
    mov  fs,bx
    mov  gs,bx
    db   66h                        ; for the 80386, need to make a 32-bit jump
pejump:
    jmp  far ptr c_startup          ; but the 80376 is already 32-bit.

    org  70h                        ; only if segment base is at 0ffffff80h
    jmp  short start
INIT_CODE ENDS
END


This code should be linked into your application for boot loadable code. The following build file illustrates how
this is accomplished.

FLAT;                                   -- build program id

SEGMENT
    *segments (dpl=0),                  -- Give all user segments a DPL of 0.
    _phantom_code_ (dpl=0),             -- These two segments are created by
    _phantom_data_ (dpl=0),             -- the builder when the FLAT control is used.
    init_code (base=0ffffff80h);        -- Put startup code at the reset vector area.
GATE
    g13 (entry=13, dpl=0, trap),        -- trap gate leaves interrupts enabled
    i32 (entry=32, dpl=0, interrupt),   -- interrupt gate disables interrupts
TABLE
                                        -- create GDT
    GDT (LOCATION = GDT_DESC,           -- In a buffer starting at GDT_DESC,
                                        -- BLD386 places the GDT base and
                                        -- GDT limit values. Buffer must be
                                        -- 6 bytes long. The base and limit
                                        -- values are placed in this buffer
                                        -- as two bytes of limit plus
                                        -- four bytes of base in the format
                                        -- required for use by the LGDT
                                        -- instruction.
         ENTRY = (3:_phantom_code_,     -- Explicitly place segment
                  4:_phantom_data_,     -- entries into the GDT.
                  5:code32,
                  6:data,
                  7:init_code));
TASK
    MAIN_TASK
       (
        DPL = 0,                        -- Task privilege level is 0.
        DATA = DATA,                    -- Points to a segment that
                                        -- indicates initial DS value.
        CODE = main,                    -- Entry point is main, which
                                        -- must be a public id.
        STACKS = (DATA),                -- Segment id points to stack
                                        -- segment. Sets the initial SS:ESP.
        NO INTENABLED,                  -- Disable interrupts.
        PRESENT                         -- Present bit in TSS set to 1.
       );
MEMORY
    (RANGE = (EPROM = ROM(0ffff8000h..0ffffffffh),
              DRAM = RAM(0..0ffffh)),
     ALLOCATE = (EPROM = (MAIN_TASK)));
END

asm386 flatsim.a38 debug
asm386 application.a38 debug
bnd386 application.obj,flatsim.obj nolo debug oj (application.bnd)
bld386 application.bnd bf (flatsim.bld) bl flat

Commands to assemble and build a boot-loadable application named “application.a38”. The initialization code
is called “flatsim.a38”, and build file is called “application.bld”.

2.10 Self-Test


The 80376 has the capability to perform a self-test.
The self-test checks the function of all of the Control
ROM and most of the non-random logic of the part.
Approximately one-half of the 80376 can be tested
during self-test.

Self-Test is initiated on the 80376 when the RESET
pin transitions from HIGH to LOW, and the BUSY pin
is LOW. The self-test takes about 2^19 clocks, or
approximately 33 ms with a 16 MHz 80376 processor.
At the completion of self-test the processor
performs reset and begins normal operation. The part
has successfully passed self-test if the contents of
the EAX register are zero. If the EAX register is not
zero then the self-test has detected a flaw in the
part. If self-test is not selected after reset, EAX may
be non-zero after reset.
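
The quoted self-test time can be cross-checked with a one-line calculation. The Python sketch below reads the clock count as 2^19, which is the value that the stated 33 ms at 16 MHz implies; the function name and defaults are illustrative.

```python
# Self-test duration estimate; names and defaults are illustrative.

def self_test_ms(clocks=2 ** 19, clock_hz=16_000_000):
    """Approximate self-test duration in milliseconds."""
    return clocks / clock_hz * 1000.0


print(round(self_test_ms()))  # prints 33
```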


[Figure residue: debug exception sources named in the artwork are
breakpoints 0 through 3 (debug fault/trap), register access fault,
single-step debug trap and task switch debug trap.]

2.11 Debugging Support


The 80376 provides several features which simplify
the debugging process. The three categories of
on-chip debugging aids are:

1. The code execution breakpoint opcode (0CCH),

2. The single-step capability provided by the TF bit
   in the flag register, and

3. The code and data breakpoint capability provided
   by the Debug Registers DR0-3, DR6, and DR7.

BREAKPOINT INSTRUCTION

A single-byte software interrupt (INT 3) breakpoint
instruction is available for use by software debuggers.
The breakpoint opcode is 0CCH, and generates an
exception 3 trap when executed.


[Figure 2.6 artwork not reproduced. It shows the debug status register
(DR6) and the breakpoint control register (DR7), bits 31 through 0.
Legend: Gi = global breakpoint enable i; Li = local breakpoint
enable i; LENi = breakpoint length i; RWi = memory access qualifier i;
also marked are local exact breakpoint match, global exact breakpoint
match and global debug register access detect. Shaded bits are Intel
reserved and must not be used.]

Figure 2.6. Debug Registers

SINGLE-STEP TRAP


If the single-step flag (TF, bit 8) in the EFLAG regis- 
ter is found to be set at the end of an instruction, a 
single-step exception occurs. The single-step ex- 
ception is auto vectored to exception number 1. 
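
Since TF is bit 8 of the EFLAG register, a debugger's handling of a saved flags image can be sketched as below. The helper names are hypothetical; the code only manipulates an integer copy of the register, such as one saved on the stack at interrupt entry.

```python
# Manipulating the single-step flag in a saved EFLAGS image.
TF_BIT = 8
TF_MASK = 1 << TF_BIT  # 100H


def set_single_step(eflags):
    """Return the flags image with TF set, so the next instruction traps to exception 1."""
    return (eflags | TF_MASK) & 0xFFFFFFFF


def clear_single_step(eflags):
    """Return the flags image with TF cleared."""
    return eflags & ~TF_MASK & 0xFFFFFFFF


def single_step_pending(eflags):
    """True if a single-step exception will be raised at the end of the instruction."""
    return bool(eflags & TF_MASK)


print(hex(set_single_step(0x00000002)))  # prints 0x102
```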


The Debug Registers are an advanced debugging 
feature of the 80376. They allow data access break- 
points as well as code execution breakpoints. Since 
the breakpoints are indicated by on-chip registers, 
an instruction execution breakpoint can be placed in 
ROM code or in code shared by several tasks, nei- 
ther of which can be supported by the INT 3 break- 
point opcode. 


The 80376 contains six Debug Registers, consisting 
of four breakpoint address registers and two break- 
point control registers. Initially after reset, break- 
points are in the disabled state; therefore, no break- 
points will occur unless the debug registers are 
programmed. Breakpoints set up in the Debug 
Registers are auto-vectored to exception 1. 
Figure 2.6 shows the breakpoint status and control
registers.


3.0 ARCHITECTURE


The Intel 80376 Embedded Processor has a physical
address space of 16 Mbytes (2^24 bytes) and allows
the running of virtual memory programs of almost
unlimited size (16 Kbytes × 16 Mbytes, or
256 Gbytes (2^38 bytes)). In addition the 80376
provides sophisticated memory management and a
hardware-assisted protection mechanism.


3.1 Addressing Mechanism 


The 80376 uses two components to form the logical 
address, a 16-bit selector which determines the lin- 
ear base address of a segment, and a 32-bit effec- 
tive address. The selector is used to specify an 
index into an operating system defined table (see 
Figure 3.1). The table contains the 32-bit base ad- 
dress of a given segment. The linear address is 
formed by adding the base address obtained from 
the table to the 32-bit effective address. This value 
is truncated to 24 bits to form the physical address, 
which is then placed on the address bus. 
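
The two steps described above, adding the segment base to the effective address and then truncating to 24 bits, can be modeled as follows. This is a sketch with hypothetical names and example base addresses; it ignores limit and protection checks.

```python
# Logical address -> linear address -> physical address.
LINEAR_BITS = 32
PHYSICAL_BITS = 24


def linear_address(segment_base, effective_address):
    """Segment base from the descriptor table plus the 32-bit effective address."""
    return (segment_base + effective_address) & ((1 << LINEAR_BITS) - 1)


def physical_address(linear):
    """Truncate the linear address to the 24 bits placed on the address bus."""
    return linear & ((1 << PHYSICAL_BITS) - 1)


# Example bases are made up for illustration.
print(hex(physical_address(linear_address(0x00100000, 0x2345))))  # prints 0x102345
print(hex(physical_address(linear_address(0xFFFF8000, 0x0010))))  # prints 0xff8010
```

The second call shows why a segment based high in the 4-Gbyte linear space still reaches the top of the 16-Mbyte physical space.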


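[Figure 3.1 artwork not reproduced. It shows a 48/32-bit pointer split
into a selector and an effective address. The selector picks a segment
descriptor (access rights, limit and base address); the base address
is added to the effective address to locate the operand inside the
selected segment, which is bounded by the segment limit, within the
16-Mbyte physical memory.]

Figure 3.1. Address Calculation

3.2 Segmentation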


Segmentation is one method of memory manage- 
ment and provides the basis for protection in the 
80376. Segments are used to encapsulate regions 
of memory which have common attributes. For ex- 
ample, all of the code of a given program could be 
contained in a segment, or an operating system table
may reside in a segment. All information about
each segment is stored in an 8-byte data structure
called a descriptor. All of the descriptors in a system
are contained in tables recognized by hardware.


TERMINOLOGY 


The following terms are used throughout the discus- 
sion of descriptors, privilege levels and protection: 


PL:   Privilege Level. One of the four hierarchical
      privilege levels. Level 0 is the most privileged
      level and level 3 is the least privileged.

RPL:  Requestor Privilege Level. The privilege
      level of the original supplier of the selector.
      RPL is determined by the two least significant
      bits of a selector.

DPL:  Descriptor Privilege Level. This is the least
      privileged level at which a task may access
      that descriptor (and the segment associated
      with that descriptor). Descriptor Privilege
      Level is determined by bits 6:5 in the Access
      Right Byte of a descriptor.

CPL:  Current Privilege Level. The privilege level
      at which a task is currently executing, which
      equals the privilege level of the code segment
      being executed. CPL can also be determined
      by examining the lowest 2 bits of the
      CS register, except for conforming code
      segments.

EPL:  Effective Privilege Level. The effective
      privilege level is the least privileged of the
      RPL and the DPL. EPL is the numerical
      maximum of RPL and DPL.

Task: One instance of the execution of a program.
      Tasks are also referred to as processes.


DESCRIPTOR TABLES 


The descriptor tables define all of the segments
which are used in an 80376 system. There are three
types of tables on the 80376 which hold descriptors:
the Global Descriptor Table, Local Descriptor Table,
and the Interrupt Descriptor Table. All of the tables
are variable length memory arrays; they can range in
size between 8 bytes and 64 Kbytes. Each table can
hold up to 8192 8-byte descriptors. The upper 13
bits of a selector are used as an index into the
descriptor table. The tables have registers associated
with them which hold the 32-bit linear base address,
and the 16-bit limit of each table.

Each of the tables has a register associated with it:
GDTR, LDTR and IDTR; see Figure 3.2. The LGDT,
LLDT and LIDT instructions load the base and limit
of the Global, Local and Interrupt Descriptor Tables
into the appropriate register. The SGDT, SLDT and
SIDT store these base and limit values. These are
privileged instructions.
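
The selector and privilege fields defined above can be decoded with simple shifts and masks. The Python sketch below uses hypothetical helper names. The table indicator in bit 2 (0 = GDT, 1 = LDT) is standard for this architecture family but is not stated in the text above, so treat it as an assumption.

```python
# Decoding a 16-bit selector and the privilege fields; names are illustrative.
DESCRIPTOR_SIZE_BYTES = 8


def decode_selector(selector):
    """Split a selector into table index, table indicator and RPL."""
    return {
        "index": (selector >> 3) & 0x1FFF,           # upper 13 bits
        "table": "LDT" if selector & 0x4 else "GDT",  # bit 2 (assumed, see lead-in)
        "rpl": selector & 0x3,                        # two least significant bits
    }


def descriptor_offset(selector):
    """Byte offset of the selected descriptor within its table."""
    return ((selector >> 3) & 0x1FFF) * DESCRIPTOR_SIZE_BYTES


def dpl_from_access_rights(access_rights_byte):
    """DPL is held in bits 6:5 of the Access Right Byte."""
    return (access_rights_byte >> 5) & 0x3


def effective_privilege_level(rpl, dpl):
    """EPL is the numerical maximum (least privileged) of RPL and DPL."""
    return max(rpl, dpl)


print(decode_selector(0x20))  # prints {'index': 4, 'table': 'GDT', 'rpl': 0}
```

Selector 20H therefore names entry 4 of the GDT at byte offset 32, and 8192 entries of 8 bytes fill the 64-Kbyte maximum table size.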


[Figure 3.2 artwork not reproduced. It shows the descriptor table
registers: the LDTR with its 16-bit selector (bits 15-0) and a
program-invisible LDT limit and LDT base linear address that are
automatically loaded from the LDT descriptor; the IDTR with a 16-bit
IDT limit and a 32-bit IDT base linear address; and the GDTR with a
16-bit GDT limit and a 32-bit GDT base linear address.]

Figure 3.2. Descriptor Table Registers


Global Descriptor Table 


The Global Descriptor Table (GDT) contains de- 
scriptors which are possibly available to all of the 
tasks in a system. The GDT can contain any type of 
segment descriptor except for interrupt and trap de- 
scriptors. Every 80376 system contains a GDT. A 
simple 80376 system contains only 2 entries in the 
GDT; a code and a data descriptor. For maximum 
performance, descriptor tables should begin on 
even addresses. 


The first slot of the Global Descriptor Table corre- 
sponds to the null selector and is not used. The null 
selector defines a null pointer value. 


Local Descriptor Table 


LDTs contain descriptors which are associated with 
a given task. Generally, operating systems are de- 
signed so that each task has a separate LDT. The 
LDT may contain only code, data, stack, task gate, 
and call gate descriptors. LDTs provide a mecha- 
nism for isolating a given task’s code and data seg- 
ments from the rest of the operating system, while 
the GDT contains descriptors for segments which 
are common to all tasks. A segment cannot be
accessed by a task if its segment descriptor does not
exist in either the current LDT or the GDT. This
provides both isolation and protection for a task’s
segments, while still allowing global data to be shared
among tasks.


Unlike the 6-byte GDT or IDT registers which contain 
a base address and limit, the visible portion of the 
LDT register contains only a 16-bit selector. This se- 
lector refers to a Local Descriptor Table descriptor in 
the GDT (see Figure 2.1). 


INTERRUPT DESCRIPTOR TABLE 


The third table needed for 80376 systems is the In- 
terrupt Descriptor Table. The IDT contains the de- 
scriptors which point to the location of up to 256 
interrupt service routines. The IDT may contain only 
task gates, interrupt gates and trap gates. The IDT 
should be at least 256 bytes in size in order to hold 
the descriptors for the 32 Intel Reserved Interrupts. 
Every interrupt used by a system must have an entry 
in the IDT. The IDT entries are referenced by INT 
instructions, external interrupt vectors, and excep- 
tions. 


DESCRIPTORS 


The object to which the segment selector points
is called a descriptor. Descriptors are eight-byte
quantities which contain attributes about a given
region of linear address space. These attributes
include the 32-bit logical base address of the
segment, the 20-bit length and granularity of the
segment, the protection level, read, write or execute
privileges, and the type of segment. All of the
attribute information about a segment is contained in 12
bits in the segment descriptor. Figure 3.3 shows the
general format of a descriptor. All segments on
the 80376 have three attribute fields in common: the
Present bit (P), the Descriptor Privilege Level bits
(DPL) and the Segment bit (S). P = 1 if the segment
is loaded in physical memory; if P = 0 then any
attempt to access the segment causes a not present
exception (exception 11). The DPL is a two-bit field
which specifies the protection level, 0-3, associated
with a segment.

The 80376 has two main categories of segments:
system segments, and non-system segments (for
code and data). The segment bit, S, determines if a
given segment is a system segment, a code segment
or a data segment. If the S bit is 1 then the
segment is either a code or data segment; if it is 0
then the segment is a system segment.

Note that although the 80376 is limited to a
16-Mbyte physical address space (2^24), its base
address allows a segment to be placed anywhere in a
4-Gbyte linear address space. When writing code for
the 80376, users should keep code portability to an
80386 processor (or other processors with a larger
physical address space) in mind. A segment base
address can be placed anywhere in this 4-Gbyte
linear address space, but a physical address will be

[Figure 3.3 artwork not reproduced. Field legend:]

  BASE   Base Address of the segment
  LIMIT  The length of the segment
  P      Present Bit: 1 = Present, 0 = Not Present
  DPL    Descriptor Privilege Level 0-3
  S      Segment Descriptor: 0 = System Descriptor, 1 = Code or Data Descriptor
  TYPE   Type of Segment
  A      Accessed Bit
  G      Granularity Bit: 1 = Segment length is 4 Kbyte granular
                          0 = Segment length is byte granular
  0      Bit must be zero (0) for compatibility with future processors
  AVL    Available field for user or OS

Figure 3.3. Segment Descriptors

[Figure 3.4 artwork not reproduced. It shows SEGMENT BASE 15...0,
SEGMENT LIMIT 15...0 and the ACCESS RIGHTS BYTE. Field legend:]

  G      Granularity Bit: 1 = Segment length is 4 Kbyte granular
                          0 = Segment length is byte granular
  0      Bit must be zero (0) for compatibility with future processors
  AVL    Available field for user or OS

Figure 3.4. Code and Data Descriptors

Table 3.1. Access Rights Byte Definition for Code and Data Descriptors


Bit | ; 


Present (P) 


P = 1 Segment is mapped into physical memory. 


P =0 Nomapping to physical memory exits 


Descriptor Privilege 
Level (DPL) 
Segment 
Descriptor (S) 


Executable (E) 
|Expansion _ 
Direction (ED) 

Writable (W) 


mmMmmM|n®w 


D 
D 


Executable (E) 
Conforming (C) 


Om ss 


Readable (R) 


Accessed (A) 


Segment privilege attribute used in privilege tests. 


Code or Data (includes stacks) segment descriptor 
= 0 System Segment Descriptor or Gate Descriptor 


0 Descriptor type is data segment: ) lf | 
= 0 Expand up segment, offsets must be < limit. Data 
= 1 Expand down segment, offsets must be > limit. 
= 0 Data segment may not be written into. ~ 
Data segment may be written into. 


Segment 
(S = 1, 
E = 0) 


Descriptor type is code segment: ) if 
Code segment may only be executed when | 
CPL > DPL and CPL remains unchanged. 
Code segment may not be read. 

Code segment may be read. 


Code 
Segment 
(S=1, . 
E = 1) 


Segment has not been accessed. | 
Segment selector has been loaded into segment register 


or used by selector test instructions. 


_ generated that is a truncated version of this linear 
address. Truncation will be to the maximum number 
of address bits. It is recommended to place EPROM 
at the highest physical address and DRAM at the 
lowest physical addresses. 


Code and Data Descriptors (S = 1)


Figure 3.4 shows the general format of a code and
data descriptor and Table 3.1 illustrates how the bits
in the Access Rights Byte are interpreted.


Code and data segments have several descriptor 
fields in common. The accessed bit, A, is set when- 
ever the processor accesses a descriptor. The gran- 
ularity bit, G, specifies if a segment length is 1-byte- 
granular or 4-Kbyte-granular. Base address bits 
31-24, which are normally found in 80386 descrip- 
tors, are not made externally available on the 80376. 
They do not affect the operation of the 80376. The 
A31—A24 field should be set to allow an 80386 to 
correctly execute with EPROM at the upper 4096 
Mbytes of physical memory. 
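

As a rough illustration of the descriptor layout, the following C sketch packs a base, limit and access rights byte into the two 32-bit halves of an 8-byte descriptor. It assumes the standard 80386 bit positions (limit 19..16 in bits 19..16 of the upper word, AVL in bit 20, G in bit 23, base 31..24 in bits 31..24). The type and function names are hypothetical, and bits 21 and 22 of the upper word are simply left at zero here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the fields described in the text. */
typedef struct {
    uint32_t base;    /* 32-bit logical base address                */
    uint32_t limit;   /* 20-bit limit, in bytes or 4-Kbyte units    */
    uint8_t  access;  /* access rights byte (P, DPL, S, TYPE, A)    */
    uint8_t  g;       /* granularity: 0 = byte, 1 = 4 Kbyte         */
    uint8_t  avl;     /* bit available to user or OS                */
} seg_desc;

/* Pack the fields into the low and high 32-bit words of a descriptor
   (assumed 80386 layout; see the lead-in above). */
static void desc_pack(const seg_desc *d, uint32_t *lo, uint32_t *hi)
{
    assert(d->limit <= 0xFFFFFu);               /* limit is only 20 bits */
    *lo = ((d->base & 0xFFFFu) << 16) | (d->limit & 0xFFFFu);
    *hi = (d->base & 0xFF000000u)               /* base 31..24  */
        | ((uint32_t)(d->g & 1u) << 23)         /* G            */
        | ((uint32_t)(d->avl & 1u) << 20)       /* AVL          */
        | (d->limit & 0xF0000u)                 /* limit 19..16 */
        | ((uint32_t)d->access << 8)            /* access byte  */
        | ((d->base >> 16) & 0xFFu);            /* base 23..16  */
}

/* Segment size in bytes implied by the limit and the G bit. */
static uint64_t desc_size_bytes(const seg_desc *d)
{
    uint64_t units = (uint64_t)d->limit + 1u;
    return d->g ? units << 12 : units;
}
```

With G = 1 and a limit of 0FFFFFH, desc_size_bytes reports the full 4-Gbyte linear space, even though the 80376 drives only 24 physical address bits.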


System Descriptor Formats (S = 0)


System segments describe information about operating
system tables, tasks, and gates. Figure 3.5 shows the
general format of system segment descriptors, and the
various types of system segments.


80376 system descriptors (which are the same as
80386 descriptor types 2, 5, 9, B, C, E and F) contain
a 32-bit logical base address and a 20-bit segment
limit.


Selector Fields 


A selector has three fields: Local or Global Descriptor
Table Indicator (TI), Descriptor Entry Index (Index),
and Requestor (the selector’s) Privilege Level (RPL) as
shown in Figure 3.6. The TI bit selects either the Global
Descriptor Table or the Local Descriptor Table. The
Index selects one of 8K descriptors in the appropriate
descriptor table. The RPL bits allow high speed testing
of the selector’s privilege attributes.
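

The three selector fields can be pulled apart with a few shifts and masks. The sketch below assumes the usual 80386-family positions (RPL in bits 1..0, TI in bit 2 with 0 = GDT and 1 = LDT, Index in bits 15..3); the function names are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

static unsigned sel_rpl(uint16_t sel)   { return sel & 3u; }          /* bits 1..0  */
static unsigned sel_ti(uint16_t sel)    { return (sel >> 2) & 1u; }   /* 0 = GDT, 1 = LDT */
static unsigned sel_index(uint16_t sel) { return (unsigned)sel >> 3; } /* bits 15..3 */

/* Byte offset of the selected 8-byte descriptor within its table. */
static uint32_t sel_table_offset(uint16_t sel)
{
    return (uint32_t)sel_index(sel) * 8u;
}

/* Build a selector from its fields; the index is limited to 8K entries. */
static uint16_t sel_make(unsigned index, unsigned ti, unsigned rpl)
{
    assert(index < 8192u && ti < 2u && rpl < 4u);
    return (uint16_t)((index << 3) | (ti << 2) | rpl);
}
```

Because the index occupies the upper 13 bits, masking off the low three bits of a selector gives the descriptor's byte offset directly.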


Segment Descriptor Cache 


In addition to the selector value, every segment reg- 
ister has a segment descriptor cache register asso- 
ciated with it. Whenever a segment register’s con- 
tents are changed, the 8-byte descriptor associated 
with that selector is automatically loaded (cached) 
on the chip. Once loaded, all references to that seg- 
ment use the cached descriptor information instead 
of reaccessing the descriptor. The contents of the 
descriptor cache are not visible to the programmer. 
Since descriptor caches only change when a segment
register is changed, programs which modify the
descriptor tables must reload the appropriate segment
registers after changing a descriptor’s value.
Type   Defines
0      Invalid
1      Reserved
2      LDT
3      Reserved
4      Reserved
5      Task Gate (80376/80386 Task)
6      Reserved
7      Reserved
8      Invalid
9      Available 80376/80386 TSS
A      Undefined (Intel Reserved)
B      Busy 80376/80386 TSS
C      80376/80386 Call Gate
D      Undefined (Intel Reserved)
E      80376/80386 Interrupt Gate
F      80376/80386 Trap Gate


Figure 3.5. System Descriptors
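

For software that builds or validates descriptor tables, the type codes can be captured in an enumeration. This is only a sketch: the identifier names are invented for illustration, and the values are the 4-bit type codes that the text lists as defined for the 80376 (2, 5, 9, B, C, E and F).

```c
#include <assert.h>

enum sys_desc_type {
    SYS_LDT            = 0x2,
    SYS_TASK_GATE      = 0x5,
    SYS_TSS_AVAILABLE  = 0x9,
    SYS_TSS_BUSY       = 0xB,
    SYS_CALL_GATE      = 0xC,
    SYS_INTERRUPT_GATE = 0xE,
    SYS_TRAP_GATE      = 0xF
};

/* Returns 1 if the 4-bit type code is one the 80376 defines, else 0. */
static int sys_type_is_defined(unsigned type)
{
    assert(type <= 0xFu);
    switch (type) {
    case SYS_LDT:
    case SYS_TASK_GATE:
    case SYS_TSS_AVAILABLE:
    case SYS_TSS_BUSY:
    case SYS_CALL_GATE:
    case SYS_INTERRUPT_GATE:
    case SYS_TRAP_GATE:
        return 1;
    default:
        return 0;
    }
}

/* The IDT may contain only task gates, interrupt gates and trap gates. */
static int sys_type_allowed_in_idt(unsigned type)
{
    return type == SYS_TASK_GATE
        || type == SYS_INTERRUPT_GATE
        || type == SYS_TRAP_GATE;
}

/* A TSS is marked busy by changing its type from 9 to 0BH. */
static unsigned tss_mark_busy(unsigned type)
{
    assert(type == SYS_TSS_AVAILABLE);
    return SYS_TSS_BUSY;
}
```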


Diagram labels: SELECTOR, SEGMENT REGISTER, DESCRIPTOR
NUMBER, LOCAL DESCRIPTOR TABLE (LDT), GLOBAL DESCRIPTOR
TABLE (GDT).

Figure 3.6. Example Descriptor Selection


3.3 Protection 


The 80376 offers extensive protection features. These
protection features are particularly useful in
sophisticated embedded applications which use
multitasking real-time operating systems. For simpler
embedded applications these protection capabilities can
be easily bypassed by making all applications run at
privilege level (PL) 0.


RULES OF PRIVILEGE


The 80376 controls access to both data and procedures
between levels of a task, according to the following
rules.


- Data stored in a segment with privilege level p
  can be accessed only by code executing at a
  privilege level at least as privileged as p.


- A code segment/procedure with privilege level p
  can only be called by a task executing at the
  same or a lesser privilege level than p.


PRIVILEGE LEVELS 


At any point in time, a task on the 80376 always
executes at one of the four privilege levels. The
Current Privilege Level (CPL) specifies what the task’s
privilege level is. A task’s CPL may only be changed by
control transfers through gate descriptors to a code
segment with a different privilege level. Thus, an
application program running at PL = 3 may call an
operating system routine at PL = 1 (via a gate) which
would cause the task’s CPL to be set to 1 until the
operating system routine was finished.


Selector Privilege (RPL) 


The privilege level of a selector is specified by the
RPL field. The selector’s RPL is only used to establish
a less trusted privilege level than the current privilege
level of the task for the use of a segment. This level is
called the task’s effective privilege level (EPL). The
EPL is defined as being the least privileged (numerically
larger) level of a task’s CPL and a selector’s RPL. The
RPL is most commonly used to verify that pointers passed
to an operating system procedure do not access data that
is of higher privilege than the procedure that originated
the pointer. Since the originator of a selector can
specify any RPL value, the Adjust RPL (ARPL) instruction
is provided to force the RPL bits to the originator’s
CPL.
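

The EPL rule and the effect of ARPL can be modeled in a few lines of C. This is a behavioral sketch only, not a description of the processor's internal logic. The function names are hypothetical, and adjust_rpl models ARPL as it behaves on 80386-family parts: the RPL is raised only when the selector claims more privilege than its originator has.

```c
#include <assert.h>
#include <stdint.h>

/* EPL: the least privileged (numerically larger) of CPL and RPL. */
static unsigned effective_pl(unsigned cpl, unsigned rpl)
{
    assert(cpl < 4u && rpl < 4u);
    return cpl > rpl ? cpl : rpl;
}

/* Model of ARPL: if the selector's RPL is more privileged than the
   originator's CPL, weaken it to the originator's CPL. */
static uint16_t adjust_rpl(uint16_t sel, unsigned originator_cpl)
{
    assert(originator_cpl < 4u);
    unsigned rpl = sel & 3u;
    if (rpl < originator_cpl)
        sel = (uint16_t)((sel & ~3u) | originator_cpl);
    return sel;
}
```

An operating system routine running at PL 0 that receives selector 0010H from a PL 3 caller would adjust it to 0013H, so later accesses through it are checked at the caller's privilege level.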


I/O Privilege


The I/O privilege level (IOPL) lets the operating system
code executing at CPL = 0 define the least privileged
level at which I/O instructions can be used. An exception
13 (General Protection Violation) is generated if an I/O
instruction is attempted when the CPL of the task is less
privileged than the IOPL. The IOPL is stored in bits 12
and 13 of the EFLAGS register. The following instructions
cause an exception 13 if the CPL is greater than IOPL:
IN, INS, OUT, OUTS, STI, CLI and LOCK prefix.


Descriptor Access 


There are basically two types of segment accesses:
those involving code segments such as control transfers,
and those involving data accesses. Determining the
ability of a task to access a segment involves the type
of segment to be accessed, the instruction used, the type
of descriptor used and CPL, RPL, and DPL as described
above.


Any time an instruction loads a data segment register
(DS, ES, FS, GS) the 80376 makes protection validation
checks. Selectors loaded in the DS, ES, FS, GS registers
must refer only to data segments or readable code
segments.


Finally the privilege validation checks are performed.
The CPL is compared to the EPL and if the EPL is more
privileged than the CPL, an exception 13 (general
protection fault) is generated.


The rules regarding the stack segment are slightly
different from those involving data segments.
Instructions that load selectors into SS must refer to
data segment descriptors for writable data segments. The
DPL and RPL must equal the CPL. All other descriptor
types, or a privilege level violation, will cause an
exception 13. A stack not present fault causes an
exception 12.


PRIVILEGE LEVEL TRANSFERS


Inter-segment control transfers occur when a selector is
loaded in the CS register. For a typical system most of
these transfers are simply the result of a call or a jump
to another routine. There are five types of control
transfers which are summarized in Table 3.2. Many of
these transfers result in a privilege level transfer.
Changing privilege levels is done only by control
transfers, using gates, task switches, and interrupt or
trap gates.


Control transfers can only occur if the operation which
loaded the selector references the correct descriptor
type. Any violation of these descriptor usage rules will
cause an exception 13.


CALL GATES 


Gates provide protected indirect CALLs. One of the
major uses of gates is to provide a secure method of
privilege transfers within a task. Since the operating
system defines all of the gates in a system, it can
ensure that all gates only allow entry into a few
trusted procedures.


Table 3.2. Descriptor Types Used for Control Transfer

Control Transfer Types         Operation Types         Descriptor      Descriptor
                                                       Referenced      Table
Intersegment within the same   JMP, CALL, RET, IRET*   Code Segment    GDT/LDT
privilege level
Intersegment to the same or    CALL                    Call Gate       GDT/LDT
higher privilege level
Interrupt within task may      Interrupt Instruction,  Trap or         IDT
change CPL                     Exception, External     Interrupt Gate
                               Interrupt
Intersegment to a lower        RET, IRET*              Code Segment    GDT/LDT
privilege level
(changes task CPL)
Task Switch                    CALL, JMP               Task State      GDT
                                                       Segment

* With NT = 0 (see TASK SWITCHING).


TASK SWITCHING


A very important attribute of any multi-tasking
operating system is its ability to rapidly switch between
tasks or processes. The 80376 directly supports this
operation by providing a task switch instruction in
hardware. The 80376 task switch operation saves the
entire state of the machine (all of the registers,
address space, and a link to the previous task), loads a
new execution state, performs protection checks, and
commences execution in the new task. Like transfer of
control by gates, the task switch operation is invoked by
executing an inter-segment JMP or CALL instruction which
refers to a Task State Segment (TSS), or a task gate
descriptor in the GDT or LDT. An INT n instruction,
exception, trap or external interrupt may also invoke the
task switch operation if there is a task gate descriptor
in the associated IDT descriptor slot. Simple
applications need not use the TSS or task switching. The
TSS will not be used, and no task switch will occur, if
no task gates are present in the GDT, LDT or IDT.


The TSS descriptor points to a segment (see Figure
3.7) containing the entire 80376 execution state. A
task gate descriptor contains a TSS selector. The
limit of an 80376 TSS must be greater than 64H, and
can be as large as 16 Mbytes. In the additional TSS
space, the operating system is free to store additional
information such as the reason the task is inactive, the
time the task has spent running, and open files
belonging to the task. For maximum performance, the TSS
should start on an even address.


Each task must have a TSS associated with it. The
current TSS is identified by a special register in the
80376 called the Task State Segment Register (TR).
This register contains a selector referring to the task
state segment descriptor that defines the current
TSS. A hidden base and limit register associated
with the TSS descriptor is loaded whenever TR is
loaded with a new selector. Returning from a task is
accomplished by the IRET instruction. When IRET is
executed, control is returned to the task which was
interrupted. The currently executing task’s state is
saved in the TSS and the old task state is restored
from its TSS.


Several bits in the flag register and CR0 register give
information about the state of a task which is useful 
to the operating system. The Nested Task bit, NT, 
controls the function of the IRET instruction. If NT = 
0 the IRET instruction performs the regular return. If 
NT = 1, IRET performs a task switch operation 
back to the previous task. The NT bit is set or reset 
in the following fashion: 


When a CALL or INT instruction initiates a task 
switch, the new TSS will be marked busy and 
the back link field of the new TSS set to the old 
TSS selector. The NT bit of the new task is set 
by CALL or INT initiated task switches. An inter- 
rupt that does not cause a task switch will clear 
NT (The NT bit will be restored after execution 
of the interrupt handler). NT may also be set or 
cleared by POPF or IRET instructions. 


The 80376 task state segment is marked busy by 
changing the descriptor type field from TYPE 9 to 
TYPE 0BH. Use of a selector that references a busy
task state segment causes an exception 13. 


The coprocessor’s state is not automatically saved 
when a task switch occurs. The Task Switched Bit, 
TS, in the CR0 register helps deal with the
coprocessor’s state in a multi-tasking environment. Whenever
the 80376 switches tasks, it sets the TS bit. The 
80376 detects the first use of a processor extension 
instruction after a task switch and causes the proc- 
essor extension not available exception 7. The ex- 
ception handler for exception 7 may then decide 
whether to save the state of the coprocessor. 


The T bit in the 80376 TSS indicates that the proc- 
essor should generate a debug exception when 
switching to a task. If T = 1 then upon entry to a 
new task a debug exception 1 will be generated. 


PROTECTION AND I/O PERMISSION BIT MAP


The I/O instructions that directly refer to addresses
in the processor’s I/O space are IN, INS, OUT and
OUTS. The 80376 has the ability to selectively trap
references to specific I/O addresses. The structure
that enables selective trapping is the I/O Permission
Bit Map in the TSS segment (see Figures 3.7 and 3.8).
The I/O permission map is a bit vector. The size of the
map and its location in the TSS segment are variable.
The processor locates the I/O permission map by means of
the I/O map base field in the fixed portion of the TSS.
The I/O map base field is 16 bits wide and contains the
offset of the beginning of the I/O permission map.


If an I/O instruction (IN, INS, OUT or OUTS) is
encountered, the processor first checks whether
CPL ≤ IOPL. If this condition is true, the I/O operation
may proceed. If not true, the processor checks the I/O
permission map.


Each bit in the map corresponds to an I/O port byte
address; for example, the bit for port 41 is found at
I/O map base + 5 linearly (5 × 8 = 40), bit offset
1. The processor tests all the bits that correspond to
the I/O addresses spanned by an I/O operation; for
example, a double word operation tests four bits
corresponding to four adjacent byte addresses. If any
tested bit is set, the processor signals a general
protection exception. If all the tested bits are zero, the
I/O operation may proceed.
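

The two-stage check described above can be sketched in C as follows. This models the decision only. The function name, the map_bytes parameter (the number of bytes of mapping information the operating system placed in the TSS) and the calling convention are assumptions made for illustration. Ports that fall outside the map are denied, matching the rule that unmapped addresses behave as if their bits were one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if the access may proceed, 0 if it would raise a general
   protection exception. bitmap points at the I/O map base. */
static int io_access_allowed(unsigned cpl, unsigned iopl,
                             const uint8_t *bitmap, size_t map_bytes,
                             uint16_t port, unsigned width)
{
    assert(width == 1u || width == 2u || width == 4u);

    if (cpl <= iopl)                 /* privileged enough: map not consulted */
        return 1;

    for (unsigned i = 0; i < width; i++) {
        uint32_t p = (uint32_t)port + i;   /* each byte address spanned */
        size_t byte = p / 8u;
        unsigned bit = p % 8u;
        if (byte >= map_bytes)             /* outside the map: treated as 1 */
            return 0;
        if (bitmap[byte] & (1u << bit))    /* set bit: trap */
            return 0;
    }
    return 1;
}
```

For port 41 the loop looks at byte 5, bit 1 of the map, which is the example worked in the text.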


It is not necessary for the I/O permission map to
represent all the I/O addresses. I/O addresses not
spanned by the map are treated as if they had one-bits
in the map. The I/O map base should be at least one byte
less than the TSS limit, and the last byte beyond the
I/O mapping information must contain all 1’s.


Because the I/O permission map is in the TSS segment,
different tasks can have different maps. Thus, the
operating system can allocate ports to a task by
changing the I/O permission map in the task’s TSS.


IMPORTANT IMPLEMENTATION NOTE:
Beyond the last byte of I/O mapping information in
the I/O permission bit map must be a byte containing
all 1’s. The byte of all 1’s must be within the
limit of the 80376’s TSS segment (see Figure 3.7).


4.0 FUNCTIONAL DATA 


The Intel 80376 embedded processor features a 
straightforward functional interface to the external 
hardware. The 80376 has separate parallel buses 
for data and address. The data bus is 16 bits in 
width, and bidirectional. The address bus outputs 
24-bit address values using 23 address lines and 
two byte enable signals.


The 80376 has two selectable address bus cycles:
pipelined and non-pipelined. The pipelining option
allows as much time as possible for data access by
starting the pending bus cycle before the present
bus cycle is finished. A non-pipelined bus cycle
gives the highest bus performance by executing every
bus cycle in two processor clock cycles. For
maximum design flexibility, the address pipelining
option is selectable on a cycle-by-cycle basis.


Signal groups labeled in the figure: 2X CLOCK, ADDRESS
BUS (A23-A1 with byte enables BHE and BLE, forming a
24-bit address), BUS CYCLE DEFINITION, COPROCESSOR
SIGNALLING, POWER CONNECTIONS.

Figure 4.1. Functional Signal Groups


The processor’s bus cycle is the basic mechanism 
for information transfer, either from system to proc- 
essor, or from processor to system. 80376 bus cy- 
cles perform data transfer in a minimum of only two 
clock periods. On a 16-bit data bus, the maximum 
80376 transfer bandwidth at 16 MHz is therefore 
16 Mbytes/sec. However, any bus cycle will be ex- 
tended for more than two clock periods if external 
hardware withholds acknowledgement of the cycle. 
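

The 16 Mbytes/sec figure follows from simple arithmetic: at 16 MHz there are 16 million processor clocks per second, each bus cycle takes at least two of them, and each cycle moves two bytes. The helper below (a hypothetical name, shown only to make the calculation explicit) also shows how added wait states reduce the peak rate.

```c
#include <assert.h>
#include <stdint.h>

/* Peak transfer rate in bytes per second when every bus cycle moves
   bus_bytes and lasts clocks_per_cycle processor clock periods. */
static uint64_t peak_bandwidth(uint64_t clock_hz, unsigned bus_bytes,
                               unsigned clocks_per_cycle)
{
    assert(clocks_per_cycle >= 2u);   /* a bus cycle is at least two clocks */
    return clock_hz * bus_bytes / clocks_per_cycle;
}
```

For example, peak_bandwidth(16000000, 2, 2) gives 16,000,000 bytes per second, while one wait state per cycle (three clocks) lowers the result to roughly 10.7 million.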


The 80376 can relinquish control of its local buses
to allow mastership by other devices, such as direct
memory access (DMA) channels. When relinquished, HLDA
is the only output pin driven by the 80376, providing
near-complete isolation of the processor from its system
(all other output pins are in a float condition).


4.1 Signal Description Overview 


The following is a brief description of the 80376 input
and output signals arranged by functional groups.


The signal descriptions sometimes refer to A.C. timing
parameters, such as “t25 Reset Setup Time” and
“t26 Reset Hold Time.” The values of these parameters
can be found in Tables 6.4 and 6.5.


CLOCK (CLK2) 


CLK2 provides the fundamental timing for the
80376. It is divided by two internally to generate the
internal processor clock used for instruction
execution. The internal clock is comprised of two
phases, “phase one” and “phase two”. Each CLK2
period is a phase of the internal clock. Figure 4.2
illustrates the relationship. If desired, the phase of
the internal processor clock can be synchronized to
a known phase by ensuring the falling edge of the
RESET signal meets the applicable setup and hold
times t25 and t26.


Figure 4.2. CLK2 Signal and Internal Processor Clock


DATA BUS (D15-D0)


These three-state bidirectional signals provide the
general purpose data path between the 80376 and
other devices. The data bus outputs are active HIGH
and will float during bus hold acknowledge. Data bus
reads require that read-data setup and hold times
t21 and t22 be met relative to CLK2 for correct
operation.


ADDRESS BUS (BHE, BLE, A23-A1)


These three-state outputs provide physical memory
addresses or I/O port addresses. A23-A16 are LOW
during I/O transfers except for I/O transfers
automatically generated by coprocessor instructions.
During coprocessor I/O transfers, A22-A16 are driven
LOW, and A23 is driven HIGH so that this address line
can be used by external logic to generate the
coprocessor select signal. Thus, the I/O address driven
by the 80376 for coprocessor commands is 8000F8H, and
the I/O address driven by the 80376 processor for
coprocessor data is 8000FCH or 8000FEH.


The address bus is capable of addressing 16 Mbytes
of physical memory space (000000H through
0FFFFFFH), and 64 Kbytes of I/O address space
(000000H through 00FFFFH) for programmed I/O.
The address bus is active HIGH and will float during
bus hold acknowledge.


The Byte Enable outputs BHE and BLE directly indicate
which bytes of the 16-bit data bus are involved with
the current transfer. BHE applies to D15-D8 and BLE
applies to D7-D0. If both BHE and BLE are asserted,
then 16 bits of data are being transferred. See Table
4.1 for a complete decoding of these signals. The byte
enables are active LOW and will float during bus hold
acknowledge.
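

External decode logic or a bus monitor might interpret the two byte enables as sketched below. The arguments are pin levels (0 means asserted, since the signals are active LOW). The enum and function names are invented for this example, and the case with neither enable asserted is included only for completeness as an assumption; the table entries visible here do not list it.

```c
#include <assert.h>

enum xfer_kind {
    XFER_WORD,        /* both bytes, D15-D0      */
    XFER_UPPER_BYTE,  /* D15-D8 only             */
    XFER_LOWER_BYTE,  /* D7-D0 only              */
    XFER_NONE         /* neither enable asserted */
};

/* bhe_n and ble_n are the logic levels on the BHE and BLE pins. */
static enum xfer_kind decode_byte_enables(int bhe_n, int ble_n)
{
    assert((bhe_n == 0 || bhe_n == 1) && (ble_n == 0 || ble_n == 1));
    if (!bhe_n && !ble_n)
        return XFER_WORD;
    if (!bhe_n)
        return XFER_UPPER_BYTE;
    if (!ble_n)
        return XFER_LOWER_BYTE;
    return XFER_NONE;
}
```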


Table 4.1. Byte Enable Definitions

BHE   BLE   Function
0     0     Word Transfer
0     1     Byte Transfer on Upper Byte of the Data Bus, D15-D8
1     0     Byte Transfer on Lower Byte of the Data Bus, D7-D0


BUS CYCLE DEFINITION SIGNALS
(W/R, D/C, M/IO, LOCK) 


These three-state outputs define the type of bus cy- 
cle being performed: W/R distinguishes between 
write and read cycles, D/C distinguishes between 
data and control cycles, M/IO distinguishes between 
memory and I/O cycles, and LOCK distinguishes be- 
tween locked and unlocked bus cycles. All of these 
signals are active LOW and will float during bus hold
acknowledge.


The primary bus cycle definition signals are W/R, 
D/C and M/IO, since these are the signals driven 
valid as ADS (Address Status output) becomes ac- 
tive. The LOCK signal is driven valid at the same 
time the bus cycle begins, which due to address 
pipelining, could be after ADS becomes active. Ex- 
act bus cycle definitions, as a function of W/R, D/C 
and M/IO are given in Table 4.2. 


LOCK indicates that other system bus masters are
not to gain control of the system bus while it is
active. LOCK is activated on the CLK2 edge that begins
the first locked bus cycle (i.e., it is not active at
the same time as the other bus cycle definition pins)
and is deactivated when READY is returned at the end
of the last bus cycle which is to be locked. The
beginning of a bus cycle is determined when READY is
returned in a previous bus cycle and another is
pending (ADS is active), or by the clock in which ADS is
driven active if the bus was idle. This means that LOCK
follows the write data rules more closely for when it
is valid, but may cause the bus to be locked longer
than desired. The LOCK signal may be explicitly
activated by the LOCK prefix on certain instructions.
LOCK is always asserted when executing the XCHG
instruction, during descriptor updates, and during the
interrupt acknowledge sequence.


Table 4.2. Bus Cycle Definition

M/IO  D/C  W/R  Bus Cycle Type                            Locked?
0     0    0    INTERRUPT ACKNOWLEDGE                     Yes
0     0    1    Does Not Occur
0     1    0    I/O DATA READ                             No
0     1    1    I/O DATA WRITE                            No
1     0    0    MEMORY CODE READ                          No
1     0    1    HALT: Address = 2, BHE = 1, BLE = 0       No
                SHUTDOWN: Address = 0, BHE = 1, BLE = 0
1     1    0    MEMORY DATA READ                          Some Cycles
1     1    1    MEMORY DATA WRITE                         Some Cycles
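

A logic analyzer script or simulation model could classify bus cycles from the three definition pins as sketched here. The sketch assumes the usual 80386-family encoding in which M/IO, D/C and W/R, taken in that order, form a 3-bit code. The enum and function names are hypothetical. Halt and shutdown share one code and are told apart by the address driven.

```c
#include <assert.h>

enum bus_cycle {
    BC_INTERRUPT_ACK,     /* M/IO=0 D/C=0 W/R=0 */
    BC_DOES_NOT_OCCUR,    /* 0 0 1 */
    BC_IO_DATA_READ,      /* 0 1 0 */
    BC_IO_DATA_WRITE,     /* 0 1 1 */
    BC_MEM_CODE_READ,     /* 1 0 0 */
    BC_HALT_OR_SHUTDOWN,  /* 1 0 1 */
    BC_MEM_DATA_READ,     /* 1 1 0 */
    BC_MEM_DATA_WRITE     /* 1 1 1 */
};

/* Arguments are the sampled pin levels (0 or 1). */
static enum bus_cycle decode_bus_cycle(int m_io, int d_c, int w_r)
{
    assert((m_io == 0 || m_io == 1) &&
           (d_c  == 0 || d_c  == 1) &&
           (w_r  == 0 || w_r  == 1));
    return (enum bus_cycle)((m_io << 2) | (d_c << 1) | w_r);
}

/* Halt drives address 2; shutdown drives address 0. */
static int is_halt(enum bus_cycle c, unsigned long address)
{
    return c == BC_HALT_OR_SHUTDOWN && address == 2ul;
}

static int is_shutdown(enum bus_cycle c, unsigned long address)
{
    return c == BC_HALT_OR_SHUTDOWN && address == 0ul;
}
```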




BUS CONTROL SIGNALS 
(ADS, READY, NA) 


The following signals allow the processor to indicate 
when a bus cycle has begun, and allow other system 
hardware to control address pipelining and bus cycle 
termination. 


Address Status (ADS) 


This three-state output indicates that a valid bus
cycle definition and address (W/R, D/C, M/IO, BHE,
BLE and A23-A1) are being driven at the 80376
pins. ADS is an active LOW output. Once ADS is
driven active, valid address, byte enables, and defi- 
nition signals will not change. In addition, ADS will 
remain active until its associated bus cycle begins 
(when READY is returned for the previous bus cycle 
when running pipelined bus cycles). ADS will float 
during bus hold acknowledge. See sections Non- 
Pipelined Bus Cycles and Pipelined Bus Cycles 
for additional information on how ADS is asserted 
for different bus states. 


Transfer Acknowledge (READY) 


This input indicates the current bus cycle is com- 
plete, and the active bytes indicated by BHE and 
BLE are accepted or provided. When READY is 
sampled active during a read cycle or interrupt ac- 
knowledge cycle, the 80376 latches the input data 
and terminates the cycle. When READY is sampled 
active during a write cycle, the processor terminates 
the bus cycle. 


READY is ignored on the first bus state of all bus
cycles, and sampled each bus state thereafter until
asserted. READY must eventually be asserted to
acknowledge every bus cycle, including Halt Indication
and Shutdown Indication bus cycles. When being
sampled, READY must always meet setup and hold
times t19 and t20 for correct operation.


Next Address Request (NA)


This is used to request pipelining. This input
indicates the system is prepared to accept new values
of BHE, BLE, A23-A1, W/R, D/C and M/IO from the
80376 even if the end of the current cycle is not
being acknowledged on READY. If this input is active
when sampled, the next bus cycle’s address and
status signals are driven onto the bus, provided the
next bus request is already pending internally. NA is
ignored in clock cycles in which ADS or READY is
activated. This signal is active LOW and must satisfy
setup and hold times t15 and t16 for correct
operation. See Pipelined Bus Cycles and Read and
Write Cycles for additional information.


BUS ARBITRATION SIGNALS (HOLD, HLDA) 


This section describes the mechanism by which the
processor relinquishes control of its local buses
when requested by another bus master device. See
Entering and Exiting Hold Acknowledge for additional
information.


Bus Hold Request (HOLD) 


This input indicates some device other than the
80376 requires bus mastership. When control is
granted, the 80376 floats A23-A1, BHE, BLE,
D15-D0, LOCK, M/IO, D/C, W/R and ADS, and
then activates HLDA, thus entering the bus hold
acknowledge state. The local bus will remain granted
to the requesting master until HOLD becomes
inactive. When HOLD becomes inactive, the 80376 will
deactivate HLDA and drive the local bus (at the
same time), thus terminating the hold acknowledge
condition.


HOLD must remain asserted as long as any other
device is a local bus master. External pull-up
resistors may be required when in the hold acknowledge
state since none of the 80376 floated outputs have
internal pull-up resistors. See Resistor
Recommendations for additional information. HOLD is not
recognized while RESET is active but is recognized
during the time between the high-to-low transition of
RESET and the first instruction fetch. If RESET is
asserted while HOLD is asserted, RESET has priority
and places the bus into an idle state, rather than
the hold acknowledge (high-impedance) state.


HOLD is a level-sensitive, active HIGH, synchronous
input. HOLD signals must always meet setup and
hold times t23 and t24 for correct operation.


Bus Hold Acknowledge (HLDA) 


When active (HIGH), this output indicates the 80376
has relinquished control of its local bus in response
to an asserted HOLD signal, and is in the bus Hold
Acknowledge State.


The Bus Hold Acknowledge state offers near-complete
signal isolation. In the Hold Acknowledge state, HLDA
is the only signal being driven by the 80376. The other
output signals or bidirectional signals (D15-D0, BHE,
BLE, A23-A1, W/R, D/C, M/IO, LOCK and ADS) are in a
high-impedance state so the requesting bus master may
control them. These pins remain OFF throughout the time
that HLDA remains active (see Table 4.3). Pull-up
resistors may be desired on several signals to avoid
spurious activity when no bus master is driving them.
See Resistor Recommendations for additional
information.


When the HOLD signal is made inactive, the 80376
will deactivate HLDA and drive the bus. One rising
edge on the NMI input is remembered for processing
after the HOLD input is negated.


Table 4.3. Output Pin State during HOLD

Pin Value   Pin Names
1           HLDA
Float       LOCK, M/IO, D/C, W/R, ADS, A23-A1, BHE, BLE,
            D15-D0


Hold Latencies 


The maximum possible HOLD latency depends on the
software being executed. The actual HOLD latency at any
time depends on the current bus activity, the state of
the LOCK signal (internal to the CPU) activated by the
LOCK prefix, and interrupts. The 80376 will not honor a
HOLD request until the current bus operation is
complete.


The 80376 breaks 32-bit data or I/O accesses into 2
internally locked 16-bit bus cycles; the LOCK signal
is not asserted. The 80376 breaks unaligned 16-bit
or 32-bit data or I/O accesses into 2 or 3 internally
locked 16-bit bus cycles. Again the LOCK signal is
not asserted but a HOLD request will not be
recognized until the end of the entire transfer.


Wait states affect HOLD latency. The 80376 will not
honor a HOLD request until the end of the current
bus operation, no matter how many wait states are
required. Systems with DMA where data transfer is
critical must ensure that READY returns sufficiently
soon.


COPROCESSOR INTERFACE SIGNALS 
(PEREQ, BUSY, ERROR) 


The following sections describe the signals dedicated
to the numeric coprocessor interface. In addition to
the data bus, address bus, and bus cycle definition
signals, these signals control communication between
the 80376 and the 80387SX processor extension.


Coprocessor Request (PEREQ) 


When asserted (HIGH), this input signal indicates a coprocessor
request for a data operand to be transferred to/from memory by the
80376. In response, the 80376 transfers information between the
coprocessor and memory. Because the 80376 has internally stored the
coprocessor opcode being executed, it performs the requested data
transfer with the correct direction and memory address.


PEREQ is a level-sensitive, active HIGH asynchronous signal. Setup and
hold times, t29 and t30, relative to the CLK2 signal must be met to
guarantee recognition at a particular clock edge. This signal is
provided with a weak internal pull-down resistor of around 20 KΩ to
ground so that it will not float active when left unconnected.


Coprocessor Busy (BUSY) 


When asserted (LOW), this input indicates the coprocessor is still
executing an instruction, and is not yet able to accept another. When
the 80376 encounters any coprocessor instruction which operates on the
numerics stack (e.g. load, pop, or arithmetic operation), or the WAIT
instruction, this input is first automatically sampled until it is
seen to be inactive. This sampling of the BUSY input prevents
overrunning the execution of a previous coprocessor instruction.


The F(N)INIT and F(N)CLEX coprocessor instructions are allowed to
execute even if BUSY is active, since these instructions are used for
coprocessor initialization and exception-clearing.


BUSY is an active LOW, level-sensitive asynchronous signal. Setup and
hold times, t29 and t30, relative to the CLK2 signal must be met to
guarantee recognition at a particular clock edge. This pin is provided
with a weak internal pull-up resistor of around 20 KΩ to VCC so that it
will not float active when left unconnected.


BUSY serves an additional function. If BUSY is sampled LOW at the
falling edge of RESET, the 80376 processor performs an internal
self-test (see Bus Activity During and Following Reset). If BUSY is
sampled HIGH, no self-test is performed.


Coprocessor Error (ERROR) 


When asserted (LOW), this input signal indicates 
that the previous coprocessor instruction generated 
a coprocessor error of a type not masked by the 
coprocessor’s control register. This input is automat- 
ically sampled by the 80376 when a coprocessor 
instruction is encountered, and if active, the 80376 
generates exception 16 to access the error-handling 
software. 


Several coprocessor instructions, generally those which clear the
numeric error flags in the coprocessor or save coprocessor state, do
execute without the 80376 generating exception 16 even if ERROR is
active. These instructions are FNINIT, FNCLEX, FNSTSW, FNSTSW AX,
FNSTCW, FNSTENV and FNSAVE.
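

The ERROR sampling rule above can be summarized as a small decision
function. This is a minimal sketch, not a model of the processor: the
function and constant names are hypothetical, the caller is assumed to
pass only coprocessor instruction mnemonics, and FNSTSW AX is treated
as a form of FNSTSW.

```python
# Instructions the text lists as executing even while ERROR is active.
EXEMPT_FROM_EXCEPTION_16 = frozenset(
    {"FNINIT", "FNCLEX", "FNSTSW", "FNSTCW", "FNSTENV", "FNSAVE"}
)


def raises_exception_16(mnemonic, error_asserted):
    """Return True if a coprocessor instruction would trigger exception 16.

    mnemonic: coprocessor instruction name, e.g. "FADD" (operands omitted).
    error_asserted: True when the ERROR input is sampled active (LOW).
    """
    if not error_asserted:
        return False
    return mnemonic.upper() not in EXEMPT_FROM_EXCEPTION_16


print(raises_exception_16("FADD", True))
print(raises_exception_16("FNSAVE", True))
```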


ERROR is an active LOW, level-sensitive asynchronous signal. Setup and
hold times, t29 and t30, relative to the CLK2 signal must be met to
guarantee recognition at a particular clock edge. This pin is provided
with a weak internal pull-up resistor of around 20 KΩ to VCC so that it
will not float active when left unconnected.


INTERRUPT SIGNALS (INTR, NMI, RESET)


The following descriptions cover inputs that can interrupt or suspend
execution of the processor's current instruction stream.


Maskable Interrupt Request (INTR) 


When asserted, this input indicates a request for interrupt service,
which can be masked by the 80376 Flag Register IF bit. When the 80376
responds to the INTR input, it performs two interrupt acknowledge bus
cycles and, at the end of the second, latches an 8-bit interrupt
vector on D7-D0 to identify the source of the interrupt.


INTR is an active HIGH, level-sensitive asynchronous signal. Setup and
hold times, t27 and t28, relative to the CLK2 signal must be met to
guarantee recognition at a particular clock edge. To assure
recognition of an INTR request, INTR should remain active until the
first interrupt acknowledge bus cycle begins. INTR is sampled at the
beginning of every instruction. In order to be recognized at a
particular instruction boundary, INTR must be active at least eight
CLK2 clock periods before the beginning of the execution of the
instruction. If recognized, the 80376 will begin execution of the
interrupt.


Non-Maskable Interrupt Request (NMI)


This input indicates a request for interrupt service which cannot be
masked by software. The non-maskable interrupt request is always
processed according to the pointer or gate in slot 2 of the interrupt
table. Because of the fixed NMI slot assignment, no interrupt
acknowledge cycles are performed when processing NMI.


NMI is an active HIGH, rising edge-sensitive asynchronous signal.
Setup and hold times, t27 and t28, relative to the CLK2 signal must be
met to guarantee recognition at a particular clock edge. To assure
recognition of NMI, it must be inactive for at least eight CLK2
periods, and then be active for at least eight CLK2 periods before the
beginning of the execution of an instruction.
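

The NMI pulse-width requirement can be checked against a sampled
waveform. The sketch below is an illustration under stated
assumptions: the function name is hypothetical, the input is a list
with one 0/1 sample per CLK2 period, and setup and hold times around
individual clock edges are not modeled.

```python
def nmi_pulse_ok(samples, min_periods=8):
    """Return True if the waveform contains a qualifying NMI rising edge.

    A qualifying edge has at least min_periods consecutive inactive (0)
    samples before it and at least min_periods consecutive active (1)
    samples starting at it.
    """
    low_run = 0
    i = 0
    n = len(samples)
    while i < n:
        if samples[i] == 0:
            low_run += 1
            i += 1
            continue
        # Measure the length of this run of active samples.
        high_run = 0
        while i < n and samples[i] == 1:
            high_run += 1
            i += 1
        if low_run >= min_periods and high_run >= min_periods:
            return True
        low_run = 0  # the inactive run must restart after a short pulse
    return False


print(nmi_pulse_ok([0] * 8 + [1] * 8))
```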


Once NMI processing has begun, no additional NMIs are processed until
after the next IRET instruction, which is typically the end of the NMI
service routine. If NMI is re-asserted prior to that time, however,
one rising edge on NMI will be remembered for processing after
executing the next IRET instruction.


Interrupt Latency 


The time that elapses before an interrupt request is serviced
(interrupt latency) varies according to several factors. This delay
must be taken into account by the interrupt source. Any of the
following factors can affect interrupt latency:


1. If interrupts are masked, an INTR request will
not be recognized until interrupts are reenabled.


2. If an NMI is currently being serviced, an incoming 
NMI request will not be recognized until the 80376 
encounters the IRET instruction. 


3. An interrupt request is recognized only on an in- 
struction boundary of the 80376 Execution Unit 
except for the following cases: 


— Repeat string instructions can be interrupted 
after each iteration. 


— If the instruction loads the Stack Segment reg- 
ister, an interrupt is not processed until after 
the following instruction, which should be an 
ESP load. This allows the entire stack pointer 
to be loaded without interruption. 


— If an instruction sets the interrupt flag (enabling
interrupts), an interrupt is not processed until
after the next instruction.


The longest latency occurs when the interrupt re- 
quest arrives while the 80376 processor is exe- 
cuting a long instruction such as multiplication, di- 
vision or a task-switch. 


4. Saving the Flags register and CS:EIP registers. 


5. If the interrupt service routine requires a task switch,
time must be allowed for the task switch.


6. If the interrupt service routine saves registers that
are not automatically saved by the 80376.


RESET 


This input signal suspends any operation in progress and places the
80376 in a known reset state. The 80376 is reset by asserting RESET
for 15 or more CLK2 periods (80 or more CLK2 periods before requesting
self-test). When RESET is active, all other input pins except FLT are
ignored, and all other bus pins are driven to an idle bus state as
shown in Table 4.4. If RESET and HOLD are both active at the same
time, RESET takes priority even if the 80376 was in a Hold Acknowledge
state before RESET became active.


RESET is an active HIGH, level-sensitive synchronous signal. Setup and
hold times, t25 and t26, must be met in order to assure proper
operation of the 80376.
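

The minimum RESET pulse widths translate directly into time once the
CLK2 frequency is known. The following sketch is illustrative: the
function name is hypothetical and the 32 MHz CLK2 value is only an
example input, not a figure taken from this section.

```python
def min_reset_ns(clk2_mhz, self_test=False):
    """Minimum RESET assertion time in nanoseconds.

    Uses 15 CLK2 periods for a plain reset and 80 CLK2 periods when a
    self-test will be requested, as stated in the RESET description.
    """
    if clk2_mhz <= 0:
        raise ValueError("CLK2 frequency must be positive")
    periods = 80 if self_test else 15
    return periods * 1000.0 / clk2_mhz


print(min_reset_ns(32))                  # plain reset
print(min_reset_ns(32, self_test=True))  # reset before self-test
```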


Table 4.4. Pin State (Bus Idle) during RESET


4.2 Bus Transfer Mechanism 


All data transfers occur as a result of one or more 
bus cycles. Logical data operands of byte and word 
lengths may be transferred without restrictions on 
physical address alignment. Any byte boundary may 
be used, although two physical bus cycles are per- 
formed as required for unaligned operand transfers. 


The 80376 processor address signals are designed 
to simplify external system hardware. BHE and BLE 
provide linear selects for the two bytes of the 16-bit 
data bus. 


Byte Enable outputs BHE and BLE are asserted 
when their associated data bus bytes are involved 
with the present bus cycle, as listed in Table 4.5. 


Table 4.5. Byte Enables and Associated
Data and Operand Bytes

Byte Enable Signal   Associated Data Bus Signals
------------------   -----------------------------------
BHE                  D15-D8 (Byte 1, Most Significant)
BLE                  D7-D0  (Byte 0, Least Significant)
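

The byte enable rule can be expressed as a function of the byte
address and the number of bytes moved in one bus cycle. This is a
sketch with a hypothetical function name; it reports logical assertion
(True = asserted) and does not model pin polarity or timing.

```python
def byte_enables(addr, nbytes):
    """Return (bhe, ble) for a single 16-bit bus cycle.

    addr: byte address of the first byte transferred in this cycle.
    nbytes: 1 or 2 bytes transferred within one 16-bit word.
    """
    if nbytes not in (1, 2):
        raise ValueError("a 16-bit bus cycle moves 1 or 2 bytes")
    odd = bool(addr & 1)
    if nbytes == 2 and odd:
        raise ValueError("two bytes at an odd address span two words")
    ble = not odd                 # D7-D0 carries the even-addressed byte
    bhe = odd or nbytes == 2      # D15-D8 carries the odd-addressed byte
    return bhe, ble


print(byte_enables(0x100, 2))
```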


Each bus cycle is composed of at least two bus states. Each bus state
requires one processor clock period. Additional bus states added to a
single bus cycle are called wait states. See Bus Functional
Description for additional information.


4.3 Memory and I/O Spaces 


Bus cycles may access physical memory space or I/O space. Peripheral
devices in the system may be memory-mapped, I/O-mapped, or both. As
shown in Figure 4.3, physical memory addresses range from 000000H to
0FFFFFFH (16 Mbytes) and I/O addresses from 000000H to 00FFFFH
(64 Kbytes). Note the I/O addresses used by the automatic I/O cycles
for coprocessor communication are 8000F8H to 8000FFH, beyond the
address range of programmed I/O, to allow easy generation of a
coprocessor chip select signal using the A23 and M/IO signals.
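

The coprocessor select decode described above reduces to two signals.
The sketch below uses hypothetical function names and plain integers
for signal levels (1 = HIGH, 0 = LOW); it is an illustration of the
decode, not a description of any particular external logic.

```python
def a23_level(address):
    """Level of address line A23 for a 24-bit address."""
    return (address >> 23) & 1


def coprocessor_select(a23, m_io):
    """True when the cycle targets the coprocessor I/O range.

    a23: level of A23; m_io: level of M/IO (LOW selects I/O space).
    """
    return a23 == 1 and m_io == 0


# Automatic coprocessor I/O cycles use 8000F8H to 8000FFH.
print(coprocessor_select(a23_level(0x8000F8), 0))
# Programmed I/O stays below 010000H, so A23 is LOW.
print(coprocessor_select(a23_level(0x00FFFF), 0))
```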


OPERAND ALIGNMENT 


With the flexibility of memory addressing on the 80376, it is possible
to transfer a logical operand that spans more than one physical Dword
or word of memory or I/O. Examples are 32-bit Dword or 16-bit word
operands beginning at addresses not evenly divisible by 2.


Operand alignment and size dictate when multiple 
bus cycles are required. Table 4.6 describes the 
transfer cycles generated for all combinations of log- 
ical operand lengths and alignment. 
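

How an operand breaks into 16-bit bus cycles can be sketched as
follows. The function name is hypothetical, and the cycles are listed
in ascending address order purely for illustration; the order in which
the 80376 actually issues them is defined by Table 4.6 and is not
modeled here.

```python
def split_into_bus_cycles(addr, size):
    """Split an operand into (byte_address, byte_count) bus cycles.

    addr: byte address of the operand; size: 1, 2 or 4 bytes.
    Each cycle stays within one aligned 16-bit word.
    """
    if size not in (1, 2, 4):
        raise ValueError("operand size must be 1, 2 or 4 bytes")
    cycles = []
    end = addr + size
    while addr < end:
        word_end = (addr & ~1) + 2      # first address past this word
        count = min(end, word_end) - addr
        cycles.append((addr, count))
        addr += count
    return cycles


print(split_into_bus_cycles(0x101, 4))
```

An aligned Dword yields two word cycles, while a Dword at an odd
address yields three cycles (byte, word, byte), matching the "2 or 3"
cycles mentioned under Hold Latencies.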


Table 4.6. Transfer Bus Cycles 
for Bytes, Words and Dwords 


Key: b = byte transfer
     w = word transfer
     l = low-order portion
     m = mid-order portion
     x = don't care
     h = high-order portion


NOTE:
Since A23 is HIGH during automatic communication with the coprocessor,
A23 HIGH and M/IO LOW can be used to easily generate a coprocessor
select signal.


Figure 4.3. Physical Memory and I/O Spaces


4.4 Bus Functional Description


The 80376 has separate, parallel buses for data and address. The data
bus is 16 bits in width, and bidirectional. The address bus provides a
24-bit value using 23 signals for the 23 upper-order address bits and
2 Byte Enable signals to directly indicate the active bytes. These
buses are interpreted and controlled by several definition signals.


The definition of each bus cycle is given by three signals: M/IO, W/R
and D/C. At the same time, a valid address is present on the byte
enable signals, BHE and BLE, and the other address signals A23-A1. A
status signal, ADS, indicates when the 80376 issues a new bus cycle
definition and address.


Collectively, the address bus, data bus and all associated control
signals are referred to simply as "the bus". When active, the bus
performs one of the bus cycles below:


1. Read from memory space

2. Locked read from memory space

3. Write to memory space

4. Locked write to memory space

5. Read from I/O space (or coprocessor)

6. Write to I/O space (or coprocessor)


7. Interrupt acknowledge (always locked) 


8. Indicate halt, or indicate shutdown 


Table 4.2 shows the encoding of the bus cycle definition signals for
each bus cycle. See Bus Cycle Definition Signals for additional
information.


When the 80376 bus is not performing one of the activities listed
above, it is either Idle or in the Hold Acknowledge state, which may
be detected by external circuitry. The idle state can be identified by
the 80376 giving no further assertions on its address strobe output
(ADS) since the beginning of its most recent bus cycle, and the most
recent bus cycle having been terminated. The hold acknowledge state is
identified by the 80376 asserting its hold acknowledge (HLDA) output.


The shortest time unit of bus activity is a bus state. A 
bus state is one processor clock period (two CLK2 
periods) in duration. A complete data transfer occurs 
during a bus cycle, composed of two or more bus 
states. 
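

The relationship between CLK2 periods, bus states and bus cycles can
be turned into a quick timing estimate. This is a sketch with a
hypothetical function name; the 32 MHz CLK2 figure is an example
input chosen for round numbers.

```python
def bus_cycle_ns(clk2_mhz, wait_states=0):
    """Duration of one bus cycle in nanoseconds.

    A bus cycle has two bus states plus any wait states, and each bus
    state lasts one processor clock period, i.e. two CLK2 periods.
    """
    if clk2_mhz <= 0 or wait_states < 0:
        raise ValueError("frequency must be positive, wait states >= 0")
    clk2_period_ns = 1000.0 / clk2_mhz
    bus_states = 2 + wait_states
    return bus_states * 2 * clk2_period_ns


print(bus_cycle_ns(32))      # zero wait states
print(bus_cycle_ns(32, 1))   # one wait state
```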


Figure 4.4. Fastest Read Cycles with Non-Pipelined Timing


The fastest 80376 bus cycle requires only two bus states. For example,
three consecutive bus read cycles, each consisting of two bus states,
are shown by Figure 4.4. The bus states in each cycle are named T1 and
T2. Any memory or I/O address may be accessed by such a two-state bus
cycle, if the external hardware is fast enough.


Every bus cycle continues until it is acknowledged 
by the external system hardware, using the 80376 
READY input. Acknowledging the bus cycle at the 
end of the first T2 results in the shortest bus cycle, 
requiring only T1 and T2. If READY is not immedi- 
ately asserted however, T2 states are repeated in- 
definitely until the READY input is sampled active. 


The pipelining option provides a choice of bus cycle 
timings. Pipelined or non-pipelined cycles are 


selectable on a cycle-by-cycle basis with the Next 
Address (NA) input. 


When pipelining is selected, the address (BHE, BLE and A23-A1) and
definition (W/R, D/C, M/IO and LOCK) of the next cycle are available
before the end of the current cycle. To signal their availability, the
80376 address status output (ADS) is asserted. Figure 4.5 illustrates
the fastest read cycles with pipelined timing.


Note from Figure 4.5 the fastest bus cycles using 
pipelining require only two bus states, named T1P 
and T2P. Therefore pipelined cycles allow the same 
data bandwidth as non-pipelined cycles, but ad- 
dress-to-data access time is increased by one 
T-state time compared to that of a non-pipelined cy- 
cle. 


Figure 4.5. Fastest Read Cycles with Pipelined Timing


READ AND WRITE CYCLES 


Data transfers occur as a result of bus cycles, classi- 
fied as read or write cycles. During read cycles, data 
is transferred from an external device to the proces- 
sor. During write cycles, data is transferred from the 
processor to an external device. 


Two choices of bus cycle timing are dynamically se- 
lectable: non-pipelined or pipelined. After an idle bus 
state, the processor always uses non-pipelined tim- 
ing. However the NA (Next Address) input may be 
asserted to select pipelined timing for the next bus 
cycle. When pipelining is selected and the 80376 
has a bus request pending internally, the address 
and definition of the next cycle is made available 
even before the current bus cycle is acknowledged 
by READY. 


Terminating a read or write cycle, like any bus cycle, requires
acknowledging the cycle by asserting the READY input. Until
acknowledged, the processor inserts wait states into the bus cycle, to
allow adjustment for the speed of any external device. External
hardware, which has decoded the address and bus cycle type, asserts
the READY input at the appropriate time.


At the end of the second bus state within the bus 
cycle, READY is sampled. At that time, if external 
hardware acknowledges the bus cycle by asserting 
READY, the bus cycle terminates as shown in Figure 
4.6. If READY is negated as in Figure 4.7, the 80376
executes another bus state (a wait state) and 
READY is sampled again at the end of that state. 
This continues indefinitely until the cycle is acknowl- 
edged by READY asserted. 


When the current cycle is acknowledged, the 80376 
terminates it. When a read cycle is acknowledged, 
the 80376 latches the information present at its data 
pins. When a write cycle is acknowledged, the write 
data of the 80376 remains valid throughout phase 
one of the next bus state, to provide write data hold 
time. 


Idle states are shown here for diagram variety only. Write cycles are
not always followed by an idle state. An active bus cycle can
immediately follow the write cycle.


Figure 4.6. Various Non-Pipelined Bus Cycles (Zero Wait States)


Non-Pipelined Bus Cycles 


Any bus cycle may be performed with non-pipelined 
timing. For example, Figure 4.6 shows a mixture of 
non-pipelined read and write cycles. Figure 4.6 
shows that the fastest possible non-pipelined cycles 
have two bus states per bus cycle. The states are 
named T1 and T2. In phase one of T1, the address 
signals and bus cycle definition signals are driven 
valid and, to signal their availability, address strobe 
(ADS) is simultaneously asserted. 


During read or write cycles, the data bus behaves as follows. If the
cycle is a read, the 80376 floats its data signals to allow driving by
the external device being addressed. The 80376 requires that all data
bus pins be at a valid logic state (HIGH or LOW) at the end of each
read cycle, when READY is asserted. The system MUST be designed to
meet this requirement. If the cycle is a write, data signals are
driven by the 80376 beginning in phase two of T1 until phase one of
the bus state following cycle acknowledgement.


Idle states are shown here for diagram variety only. Write cycles are
not always followed by an idle state. An active bus cycle can
immediately follow the write cycle.


Figure 4.7. Various Non-Pipelined Bus Cycles (Various Number of Wait States)


Figure 4.7 illustrates non-pipelined bus cycles with 
one wait state added to Cycles 2 and 3. READY is 
sampled inactive at the end of the first T2 in Cycles 
2 and 3. Therefore Cycles 2 and 3 have T2 repeated 
again. At the end of the second T2, READY is sam- 
pled active. 


When address pipelining is not used, the address and bus cycle
definition remain valid during all wait states. When wait states are
added and it is desirable to maintain non-pipelined timing, it is
necessary to negate NA during each T2 state except the last one, as
shown in Figure 4.7, Cycles 2 and 3. If NA is sampled active during a
T2 other than the last one, the next state would be T2I or T2P instead
of another T2.


When address pipelining is not used, the bus states and transitions
are completely illustrated by Figure 4.8. The bus transitions between
four possible states: T1, T2, Ti and Th. Bus cycles consist of T1 and
T2, with T2 being repeated for wait states. Otherwise the bus may be
idle, Ti, or in the hold acknowledge state, Th.


T1—first clock of a non-pipelined bus cycle (80376 drives new address and asserts ADS).
T2—subsequent clocks of a bus cycle when NA has not been sampled asserted in the current bus cycle.
Ti—idle state.
Th—hold acknowledge state (80376 asserts HLDA).


The fastest bus cycle consists of two states: T1 and T2. 


Four basic bus states describe bus operation when not using pipelined address. 


Figure 4.8. 80376 Bus States (Not Using Pipelined Address) 


Bus cycles always begin with T1. T1 always leads to T2. If a bus cycle
is not acknowledged during T2 and NA is inactive, T2 is repeated. When
a cycle is acknowledged during T2, the following state will be T1 of
the next bus cycle if a bus request is pending internally, or Ti if
there is no bus request pending, or Th if the HOLD input is being
asserted.
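

The non-pipelined transitions just described can be written as a
next-state function. This is a simplified sketch: the function and
argument names are hypothetical, HOLD is assumed to take priority over
a pending bus request, and the effects of LOCK, NA and RESET are not
modeled.

```python
def next_state(state, ready=False, request=False, hold=False):
    """Next non-pipelined bus state for the four states T1, T2, Ti, Th.

    ready: READY sampled active at the end of the current T2.
    request: a bus request is pending internally.
    hold: the HOLD input is asserted.
    """
    if state not in ("T1", "T2", "Ti", "Th"):
        raise ValueError("unknown bus state: " + state)
    if state == "T1":
        return "T2"                 # T1 always leads to T2
    if state == "T2" and not ready:
        return "T2"                 # unacknowledged: repeat T2 (wait state)
    # Acknowledged T2, idle, or hold acknowledge: choose what follows.
    if hold:
        return "Th"
    return "T1" if request else "Ti"


# Ti -> T1 -> T2 (wait) -> T2 (acknowledged) -> Ti
state, trace = "Ti", ["Ti"]
for inputs in ({"request": True}, {}, {"ready": False}, {"ready": True}):
    state = next_state(state, **inputs)
    trace.append(state)
print(trace)
```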


Use of pipelining allows the 80376 to enter three 
additional bus states not shown in Figure 4.8. Figure 
4.12 is the complete bus state diagram, including 
pipelined cycles. 


Pipelined Bus Cycles 


Pipelining is the option of requesting the address and the bus cycle
definition of the next internally pending bus cycle before the current
bus cycle is acknowledged with READY asserted. ADS is asserted by the
80376 when the next address is issued. The pipelining option is
controlled on a cycle-by-cycle basis with the NA input signal.


Once a bus cycle is in progress and the current ad- 
dress has been valid for at least one entire bus 
state, the NA input is sampled at the end of every 
phase one until the bus cycle is acknowledged. Dur- 
ing non-pipelined bus cycles NA is sampled at the 
end of phase one in every T2. An example is Cycle 2 
in Figure 4.9, during which NA is sampled at the end 
of phase one of every T2 (it was asserted once dur- 
ing the first T2 and has no further effect during that 
bus cycle). 


Following any idle bus state (Ti), bus cycles are non-pipelined.
Within non-pipelined bus cycles, NA is only sampled during wait
states. Therefore, to begin pipelining during a group of non-pipelined
bus cycles requires a non-pipelined cycle with at least one wait state
(Cycle 2 above).


Figure 4.9. Transitioning to Pipelining during Burst of Bus Cycles 


If NA is sampled active, the 80376 is free to drive the address and
bus cycle definition of the next bus cycle, and assert ADS, as soon as
it has a bus request internally pending. It may drive the next address
as early as the next bus state, whether the current bus cycle is
acknowledged at that time or not.


Regarding the details of pipelining, the 80376 has the following
characteristics:


1. The next address and status may appear as early as the bus state
after NA was sampled active (see Figures 4.9 or 4.10). In that
case, state T2P is entered immediately. However, when there is not
an internal bus request already pending, the next address and
status will not be available immediately after NA is asserted and
T2I is entered instead of T2P (see Figure 4.11, Cycle 3). Provided
the current bus cycle isn't yet acknowledged by READY asserted, T2P
will be entered as soon as the 80376 does drive the next address
and status. External hardware should therefore observe the ADS
output as confirmation the next address and status are actually
being driven on the bus.


2. Any address and status which are validated by a pulse on the 80376
ADS output will remain stable on the address pins for at least two
processor clock periods. The 80376 cannot produce a new address and
status more frequently than every two processor clock periods (see
Figures 4.9, 4.10 and 4.11).


3. Only the address and bus cycle definition of the very next bus
cycle is available. The pipelining capability cannot look further
than one bus cycle ahead (see Figure 4.11, Cycle 1).


Following any idle bus state (Ti) the bus cycle is always
non-pipelined and NA is only sampled during wait states. To start
address pipelining after an idle state requires a non-pipelined cycle
with at least one wait state (Cycle 1 above). The pipelined cycles
(2, 3, 4 above) are shown with various numbers of wait states.


Figure 4.10. Fastest Transition to Pipelined Bus Cycle Following Idle Bus State 


The complete bus state transition diagram, including 
pipelining is given by Figure 4.12. Note it is a super- 
set of the diagram for non-pipelined only, and the 
three additional bus states for pipelining are drawn 
in bold. 


The fastest bus cycle with pipelining consists of just two bus states,
T1P and T2P (recall for non-pipelined it is T1 and T2). T1P is the
first bus state of a pipelined cycle.


Initiating and Maintaining Pipelined Bus Cycles 


Using the state diagram in Figure 4.12, observe the transitions from
an idle state, Ti, to the beginning of a pipelined bus cycle, T1P.
From an idle state, Ti, the first bus cycle must begin with T1, and is
therefore a non-pipelined bus cycle. The next bus cycle will be
pipelined, however, provided NA is asserted and the first bus cycle
ends in a T2P state (the address and status for the next bus cycle are
driven during T2P). The fastest path from an idle state to a pipelined
bus cycle is shown below:


Ti, Ti,   T1-T2-T2P,      T1P-T2P,
idle      non-pipelined   pipelined
states    cycle           cycle


Figure 4.11. Details of Address Pipelining during Cycles with Wait States


T1-T2-T2P are the states of the bus cycle that establishes address
pipelining for the next bus cycle, which begins with T1P. The same is
true after a bus hold state, shown below:


Th, Th, Th,        T1-T2-T2P,      T1P-T2P,
hold acknowledge   non-pipelined   pipelined
states             cycle           cycle


The transition to pipelined address is shown functionally by Figure
4.10, Cycle 1. Note that Cycle 1 is used to transition into pipelined
address timing for the subsequent Cycles 2, 3 and 4, which are
pipelined. The NA input is asserted at the appropriate time to select
address pipelining for Cycles 2, 3 and 4.


Once a bus cycle is in progress and the current address and status
have been valid for one entire bus state, the NA input is sampled at
the end of every phase one until the bus cycle is acknowledged.


Bus States:

T1—first clock of a non-pipelined bus cycle (80376 drives new address, status and asserts ADS).

T2—subsequent clocks of a bus cycle when NA has not been sampled asserted in the current bus cycle.

T2I—subsequent clocks of a bus cycle when NA has been sampled asserted in the current bus cycle but there is not yet
an internal bus request pending (80376 will not drive new address, status or assert ADS).

T2P—subsequent clocks of a bus cycle when NA has been sampled asserted in the current bus cycle and there is an
internal bus request pending (80376 drives new address, status and asserts ADS).

T1P—first clock of a pipelined bus cycle.

Ti—idle state.

Th—hold acknowledge state (80376 asserts HLDA).


Asserting NA for pipelined bus cycles gives access to three more bus states: T2I, T2P and T1P. 
Using pipelining the fastest bus cycle consists of T1P and T2P. 


Figure 4.12. 80376 Processor Complete Bus States (Including Pipelining) 


Sampling begins in T2 during Cycle 1 in Figure 4.10. Once NA is
sampled active during the current cycle, the 80376 is free to drive a
new address and bus cycle definition on the bus as early as the next
bus state. In Figure 4.10, Cycle 1 for example, the next address and
status is driven during state T2P. Thus Cycle 1 makes the transition
to pipelined timing, since it begins with T1 but ends with T2P.
Because the address for Cycle 2 is available before Cycle 2 begins,
Cycle 2 is called a pipelined bus cycle, and it begins with T1P.
Cycle 2 begins as soon as READY asserted terminates Cycle 1.


Examples of transition bus cycles are Figure 4.10, Cycle 1 and Figure
4.9, Cycle 2. Figure 4.10 shows transition during the very first cycle
after an idle bus state, which is the fastest possible transition into
address pipelining. Figure 4.9, Cycle 2 shows a transition cycle
occurring during a burst of bus cycles. In any case, a transition
cycle is the same whenever it occurs: it consists at least of T1, T2
(NA is asserted at that time), and T2P (provided the 80376 has an
internal bus request already pending, which it almost always has). T2P
states are repeated if wait states are added to the cycle.


Note that only three states (T1, T2 and T2P) are required in a bus
cycle performing a transition from non-pipelined into pipelined
timing, for example Figure 4.10, Cycle 1. Figure 4.10, Cycles 2, 3 and
4 show that pipelining can be maintained with two-state bus cycles
consisting only of T1P and T2P.


Once a pipelined bus cycle is in progress, pipelined timing is
maintained for the next cycle by asserting NA and detecting that the
80376 enters T2P during the current bus cycle. The current bus cycle
must end in state T2P for pipelining to be maintained in the next
cycle. T2P is identified by the assertion of ADS. Figures 4.9 and
4.10, however, each show pipelining ending after Cycle 4 because
Cycle 4 ends in T2I. This indicates the 80376 didn't have an internal
bus request prior to the acknowledgement of Cycle 4. If a cycle ends
with a T2 or T2I, the next cycle will not be pipelined.


Realistically, pipelining is almost always maintained as long as NA is
sampled asserted. This is so because in the absence of any other
request, a code prefetch request is always internally pending until
the instruction decoder and code prefetch queue are completely full.
Therefore pipelining is maintained for long bursts of bus cycles, if
the bus is available (i.e., HOLD inactive) and NA is sampled active in
each of the bus cycles.
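

The state sequences within a single bus cycle, including the pipelined
states, can be traced with a small model. This is a hedged sketch
based on one reading of the text and the Figure 4.12 legend: the
function names are hypothetical, NA is assumed to be sampled in T1P as
well as in T2, and HOLD, LOCK and RESET are not modeled.

```python
def step(state, na, ready, pending):
    """Next state within the same bus cycle, or None when the cycle ends.

    na: NA sampled asserted in this state.
    ready: READY sampled active at the end of this state.
    pending: an internal bus request is pending.
    """
    if state == "T1":
        return "T2"                      # T1 always leads to T2
    if state == "T1P":                   # first state of a pipelined cycle
        if not na:
            return "T2"
        return "T2P" if pending else "T2I"
    if state not in ("T2", "T2I", "T2P"):
        raise ValueError("unknown bus state: " + state)
    if ready:
        return None                      # cycle acknowledged
    if state == "T2":
        if not na:
            return "T2"
        return "T2P" if pending else "T2I"
    if state == "T2I":
        return "T2P" if pending else "T2I"
    return "T2P"                         # T2P repeats for wait states


def trace_cycle(first_state, samples):
    """List the states of one cycle given (na, ready, pending) per state."""
    states = [first_state]
    for na, ready, pending in samples:
        nxt = step(states[-1], na, ready, pending)
        if nxt is None:
            break
        states.append(nxt)
    return states


# Transition cycle after idle: T1, T2 (NA asserted), T2P (acknowledged).
transition = trace_cycle("T1", [(False, False, True),
                                (True, False, True),
                                (False, True, True)])
print(transition)
# The next cycle is pipelined only if this one ended in T2P.
print(transition[-1] == "T2P")
```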


INTERRUPT ACKNOWLEDGE (INTA) CYCLES 


In response to an interrupt request on the INTR input when interrupts
are enabled, the 80376 performs two interrupt acknowledge cycles.
These bus cycles are similar to read cycles in that bus definition
signals define the type of bus activity taking place, and each cycle
continues until acknowledged by READY sampled active.


The state of A2 distinguishes the first and second interrupt
acknowledge cycles. The byte address driven during the first interrupt
acknowledge cycle is 4 (A23-A3, A1, BLE LOW, A2 and BHE HIGH). The
byte address driven during the second interrupt acknowledge cycle is 0
(A23-A1, BLE LOW and BHE HIGH).


The LOCK output is asserted from the beginning of the first interrupt
acknowledge cycle until the end of the second interrupt acknowledge
cycle. Four idle bus states, Ti, are inserted by the 80376 between the
two interrupt acknowledge cycles for compatibility with the interrupt
specification TRHRL of the 8259A Interrupt Controller and the 82370
Integrated Peripheral.


Interrupt Vector (0-255) is read on D0-D7 at end of second Interrupt Acknowledge bus cycle.
Because each Interrupt Acknowledge bus cycle is followed by idle bus states, asserting NA has no practical effect.
Choose the approach which is simplest for your system hardware design.


Figure 4.13. Interrupt Acknowledge Cycles


During both interrupt acknowledge cycles, D15-D0 float. No data is
read at the end of the first interrupt acknowledge cycle. At the end
of the second interrupt acknowledge cycle, the 80376 will read an
external interrupt vector from D7-D0 of the data bus. The vector
indicates the specific interrupt number (from 0-255) requiring
service.
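

The two facts an external interrupt controller interface relies on,
the byte address that identifies each acknowledge cycle and the byte
lane that carries the vector, can be sketched as follows. The function
names are hypothetical and the data bus is represented as a plain
16-bit integer.

```python
def inta_byte_address(cycle):
    """Byte address driven during interrupt acknowledge cycle 1 or 2."""
    if cycle == 1:
        return 4   # A2 HIGH marks the first acknowledge cycle
    if cycle == 2:
        return 0   # A2 LOW marks the second acknowledge cycle
    raise ValueError("there are exactly two interrupt acknowledge cycles")


def interrupt_vector(data_bus):
    """Vector latched from D7-D0 at the end of the second cycle."""
    return data_bus & 0xFF


print(inta_byte_address(1), inta_byte_address(2))
print(interrupt_vector(0xAB21))   # upper byte D15-D8 is ignored
```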


HALT INDICATION CYCLE 


The 80376 execution unit halts as a result of execut- 
ing a HLT instruction. Signaling its entrance into the 
halt state, a halt indication cycle is performed. The 
halt indication cycle is identified by the state of the 
bus definition signals and a byte address of 2. See 
the Bus Cycle Definition Signals section. The halt 
indication cycle must be acknowledged by READY 
asserted. A halted 80376 resumes execution when 
INTR (if interrupts are enabled), NMI or RESET is 
asserted. 


Figure 4.14. Example Halt Indication Cycle from Non-Pipelined Cycle


SHUTDOWN INDICATION CYCLE 


The 80376 shuts down as a result of a protection 
fault while attempting to process a double fault. Sig- 
naling its entrance into the shutdown state, a shut- 
down indication cycle is performed. The shutdown 
indication cycle is identified by the state of the bus 
definition signals shown in Bus Cycle Definition 
Signals and a byte address of 0. The shutdown indi- 
cation cycle must be acknowledged by READY as- 
serted. A shutdown 80376 resumes execution when 
NMI or RESET is asserted. 
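
The halt and shutdown indication cycles described above differ only in the byte address driven (2 for halt, 0 for shutdown). The following is a minimal Python sketch of how monitoring logic might tell them apart and decide which inputs end each state. The function names and the `indication_encoding` flag (standing for "the bus definition signals show the halt/shutdown encoding") are illustrative assumptions, not part of the 80376 specification.

```python
from typing import Optional

HALT_BYTE_ADDRESS = 2      # byte address of a halt indication cycle
SHUTDOWN_BYTE_ADDRESS = 0  # byte address of a shutdown indication cycle


def classify_indication_cycle(indication_encoding: bool,
                              byte_address: int) -> Optional[str]:
    """Return "halt", "shutdown", or None for any other bus cycle.

    indication_encoding is assumed to be True when the bus definition
    signals carry the halt/shutdown encoding (hypothetical helper input).
    """
    if not indication_encoding:
        return None
    if byte_address == HALT_BYTE_ADDRESS:
        return "halt"
    if byte_address == SHUTDOWN_BYTE_ADDRESS:
        return "shutdown"
    return None


def resumes_execution(state: str, intr: bool = False,
                      interrupts_enabled: bool = False,
                      nmi: bool = False, reset: bool = False) -> bool:
    """NMI or RESET ends either state; INTR ends only a halt, and only
    when interrupts are enabled."""
    if nmi or reset:
        return True
    return state == "halt" and intr and interrupts_enabled
```

In this sketch, `resumes_execution("shutdown", intr=True, interrupts_enabled=True)` is false, reflecting that a shut-down 80376 ignores INTR.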


ENTERING AND EXITING HOLD 
ACKNOWLEDGE 


The bus hold acknowledge state, Th, is entered in 
response to the HOLD input being asserted. In the 
bus hold acknowledge state, the 80376 floats all 
output and bidirectional signals, except for HLDA. 
HLDA is asserted as long as the 80376 remains in 
the bus hold acknowledge state. In the bus hold 
acknowledge state, all inputs except HOLD and 
RESET are ignored. 


Figure 4.15. Example Shutdown Indication Cycle from Non-Pipelined Cycle 


Th may be entered from a bus idle state as in Figure 
4.16 or after the acknowledgement of the current 
physical bus cycle if the LOCK signal is not asserted, 
as in Figures 4.17 and 4.18. 


Th is exited in response to the HOLD input being 
negated. The following state will be Ti, as in Figure 
4.16, if no bus request is pending. The following bus 
state will be T1 if a bus request is internally pending, 
as in Figures 4.17 and 4.18. Th is also exited in response 
to RESET being asserted. 
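
The exit rules for Th can be summarized as a small decision function. This is a sketch only: the function name is hypothetical and the string "RESET" is used here as a stand-in label for the reset condition, which is not one of the named bus states.

```python
def state_after_hold(hold_asserted: bool,
                     bus_request_pending: bool,
                     reset_asserted: bool = False) -> str:
    """Return the state that follows Th under the rules in the text."""
    if reset_asserted:
        # RESET takes priority and ends the hold acknowledge state.
        return "RESET"
    if hold_asserted:
        # HOLD still asserted: remain in bus hold acknowledge.
        return "Th"
    # HOLD negated: start a bus cycle if one is pending, otherwise idle.
    return "T1" if bus_request_pending else "Ti"
```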


If a rising edge occurs on the edge-triggered NMI 
input while in Th, the event is remembered as a 
nonmaskable interrupt 2 and is serviced when Th is 
exited, unless the 80376 is reset before Th is exited. 


For maximum design flexibility the 80376 has no internal pull-up resistors on its outputs. Your design may require an 
external pull-up on ADS and other 80376 outputs to keep them negated during float periods. 


Figure 4.16. Requesting Hold from Idle Bus 


RESET DURING HOLD ACKNOWLEDGE 


RESET being asserted takes priority over HOLD be- 
ing asserted. If RESET is asserted while HOLD re- 
mains asserted, the 80376 drives its pins to defined 
states during reset, as in Table 4.5, Pin State Dur- 
ing Reset, and performs internal reset activity as 
usual. 


If HOLD remains asserted when RESET is inactive, 
the 80376 enters the hold acknowledge state before 
performing its first bus cycle, provided HOLD is still 
asserted when the 80376 processor would otherwise 
perform its first bus cycle. If HOLD remains asserted 
when RESET is inactive, the BUSY input is 
still sampled as usual to determine whether a self-test 
is being requested. 


FLOAT 


Activating the FLT input floats all 80376 bidirectional 
and output signals, including HLDA. Asserting FLT 
isolates the 80376 from the surrounding circuitry. 


When an 80376 in a PQFP surface-mount package 
is used without a socket, it cannot be removed from 
the printed circuit board. The FLT input allows the 
80376 to be electrically isolated so that external 
circuitry can be tested. This technique is known as 
ONCE™ for “ON-Circuit Emulation”. 


ENTERING AND EXITING FLOAT 


FLT is an asynchronous, active-low input. It is recognized 
on the rising edge of CLK2. When recognized, 
it aborts the current bus cycle and floats the outputs 
of the 80376 (Figure 4.20). FLT must be held low for 
a minimum of 16 CLK2 cycles. Reset should be asserted 
and held asserted until after FLT is deasserted. 
This will ensure that the 80376 will exit float in a 
valid state. 


Asserting the FLT input unconditionally aborts the 
current bus cycle and forces the 80376 into the 
FLOAT mode. Since activating FLT unconditionally 
forces the 80376 into FLOAT mode, the 80376 is not 
guaranteed to enter FLOAT in a valid state. After 
deactivating FLT, the 80376 is not guaranteed to 
exit FLOAT mode in a valid state. This is not a problem 
as the FLT pin is meant to be used only during 
ONCE. After exiting FLOAT, the 80376 must be reset 
to return it to a valid state. Reset should be asserted 
before FLT is deasserted. This will ensure 
that the 80376 will exit float in a valid state. 


HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup and hold (t23 and t24) requirements 
are met. This waveform is useful for determining Hold Acknowledge latency. 


Figure 4.17. Requesting Hold from Active Bus (NA Inactive) 


FLT has an internal pull-up resistor, and if it is not 
used it should be left unconnected. 

BUS ACTIVITY DURING AND FOLLOWING 
RESET 


RESET is the highest priority input signal, capable of 
interrupting any processor activity when it is asserted. 
A bus cycle in progress can be aborted at any 
stage, or idle states or bus hold acknowledge states 
discontinued so that the reset state is established. 


RESET should remain asserted for at least 15 CLK2 
periods to ensure it is recognized throughout the 
80376, and at least 80 CLK2 periods if an 80376 self-test 
is going to be requested at the falling edge. RESET 
pulses shorter than 15 CLK2 periods may 
not be recognized. RESET pulses shorter than 80 CLK2 
periods followed by a self-test may cause the self-test 
to report a failure when no true failure exists. 
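
The two minimum RESET widths above can be checked, and converted to time for a given CLK2 frequency, with a few lines of Python. This is a sketch under the stated limits only; the function names are hypothetical, and the CLK2 frequency is whatever the target board supplies.

```python
MIN_RESET_CLK2 = 15                 # minimum width for RESET to be recognized
MIN_RESET_WITH_SELF_TEST_CLK2 = 80  # minimum width when a self-test follows


def required_reset_periods(self_test_requested: bool) -> int:
    """Minimum RESET width in CLK2 periods for the chosen reset type."""
    if self_test_requested:
        return MIN_RESET_WITH_SELF_TEST_CLK2
    return MIN_RESET_CLK2


def reset_pulse_ok(width_clk2: int, self_test_requested: bool) -> bool:
    """True when a RESET pulse of width_clk2 periods meets the minimum."""
    return width_clk2 >= required_reset_periods(self_test_requested)


def min_reset_width_ns(clk2_mhz: float, self_test_requested: bool) -> float:
    """Minimum RESET width in nanoseconds at the given CLK2 frequency."""
    return required_reset_periods(self_test_requested) * 1000.0 / clk2_mhz
```

For example, a 20-period pulse passes `reset_pulse_ok(20, False)` but fails `reset_pulse_ok(20, True)`, which is the case the text warns may produce a false self-test failure.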


Provided the RESET falling edge meets setup and 
hold times t25 and t26, the internal processor clock 
phase is defined at that time as illustrated by Figure 
4.19 and Figure 6.7. 


HOLD is a synchronous input and can be asserted at any CLK2 edge, provided setup and hold (t23 and t24) requirements 
are met. This waveform is useful for determining Hold Acknowledge latency. 


Figure 4.18. Requesting Hold from Idle Bus (NA Active) 


An 80376 self-test may be requested at the time RESET 
goes inactive by having the BUSY input at a 
LOW level as shown in Figure 4.19. The self-test 
requires (2^20 + approximately 60) CLK2 periods to 
complete. The self-test duration is not affected by 
the test results. Even if the self-test indicates a 
problem, the 80376 attempts to proceed with the 
reset sequence afterwards. 


After the RESET falling edge (and after the self-test 
if it was requested) the 80376 performs an internal 
initialization sequence for approximately 350 to 450 
CLK2 periods. 
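
To get a feel for these durations, the CLK2 period counts can be converted to time. The sketch below assumes the self-test length is 2^20 + 60 CLK2 periods (the text gives the 60 as approximate) and uses hypothetical function names; the CLK2 frequency is a parameter.

```python
SELF_TEST_CLK2 = 2**20 + 60    # approximate self-test length in CLK2 periods
INIT_CLK2_RANGE = (350, 450)   # approximate internal initialization length


def clk2_periods_to_us(periods: int, clk2_mhz: float) -> float:
    """Convert a count of CLK2 periods to microseconds."""
    return periods / clk2_mhz


def reset_to_init_done_us(clk2_mhz: float, self_test: bool):
    """Return (shortest, longest) time in microseconds from the RESET
    falling edge to the end of internal initialization."""
    base = SELF_TEST_CLK2 if self_test else 0
    return tuple(clk2_periods_to_us(base + n, clk2_mhz)
                 for n in INIT_CLK2_RANGE)
```

With a 32 MHz CLK2 the self-test alone takes roughly 33 ms, so it dominates the few microseconds of initialization that follow.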
NOTES: 
1. BUSY should be held stable for 8 CLK2 periods before and after the CLK2 period in which RESET falling edge 
occurs. 
2. If self-test is requested, the 80376 outputs remain in their reset state as shown here. 


Figure 4.19. Bus Activity from Reset until First Code Fetch 


Figure 4.20. Entering and Exiting FLOAT 


4.5 Self-Test Signature 


Upon completion of self-test (if self-test was requested 
by driving BUSY LOW at the falling edge of 
RESET) the EAX register will contain a signature of 
00000000H indicating the 80376 passed its self-test 
of microcode and major PLA contents with no problems 
detected. The passing signature in EAX, 
00000000H, applies to all 80376 revision levels. Any 
non-zero signature indicates the 80376 unit is faulty. 


4.6 Component and Revision 
Identifiers 


To assist 80376 users, the 80376 after reset holds a 
component identifier and revision identifier in its DX 
register. The upper 8 bits of DX hold 33H as identification 
of the 80376 component. (The lower nibble, 
03H, refers to the Intel386™ architecture. The upper 
nibble, 30H, refers to the third member of the 
Intel386 family). The lower 8 bits of DX hold an 
8-bit unsigned binary number related to the component 
revision level. The revision identifier will, in general, 
chronologically track those component steppings 
which are intended to have certain improvements 
or distinctions from previous steppings. The 
80376 revision identifier will track that of the 80386 
where possible. 


The revision identifier is intended to assist 80376 
users to a practical extent. However, the revision 
identifier value is not guaranteed to change with every 
stepping revision, or to follow a completely uniform 
numerical sequence, depending on the type or 
intention of revision, or manufacturing materials required 
to be changed. Intel has sole discretion over 
these characteristics of the component. 
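
Start-up code that has saved EAX and DX immediately after reset could interpret them as sketched below. The function names are hypothetical, and the revision value 05H used in the usage note is purely illustrative; actual revision identifiers come from Table 4.7.

```python
COMPONENT_ID_80376 = 0x33  # expected value of DH (upper 8 bits of DX)


def self_test_passed(eax: int) -> bool:
    """A signature of 00000000H in EAX means the self-test passed."""
    return (eax & 0xFFFFFFFF) == 0


def decode_dx(dx: int):
    """Split DX into (component identifier, revision identifier)."""
    component = (dx >> 8) & 0xFF
    revision = dx & 0xFF
    return component, revision


def is_80376(dx: int) -> bool:
    """True when the component identifier in DX matches the 80376."""
    return decode_dx(dx)[0] == COMPONENT_ID_80376
```

For a saved DX of 3305H (an illustrative value), `decode_dx` returns component 33H and revision 05H. Because the revision identifier is not guaranteed to change with every stepping, software should treat it as a hint rather than an exact stepping name.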


Table 4.7. Component and 
Revision Identifier History 

80376 Stepping Name | Revision Identifier 


4.7 Coprocessor Interfacing 


The 80376 provides an automatic interface for the 
Intel 80387SX numeric floating-point coprocessor. 
The 80387SX coprocessor uses an I/O mapped interface 
driven automatically by the 80376 and assisted 
by three dedicated signals: BUSY, ERROR 
and PEREQ. 


As the 80376 begins supporting a coprocessor instruction, 
it tests the BUSY and ERROR signals to 
determine if the coprocessor can accept its next instruction. 
Thus, the BUSY and ERROR inputs eliminate 
the need for any “preamble” bus cycles for 
communication between processor and coprocessor. 
The 80387SX can be given its command opcode 
immediately. The dedicated signals provide 
instruction synchronization, and eliminate the need 
of using the 80376 WAIT opcode (9BH) for 80387SX 
instruction synchronization (the WAIT opcode was 
required when the 8086 or 8088 was used with the 
8087 coprocessor). 


Custom coprocessors can be included in 80376 
based systems by memory-mapped or I/O-mapped 
interfaces. Such coprocessor interfaces allow a 
completely custom protocol, and are not limited to a 
set of coprocessor protocol “primitives”. Instead, 
memory-mapped or I/O-mapped interfaces may use 
all applicable 80376 instructions for high-speed coprocessor 
communication. The BUSY and ERROR 
inputs of the 80376 may also be used for the custom 
coprocessor interface, if such hardware assist is desired. 
These signals can be tested by the 80376 
WAIT opcode (9BH). The WAIT instruction will wait 
until the BUSY input is inactive (interruptible by an 
NMI or enabled INTR input), but generates an exception 
16 fault if the ERROR pin is active when 
BUSY goes (or is) inactive. If the custom coprocessor 
interface is memory-mapped, protection of the 
addresses used for the interface can be provided 
with the segmentation mechanism of the 80376. If 
the custom interface is I/O-mapped, protection of 
the interface can be provided with the 80376 IOPL 
(I/O Privilege Level) mechanism. 
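
The WAIT behavior described above can be modeled as a loop over successive samples of the BUSY and ERROR inputs. This is a simplified sketch: the function name is hypothetical, each sample is a pair of logical "active" flags rather than pin voltage levels, and interruption by NMI or INTR is not modeled.

```python
from typing import Iterable, Tuple


def wait_instruction(samples: Iterable[Tuple[bool, bool]]) -> Tuple[str, int]:
    """Model WAIT over (busy_active, error_active) samples.

    Returns the outcome and the number of samples consumed.
    """
    count = 0
    for busy_active, error_active in samples:
        count += 1
        if busy_active:
            # Keep waiting while BUSY is active; ERROR is not acted on yet.
            continue
        if error_active:
            # BUSY inactive with ERROR active raises exception 16.
            return ("exception 16", count)
        return ("continue", count)
    # Ran out of samples while BUSY was still active.
    return ("still waiting", count)
```

Note that ERROR is acted on only once BUSY is (or becomes) inactive, which matches the ordering in the text.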


The 80387SX numeric coprocessor interface is I/O 
mapped as shown in Table 4.8. Note that the 
80387SX coprocessor interface addresses are beyond 
the 0H-0FFFFH range for programmed I/O. 
When the 80376 supports the 80387SX coprocessor, 
the 80376 automatically generates bus cycles to 
the coprocessor interface addresses. 


Table 4.8. Numeric Coprocessor Port Addresses 

Address in 80376 I/O Space | 80387SX Coprocessor Register 
8000F8H | Opcode Register 
8000FCH | Operand Register 
8000FEH | Operand Register 
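
External decode logic, or a bus-trace tool, can recognize coprocessor cycles from the port addresses in Table 4.8. The sketch below is illustrative only; the names are hypothetical and `is_io_cycle` stands for "the bus cycle is an I/O cycle".

```python
from typing import Optional

# Port addresses from Table 4.8.
COPROCESSOR_PORTS = {
    0x8000F8: "Opcode Register",
    0x8000FC: "Operand Register",
    0x8000FE: "Operand Register",
}

PROGRAMMED_IO_LIMIT = 0xFFFF  # top of the programmed I/O range


def coprocessor_register(address: int, is_io_cycle: bool) -> Optional[str]:
    """Name the 80387SX register addressed by an I/O cycle, if any."""
    if not is_io_cycle:
        return None
    return COPROCESSOR_PORTS.get(address)


def in_programmed_io_range(address: int) -> bool:
    """True for addresses reachable by programmed I/O instructions."""
    return 0 <= address <= PROGRAMMED_IO_LIMIT
```

All three coprocessor ports fall outside the programmed I/O range, so ordinary I/O instructions cannot reach them.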
SOFTWARE TESTING FOR COPROCESSOR 
PRESENCE 


When software is used to test coprocessor 
(80387SX) presence, it should use only the following 
coprocessor opcodes: FNINIT, FNSTCW and 
FNSTSW. To use other coprocessor opcodes when 
a coprocessor is known to be not present, first set 
EM = 1 in the 80376 CR0 register. 


5.0 PACKAGE THERMAL 
SPECIFICATIONS 


The Intel 80376 embedded processor is specified 
for operation when case temperature is within the 
range of 0°C-115°C for both the ceramic 88-pin 
PGA package and the plastic 100-pin PQFP pack- 
age. The case temperature may be measured in any 
environment, to determine whether the 80376 is 
within specified operating range. The case tempera- 
ture should be measured at the center of the top 
surface. 


The ambient temperature is guaranteed as long as 
TC is not violated. The ambient temperature can be 
calculated from θJC and θJA using the following 
equations: 

TJ = TC + P * θJC 

TA = TJ - P * θJA 

TC = TA + P * [θJA - θJC] 

Values for θJA and θJC are given in Table 5.1 for the 
100-lead fine pitch. θJA is given at various airflows. 
Table 5.2 shows the maximum TA allowable (without 
exceeding TC) at various airflows. Note that TA can 
be improved further by attaching “fins” or a “heat 
sink” to the package. P is calculated using the maximum 
cold ICC of 305 mA and the maximum Vcc of 
5.5V for both packages. 
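
The three equations can be applied directly to estimate the highest ambient temperature at which the case stays within its limit. The sketch below uses the maximum ICC and Vcc stated above; the thermal resistances in the usage note are hypothetical placeholders and should be replaced with the Table 5.1 entries for the actual package and airflow.

```python
MAX_ICC_A = 0.305        # maximum cold ICC, amperes
MAX_VCC_V = 5.5          # maximum Vcc, volts
MAX_CASE_TEMP_C = 115.0  # upper limit of the specified case temperature


def max_power_w() -> float:
    """Worst-case power dissipation P = ICC * Vcc."""
    return MAX_ICC_A * MAX_VCC_V


def junction_temp_c(tc: float, p: float, theta_jc: float) -> float:
    """TJ = TC + P * thetaJC."""
    return tc + p * theta_jc


def max_ambient_c(theta_ja: float, theta_jc: float,
                  tc: float = MAX_CASE_TEMP_C) -> float:
    """TA = TC - P * (thetaJA - thetaJC), from TC = TA + P * [thetaJA - thetaJC]."""
    return tc - max_power_w() * (theta_ja - theta_jc)
```

For instance, with placeholder values of θJA = 30.0 and θJC = 8.0 °C/Watt, `max_ambient_c(30.0, 8.0)` gives the ambient temperature at which the case just reaches 115°C. A lower θJA (more airflow or a heat sink) raises that allowable ambient temperature.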


Table 5.1. 80376 Package Thermal 
Characteristics Thermal Resistances 
(°C/Watt) θJC and θJA 

Airflow, ft/min (m/sec) | 200 (1.01) | 400 (2.03) | 600 (3.04) | 800 (4.06) 

14.5 | 12.0 

100-Lead Fine Pitch | 7.5 | 34.5 | 29.5 | 25.5 | 22.5 | 21.5 | 21.0 


Table 5.2. 80376 
Maximum Allowable Ambient 
Temperature at Various Airflows 


6.0 ELECTRICAL SPECIFICATIONS 


The following sections describe recommended elec- 
trical connections for the 80376, and its electrical 
specifications. 


6.1 Power and Grounding 


The 80376 is implemented in CHMOS IV technology 
and has modest power requirements. However, its 
high clock frequency and 47 output buffers (address, 
data, control, and HLDA) can cause power surges 
as multiple output buffers drive new signal levels 
simultaneously. For clean on-chip power distribution 
at high frequency, 14 Vcc and 18 Vss pins separately 
feed functional units of the 80376. 


Power and ground connections must be made to all 
external Vcc and GND pins of the 80376. On the 
circuit board, all Vcc pins should be connected on a 
Vcc plane and all Vss pins should be connected on 
a GND plane. 


POWER DECOUPLING RECOMMENDATIONS 


Liberal decoupling capacitors should be placed near 
the 80376. The 80376 driving its 24-bit address bus 
and 16-bit data bus at high frequencies can cause 
transient power surges, particularly when driving 
large capacitive loads. Low inductance capacitors 
and interconnects are recommended for best high 
frequency electrical performance. Inductance can 
be reduced by shortening circuit board traces be- 
tween the 80376 and decoupling capacitors as 
much as possible. 


RESISTOR RECOMMENDATIONS 


The ERROR, FLT and BUSY inputs have internal 
pull-up resistors of approximately 20 KΩ and the 
PEREQ input has an internal pull-down resistor of 
approximately 20 KΩ built into the 80376 to keep 
these signals inactive when the 80387SX is not 
present in the system (or temporarily removed from 
its socket). 


In typical designs, the external pull-up resistors 
shown in Table 6.1 are recommended. However, a 
particular design may have reason to adjust the resistor 
values recommended here, or alter the use of 
pull-up resistors in other ways. 


Table 6.1. Recommended 
Resistor Pull-Ups to Vcc 

Pin | Signal | Pull-Up Value | Purpose 
16 | ADS | 20 KΩ ±10% | Lightly Pull ADS Inactive during 80376 Hold Acknowledge States 
26 | LOCK | 20 KΩ ±10% | Lightly Pull LOCK Inactive during 80376 Hold Acknowledge States 

OTHER CONNECTION RECOMMENDATIONS 


For reliable operation, always connect unused inputs 
to an appropriate signal level. N/C pins should 
always remain unconnected. Connection of N/C 
pins to Vcc or Vss will result in incompatibility 
with future steppings of the 80376. 


Particularly when not using interrupts or bus hold (as 
when first prototyping), prevent any chance of spurious 
activity by connecting these associated inputs to 
GND: 


- INTR 
- NMI 
- HOLD 

If not using address pipelining connect the NA pin to 
a pull-up resistor in the range of 20 KΩ to Vcc. 


6.2 Absolute Maximum Ratings 

Table 6.2. Maximum Ratings 

Parameter | Maximum Rating 
Case Temperature under Bias | -65°C to +120°C 
Supply Voltage with Respect to Vss | -0.5V to +6.5V 
Voltage on Other Pins | -0.5V to (Vcc + 0.5)V 


Table 6.2 gives stress ratings only, and functional 
operation at the maximums is not guaranteed. Functional 
operating conditions are given in Section 6.3, 
D.C. Specifications, and Section 6.4, A.C. Specifications. 


Extended exposure to the Maximum Ratings may affect 
device reliability. Furthermore, although the 
80376 contains protective circuitry to resist damage 
from static electric discharge, always take precautions 
to avoid high static voltages or electric fields. 


6.3 D.C. Specifications 


ADVANCE INFORMATION SUBJECT TO CHANGE 


Table 6.3. 80376 D.C. Characteristics 
Functional Operating Range: Vcc = 5V ±10%; Tcase = 0°C to 115°C for 88-pin PGA or 100-pin PQFP 

VIH | Input HIGH Voltage 
+0.8 V(1) 
(Vcc + 0.3) V(1) 
0.45 
VIHC | Vcc - 0.8 
IOL = 4 mA: A23-A1, D15-D0 
BHE, BLE, W/R, D/C, M/IO, LOCK, ADS, HLDA 
IOH = -0.2 mA: 
IOH = -0.9 mA: 
A23-A1, D15-D0 
BHE, BLE, W/R, D/C, M/IO, LOCK, ADS, HLDA 
IOH = -0.18 mA: BHE, BLE, W/R, D/C, M/IO, LOCK, ADS, HLDA 
Input Leakage Current (For All Pins except PEREQ, BUSY, FLT and ERROR) | ±15 µA 
Input Leakage Current (PEREQ Pin) | µA, VIH = 2.4V(1, 2) 
400 µA, VIL = 0.45V(3) 
±15 µA, 0.45V < VOUT < Vcc(1) 
ICC | Supply Current 
CLK2 = 32 MHz | mA, ICC typ = 175 mA(4) 
CLK2 = 40 MHz | mA, ICC typ = 200 mA(4) 
CIN | Input Capacitance | 10 pF, FC = 1 MHz(5) 
Output or I/O Capacitance | 12 pF, FC = 1 MHz(5) 
CCLK | CLK2 Capacitance | 20 pF, FC = 1 MHz(5) 


NOTES: 
1. Tested at the minimum operating frequency of the device. 
2. PEREQ input has an internal pull-down resistor. 
3. BUSY, FLT and ERROR inputs each have an internal pull-up resistor. 
4. ICC max measurement at worst case load, Vcc and temperature (0°C). 
5. Not 100% tested. 
6.4 A.C. Specifications 


The A.C. specifications given in Table 6.4 consist of 
output delays, input setup requirements and input 
hold requirements. All A.C. specifications are relative 
to the CLK2 rising edge crossing the 2.0V level. 


A.C. specification measurement is defined by Figure 
6.1. Inputs must be driven to the voltage levels indicated 
by Figure 6.1 when A.C. specifications are 
measured. 80376 output delays are specified with 
minimum and maximum limits measured as shown. 
The minimum 80376 delay times are hold times provided 
to external circuitry. 80376 input setup and 
hold times are specified as minimums, defining the 
smallest acceptable sampling window. Within the 
sampling window, a synchronous input signal must 
be stable for correct 80376 processor operation. 


Outputs ADS, W/R, D/C, M/IO, LOCK, BHE, BLE, 
A23-A1 and HLDA only change at the beginning of 
phase one. D15-D0 (write cycles) only change at the 
beginning of phase two. The READY, HOLD, BUSY, 
ERROR, PEREQ and D15-D0 (read cycles) inputs 
are sampled at the beginning of phase one. The NA, 
INTR and NMI inputs are sampled at the beginning 
of phase two. 


LEGEND: 
A - Maximum Output Delay Spec. 
B - Minimum Output Delay Spec. 
C - Minimum Input Setup Spec. 
D - Minimum Input Hold Spec. 


Figure 6.1. Drive Levels and Measurement Points for A.C. Specifications 


Table 6.4. 80376 A.C. Characteristics at 16 MHz 
Functional Operating Range: Vcc = 5V ±10%; Tcase = 0°C to 115°C for 88-pin PGA or 100-pin PQFP 

Parameter | Min | Notes 
Operating Frequency | | Half CLK2 Freq 
CLK2 HIGH Time | | At 2V(3) 
CLK2 HIGH Time | | At (Vcc - 0.8)V(3) 
CLK2 LOW Time | | At 2V(3) 
CLK2 LOW Time | | At 0.8V(3) 
CLK2 Fall Time | | (Vcc - 0.8)V to 0.8V(3) 
CLK2 Rise Time | | 0.8V to (Vcc - 0.8)V(3) 
A23-A1 Valid Delay | | CL = 120 pF(4) 
A23-A1 Float Delay | | 
BHE, BLE, LOCK Valid Delay | | CL = 75 pF(4) 
BHE, BLE, LOCK Float Delay | | 
W/R, M/IO, D/C, ADS Valid Delay | | CL = 75 pF(4) 
W/R, M/IO, D/C, ADS Float Delay | | 
D15-D0 Write Data Valid Delay | | CL = 120 pF(4) 
D15-D0 Write Data Float Delay | | 
NA Setup Time | | 
NA Hold Time | | 
READY Setup Time | 19 | 
D15-D0 Read Data Setup Time | 9 | 
D15-D0 Read Data Hold Time | 6 | 
HOLD Hold Time | | 
RESET Hold Time | | 


Table 6.4. 80376 A.C. Characteristics at 16 MHz (Continued) 

Symbol | Parameter 
t27 | NMI, INTR Setup Time 
t28 | NMI, INTR Hold Time 
 | PEREQ, ERROR, BUSY, FLT Setup Time 
 | PEREQ, ERROR, BUSY, FLT Hold Time 


NOTES: 
1. Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not 100% 
tested. 
5. The 80376 does not have t17 or t18 timing specifications. 


Table 6.5. 80376 A.C. Characteristics at 20 MHz 
Functional Operating Range: Vcc = 5V ±10%; Tcase = 0°C to 115°C for 88-pin PGA or 100-pin PQFP 

Symbol | Parameter | Min | Max | Unit | Figure | Notes 
 | Operating Frequency | 4 | 20 | MHz | | Half CLK2 Frequency 
 | CLK2 Period | 25 | 63 | | 6.3 | 
 | CLK2 Fall Time | | | | 6.3 | (Vcc - 0.8)V to 0.8V(3) 
 | CLK2 Rise Time | | | | 6.3 | 0.8V to (Vcc - 0.8)V(3) 
 | BHE, BLE, LOCK Valid Delay | | | | | CL = 75 pF(4) 
t9 | BHE, BLE, LOCK Float Delay | 4 | | | | 
 | M/IO, D/C Valid Delay | | | | | 
t10b | W/R, ADS Valid Delay | | | | | 
t11 | W/R, M/IO, D/C, ADS Float Delay | | | | | 
t12 | D15-D0 Write Data Valid Delay | | | | | CL = 120 pF 
t13 | D15-D0 Write Data Float Delay | 4 | 27 | | 6.6 | (1) 


Table 6.5. 80376 A.C. Characteristics at 20 MHz (Continued) 

Parameter | Unit | Notes 
NA Setup Time | | 
READY Setup Time | | 
READY Hold Time | | 
D15-D0 Read Data Setup Time | | 
D15-D0 Read Data Hold Time | | 
HOLD Setup Time | | 
PEREQ, ERROR, BUSY, FLT Setup Time | | (2) 
PEREQ, ERROR, BUSY, FLT Hold Time | ns | (2) 


NOTES: 
1. Float condition occurs when maximum output current becomes less than ILO in magnitude. Float delay is not 100% 
tested. 
2. These inputs are allowed to be asynchronous to CLK2. The setup and hold specifications are given for testing purposes, 
to assure recognition within a specific CLK2 period. 
3. These are not tested. They are guaranteed by design characterization. 
4. Tested with CL set to 50 pF and derated to support the indicated distributed capacitive load. See Figures 6.8 through 6.10 
for capacitive derating curves. 
5. The 80376 does not have t17 or t18 timing specifications. 


A.C. TEST LOADS 


Figure 6.2. A.C. Test Loads 


A.C. TIMING WAVEFORMS 


Figure 6.3. CLK2 Waveform 


Figure 6.4. A.C. Timing Waveforms - Input Setup and Hold Timing 


Figure 6.5. A.C. Timing Waveforms - Output Valid Delay Timing 


(13) ALSO APPLIES TO DATA FLOAT WHEN WRITE 
CYCLE IS FOLLOWED BY READ OR IDLE 


Figure 6.6. A.C. Timing Waveforms - Output Float Delay and HLDA Valid Delay Timing 


The second internal processor phase following RESET high-to-low transition (provided t25 and t26 are met) is φ2. 


Figure 6.7. A.C. Timing Waveforms - RESET Setup and Hold Timing, and Internal Phase 


Figure 6.8. Typical Output Valid Delay versus 
Load Capacitance at Maximum Operating 
Temperature (CL = 120 pF) 


Figure 6.9. Typical Output Valid Delay versus 
Load Capacitance at Maximum Operating 
Temperature (CL = 75 pF) 


Figure 6.10. Typical Output Rise 
Time versus Load Capacitance at 
Maximum Operating Temperature 


Figure 6.11. Typical ICC vs Frequency 


6.5 Designing for the ICE™-376 
Emulator 


The 376 embedded processor in-circuit emulator 
product is the ICE-376 emulator. Use of the emulator 
requires the target system to provide a socket 
that is compatible with the ICE-376 emulator. The 
ICE-376 emulator offers two different probes for emulating user 
systems: an 88-pin PGA probe and a 100-pin fine 
pitch flat-pack probe. The 100-pin fine pitch flat-pack 
probe requires a socket, called the 100-pin 
PQFP, which is available from 3-M Textool (part 
number 2-0100-07243-000). The ICE-376 emulator 
probe attaches to the target system via an adapter 
which replaces the 80376 component in the target 
system. Because of the high operating frequency of 
80376 systems and of the ICE-376 emulator, there is 
no buffering between the 80376 emulation processor 
in the ICE-376 emulator probe and the target 
system. A direct result of the non-buffered interconnect 
is that the ICE-376 emulator shares the address 
and data bus with the user’s system, and the 
RESET signal is intercepted by the ICE emulator 
hardware. In order for the ICE-376 emulator to be 
functional in the user’s system without the Optional 
Isolation Board (OIB) the designer must be aware of 
the following conditions: 


1. The bus controller must only enable data trans- 
ceivers onto the data bus during valid read cycles 
of the 80376, other local devices or other bus 
masters. 


2. Before another bus master drives the local processor 
address bus, the other master must gain 
control of the address bus by asserting HOLD and 
receiving the HLDA response. 


3. The emulation processor receives the RESET signal 
2 or 4 CLK2 cycles later than an 80376 would, 
and responds to RESET later. Correct phase of 
the response is guaranteed. 


In addition to the above considerations, the ICE-376 
emulator processor module has several electrical 
and mechanical characteristics that should be taken 
into consideration when designing the 80376 sys- 
tem. 


Capacitive Loading: ICE-376 adds up to 27 pF to 
each 80376 signal. 


Drive Requirements: ICE-376 adds one FAST TTL 
load on the CLK2, control, address, and data lines. 
These loads are within the processor module and 
are driven by the 80376 emulation processor, which 
has standard drive and loading capability listed in 
Tables 6.3 and 6.4. 


Power Requirements: For noise immunity and 
CMOS latch-up protection the ICE-376 emulator 
processor module is powered by the user system. 
The circuitry on the processor module draws up to 
1.4A including the maximum 80376 ICC from the 
user 80376 socket. 


80376 Location and Orientation: The ICE-376 emulator 
processor module may require lateral clearance. 
Figure 6.12 shows the clearance requirements 
of the iMP adapter and Figure 6.13 shows the clearance 
requirements of the 88-pin PGA adapter. The 
optional isolation board (OIB), which provides extra 
electrical buffering and has the same lateral clearance 
requirements as Figures 6.12 and 6.13, adds 
an additional 0.5 inches to the vertical clearance requirement. 
This is illustrated in Figure 6.14. 


Figure 6.13. ICE™-376 Emulator User Cable with 88-Pin PGA Adapter 


Optional Isolation Board (OIB) and the CLK2 
speed reduction: Due to the unbuffered probe design, 
the ICE-376 emulator is susceptible to errors 
on the user’s bus. The OIB allows the ICE-376 emulator 
to function in user systems with faults (shorted 
signals, etc.). After electrical verification the OIB 
may be removed. When the OIB is installed, the user 
system must have a maximum CLK2 frequency of 20 
MHz. 


Figure 6.14. ICE™-376 Emulator User Cable with OIB and PQFP Adapter 


7.0 DIFFERENCES BETWEEN THE 
80376 AND THE 80386 


The following are the major differences between the 
80376 and the 80386. 


1. The 80376 generates byte selects on BHE and 
BLE (like the 8086 and 80286 microprocessors) 
to distinguish the upper and lower bytes on its 
16-bit data bus. The 80386 uses four byte selects, 
BE0-BE3, to distinguish between the different 
bytes on its 32-bit bus. 


2. The 80376 has no bus sizing option. The 80386 
can select between either a 32-bit bus or a 16-bit 
bus by use of the BS16 input. The 80376 has a 
16-bit bus size. 


3. The NA pin operation in the 80376 is identical to 
that of the NA pin on the 80386 with one excep- 
tion: the NA pin of the 80386 cannot be activated 
on 16-bit bus cycles (where BS16 is LOW in the 
80386 case), whereas NA can be activated on 
any 80376 bus cycle. 


4. The contents of all 80376 registers at reset are 
identical to the contents of the 80386 registers at 
reset, except the DX register. The DX register 
contains a component-stepping identifier at reset, 
i.e. 


in 80386, after reset DH = 03H indicates 80386 
DL = revision number; 
in 80376, after reset DH = 33H indicates 80376 
DL = revision number. 


5. The 80386 uses A31 and M/IO as a select for 
numerics coprocessor. The 80376 uses 
A23 and M/IO to select its numerics coprocessor. 


6. The 80386 prefetch unit fetches code in four- 
byte units. The 80376 prefetch unit reads two 
bytes as one unit (like the 80286 microproces- 
sor). In BS16 mode, the 80386 takes two con- 
secutive bus cycles to complete a prefetch re- 
quest. If there is a data read or write request 
after the prefetch starts, the 80386 will fetch 
all four bytes before addressing the new request. 


7. The 80376 has no paging mechanism. 


8. The 80376 starts executing code in what corre- 
sponds to the 80386 protected mode. The 80386 
starts execution in real mode, which is then used 
to enter protected mode. 


9. The 80386 has a virtual-86 mode that allows the 
execution of a real mode 8086 program as a task 
in protected mode. The 80376 has no virtual-86 
mode. 


10. The 80386 maps a 48-bit logical address into a 
32-bit physical address by segmentation and 
paging. The 80376 maps its 48-bit logical ad- 
dress into a 24-bit physical address by segmen- 
tation only. 


11. The 80376 uses the 80387SX numerics coproc- 
essor for floating point operations, while the 
80386 uses the 80387 coprocessor. 


12. The 80386 can execute from 16-bit code seg- 
ments. The 80376 can only execute from 32-bit 
code segments.


13. The 80376 has an input called FLT which three- 
states all bidirectional and output pins, including 
HLDA, when asserted. It is used with ON-Circuit
Emulation (ONCE).
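
Item 4 of the list above lends itself to a start-up check. The following is a minimal sketch in Python, for illustration only: the function name and the sample value are hypothetical, and only the DH identifiers 03H and 33H come from the list above.

```python
# Component identifiers held in DH after reset (from item 4 above).
COMPONENT_BY_DH = {0x03: "80386", 0x33: "80376"}


def decode_reset_dx(dx):
    """Split a 16-bit DX reset value into (component name, revision).

    DH (high byte) is the component identifier and DL (low byte) is the
    revision number. Unknown identifiers are reported as None.
    """
    if not 0 <= dx <= 0xFFFF:
        raise ValueError("DX is a 16-bit register")
    dh = (dx >> 8) & 0xFF
    dl = dx & 0xFF
    return COMPONENT_BY_DH.get(dh), dl


# 0x3305 is a made-up sample value: DH = 33H, DL = 05H.
print(decode_reset_dx(0x3305))
```

The revision byte is returned unchanged because the text does not assign meanings to individual DL values.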


8.0 INSTRUCTION SET 


This section describes the 376 embedded processor 
instruction set. Table 8.1 lists all instructions along 
with instruction encoding diagrams and clock
counts. Further details of the instruction encoding 
are then provided in the following sections, which 
completely describe the encoding structure and the 
definition of all fields occurring within 80376 instruc- 
tions. 


8.1 80376 Instruction Encoding and 
Clock Count Summary 


To calculate elapsed time for an instruction, multiply
the instruction clock count, as listed in Table 8.1 be-
low, by the processor clock period (e.g. 50 ns for an
80376 operating at 20 MHz). The actual clock count
of an 80376 program will average 10% more
than the calculated clock count due to instruction
sequences which execute faster than they can be
fetched from memory.
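
The calculation described above can be sketched as follows. This is an illustrative Python fragment, not part of the data sheet: the function names are hypothetical, and the clock counts passed in the examples are placeholders rather than values taken from Table 8.1.

```python
def clock_period_ns(clock_mhz):
    """Processor clock period in nanoseconds for a clock given in MHz."""
    return 1000.0 / clock_mhz


def elapsed_ns(clock_count, clock_mhz, overhead=0.0):
    """Estimated time for `clock_count` clocks.

    `overhead` is an optional fractional allowance (0.10 for the 10%
    program-level average quoted above); it is not a per-instruction figure.
    """
    return clock_count * clock_period_ns(clock_mhz) * (1.0 + overhead)


# A hypothetical 7-clock instruction on a 20 MHz part: 7 * 50 ns.
print(elapsed_ns(7, 20))
# A hypothetical 1000-clock routine with the 10% average allowance.
print(elapsed_ns(1000, 20, overhead=0.10))
```

The 10% allowance is applied only when asked for, since the text gives it as a program-wide average and not as a property of any single instruction.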


Instruction Clock Count Assumptions:


1. The instruction has been prefetched, decoded,
and is ready for execution.


2. Bus cycles do not require wait states.


3. There are no local bus HOLD requests delaying
processor access to the bus.


4. No exceptions are detected during instruction ex-
ecution.


5. If an effective address is calculated, it does not
use two general register components. One regis-
ter, scaling and displacement can be used within
the clock counts shown. However, if the effec-
tive address calculation uses two general register
components, add 1 clock to the clock count
shown.


6. Memory reference instruction accesses byte or
aligned 16-bit operands.


Instruction Clock Count Notation


— If two clock counts are given, the smaller refers to
a register operand and the larger refers to a
memory operand.


— n = number of times repeated.


— m = number of components in the next instruc-
tion executed, where the entire displacement (if
any) counts as one component, the entire im-
mediate data (if any) counts as one component,
and all other bytes of the instruction and pre-
fix(es) each count as one component.
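
As an illustration of the definition of m, the sketch below counts components for an instruction described by its parts and applies the result to a clock count of the form "7+m". The function names and the example instruction are assumptions made for this sketch only.

```python
def count_components(prefix_bytes=0, opcode_bytes=1, has_modrm=False,
                     has_sib=False, has_displacement=False,
                     has_immediate=False):
    """Return m for an instruction described by its parts.

    The whole displacement and the whole immediate each count once;
    every prefix byte and every other instruction byte counts once.
    """
    m = prefix_bytes + opcode_bytes
    m += 1 if has_modrm else 0
    m += 1 if has_sib else 0
    m += 1 if has_displacement else 0
    m += 1 if has_immediate else 0
    return m


def taken_jump_clocks(base_clocks, next_instruction_m):
    """Clock count of the form base + m, e.g. 7+m for a taken jump."""
    return base_clocks + next_instruction_m


# Assumed jump target: an immediate-to-register move (short form), which is
# one opcode byte plus one immediate field, so m = 2.
m = count_components(opcode_bytes=1, has_immediate=True)
print(m, taken_jump_clocks(7, m))
```

Note that m describes the next instruction executed, so for a taken jump it is computed from the instruction at the jump target.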


Misaligned or 32-Bit Operand Accesses: 


— If an instruction accesses a misaligned 16-bit oper-
and or a 32-bit operand on an even address, add:


2* clocks for read or write.
4** clocks for read and write.


— If an instruction accesses a 32-bit operand on an odd
address, add:


4* clocks for read or write.
8** clocks for read and write.


Wait States:


Wait states add 1 clock per wait state to instruction
execution for each data access.
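
The adjustments listed above (assumption 5, the misaligned and 32-bit operand penalties, and wait states) can be combined as in the sketch below. This is an illustrative Python model under stated assumptions: the function and parameter names are hypothetical, the base clock count is a placeholder, and the number of data accesses is supplied by the caller, not derived from the operand.

```python
def operand_penalty(operand_bits, address, read_and_write=False):
    """Extra clocks for one memory operand, per the rules above.

    Bytes and aligned 16-bit operands cost nothing extra. A misaligned
    16-bit operand or a 32-bit operand at an even address adds 2 clocks
    (4 for read and write); a 32-bit operand at an odd address adds 4 (8).
    """
    odd = address % 2 == 1
    if operand_bits == 8 or (operand_bits == 16 and not odd):
        return 0
    if operand_bits == 16 or (operand_bits == 32 and not odd):
        return 4 if read_and_write else 2
    if operand_bits == 32:
        return 8 if read_and_write else 4
    raise ValueError("operand_bits must be 8, 16 or 32")


def adjusted_clocks(table_clocks, operand_bits=8, address=0,
                    read_and_write=False, two_register_ea=False,
                    wait_states=0, data_accesses=0):
    """Table 8.1 clock count corrected for the stated assumptions."""
    clocks = table_clocks
    clocks += operand_penalty(operand_bits, address, read_and_write)
    clocks += 1 if two_register_ea else 0      # assumption 5
    clocks += wait_states * data_accesses      # 1 clock per wait state per access
    return clocks


# Placeholder 4-clock read of a 32-bit operand at an odd address, using
# two register components, one wait state, and 3 caller-supplied accesses.
print(adjusted_clocks(4, operand_bits=32, address=0x1001,
                      two_register_ea=True, wait_states=1, data_accesses=3))
```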


Table 8.1. 80376 Instruction Set Clock Count Summary


Instruction                                   Format


GENERAL DATA TRANSFER
MOV = Move:
  Register to Register/Memory                 1000100w | mod reg r/m
  Register/Memory to Register                 1000101w | mod reg r/m
  Immediate to Register/Memory                1100011w | mod 000 r/m | immediate data
  Immediate to Register (Short Form)          1011 w reg | immediate data
  Memory to Accumulator (Short Form)          1010000w | full displacement
  Accumulator to Memory (Short Form)          1010001w | full displacement
  Register/Memory to Segment Register         10001110 | mod sreg3 r/m
  Segment Register to Register/Memory         10001100 | mod sreg3 r/m
MOVSX = Move with Sign Extension
  Register from Register/Memory               00001111 | 1011111w | mod reg r/m
MOVZX = Move with Zero Extension
  Register from Register/Memory               00001111 | 1011011w | mod reg r/m
PUSH = Push:
  Register/Memory                             11111111 | mod 110 r/m
  Register (Short Form)                       01010 reg
  Segment Register (ES, CS, SS or DS)         000 sreg2 110
  Segment Register (FS or GS)                 00001111 | 10 sreg3 000
  Immediate                                   011010s0 | immediate data
PUSHA = Push All                              01100000
POP = Pop
  Register/Memory                             10001111 | mod 000 r/m
  Register (Short Form)                       01011 reg
  Segment Register (ES, SS or DS)             000 sreg2 111
  Segment Register (FS or GS)                 00001111 | 10 sreg3 001
POPA = Pop All                                01100001
XCHG = Exchange
  Register/Memory with Register               1000011w | mod reg r/m
  Register with Accumulator (Short Form)      10010 reg
IN = Input from:
  Fixed Port                                  1110010w | port number
  Variable Port                               1110110w
OUT = Output to:
  Fixed Port                                  1110011w | port number
  Variable Port                               1110111w
LEA = Load EA to Register                     10001101 | mod reg r/m


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format


SEGMENT CONTROL
LDS = Load Pointer to DS                      11000101 | mod reg r/m
LES = Load Pointer to ES                      11000100 | mod reg r/m
LFS = Load Pointer to FS                      00001111 | 10110100 | mod reg r/m
LGS = Load Pointer to GS                      00001111 | 10110101 | mod reg r/m
LSS = Load Pointer to SS                      00001111 | 10110010 | mod reg r/m

FLAG CONTROL
CLC = Clear Carry Flag                        11111000
CLD = Clear Direction Flag                    11111100
CLI = Clear Interrupt Enable Flag             11111010
CLTS = Clear Task Switched Flag               00001111 | 00000110
CMC = Complement Carry Flag                   11110101
LAHF = Load AH into Flag                      10011111
POPF = Pop Flags                              10011101
PUSHF = Push Flags                            10011100
SAHF = Store AH into Flags                    10011110
STC = Set Carry Flag                          11111001
STD = Set Direction Flag                      11111101
STI = Set Interrupt Enable Flag               11111011

ARITHMETIC
ADD = Add
  Register to Register                        000000dw | mod reg r/m
  Register to Memory                          0000000w | mod reg r/m
  Memory to Register                          0000001w | mod reg r/m
  Immediate to Register/Memory                100000sw | mod 000 r/m | immediate data
  Immediate to Accumulator (Short Form)       0000010w | immediate data
ADC = Add with Carry
  Register to Register                        000100dw | mod reg r/m
  Register to Memory                          0001000w | mod reg r/m
  Memory to Register                          0001001w | mod reg r/m
  Immediate to Register/Memory                100000sw | mod 010 r/m | immediate data
  Immediate to Accumulator (Short Form)       0001010w | immediate data
INC = Increment
  Register/Memory                             1111111w | mod 000 r/m
  Register (Short Form)                       01000 reg
SUB = Subtract
  Register from Register                      001010dw | mod reg r/m


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format                                      Clock Counts


ARITHMETIC (Continued)
  Register from Memory                        0010100w | mod reg r/m
  Memory from Register                        0010101w | mod reg r/m
  Immediate from Register/Memory              100000sw | mod 101 r/m | immediate data
  Immediate from Accumulator (Short Form)     0010110w | immediate data
SBB = Subtract with Borrow
  Register from Register                      000110dw | mod reg r/m
  Register from Memory                        0001100w | mod reg r/m
  Memory from Register                        0001101w | mod reg r/m
  Immediate from Register/Memory              100000sw | mod 011 r/m | immediate data
  Immediate from Accumulator (Short Form)     0001110w | immediate data
DEC = Decrement
  Register/Memory                             1111111w | mod 001 r/m
  Register (Short Form)                       01001 reg
CMP = Compare
  Register with Register                      001110dw | mod reg r/m
  Memory with Register                        0011100w | mod reg r/m
  Register with Memory                        0011101w | mod reg r/m
  Immediate with Register/Memory              100000sw | mod 111 r/m | immediate data
  Immediate with Accumulator (Short Form)     0011110w | immediate data
NEG = Change Sign                             1111011w | mod 011 r/m
AAA = ASCII Adjust for Add                    00110111
AAS = ASCII Adjust for Subtract               00111111
DAA = Decimal Adjust for Add                  00100111
DAS = Decimal Adjust for Subtract             00101111
MUL = Multiply (Unsigned)
  Accumulator with Register/Memory            1111011w | mod 100 r/m
    Multiplier—Byte                                                                       12-17/15-20
              —Word                                                                       12-25/15-28*
              —Doubleword                                                                 12-41/17-46*
IMUL = Integer Multiply (Signed)
  Accumulator with Register/Memory            1111011w | mod 101 r/m
    Multiplier—Byte                                                                       12-17/15-20
              —Word                                                                       12-25/15-28*
              —Doubleword                                                                 12-41/17-46*
  Register with Register/Memory               00001111 | 10101111 | mod reg r/m
    Multiplier—Byte                                                                       12-17/15-20
              —Word                                                                       12-25/15-28*
              —Doubleword                                                                 12-41/17-46*
  Register/Memory with Immediate to Register  011010s1 | mod reg r/m | immediate data
              —Word                                                                       13-26/14-27*
              —Doubleword                                                                 13-42/16-45*


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


                                                                                          Clock     Number of
Instruction                                   Format                                      Counts    Data Cycles


ARITHMETIC (Continued)
DIV = Divide (Unsigned)
  Accumulator by Register/Memory              1111011w | mod 110 r/m
    Divisor—Byte
           —Word
           —Doubleword
IDIV = Integer Divide (Signed)
  Accumulator by Register/Memory              1111011w | mod 111 r/m
    Divisor—Byte
           —Word
           —Doubleword
AAD = ASCII Adjust for Divide                 11010101 | 00001010
AAM = ASCII Adjust for Multiply               11010100 | 00001010
CBW = Convert Byte to Word                    10011000
CWD = Convert Word to Double Word             10011001

LOGIC
Shift Rotate Instructions
Not Through Carry (ROL, ROR, SAL, SAR, SHL, and SHR)
  Register/Memory by 1                        1101000w | mod TTT r/m
  Register/Memory by CL                       1101001w | mod TTT r/m
  Register/Memory by Immediate Count          1100000w | mod TTT r/m | immed 8-bit data
Through Carry (RCL and RCR)
  Register/Memory by 1                        1101000w | mod TTT r/m                      9/10**    0/2**
  Register/Memory by CL                       1101001w | mod TTT r/m                      9/10**    0/2**
  Register/Memory by Immediate Count          1100000w | mod TTT r/m | immed 8-bit data   9/10**    0/2**

  TTT    Instruction
  000    ROL
  001    ROR
  010    RCL
  011    RCR
  100    SHL/SAL
  101    SHR
  111    SAR

SHLD = Shift Left Double
  Register/Memory by Immediate                00001111 | 10100100 | mod reg r/m | immed 8-bit data
  Register/Memory by CL                       00001111 | 10100101 | mod reg r/m
SHRD = Shift Right Double
  Register/Memory by Immediate                00001111 | 10101100 | mod reg r/m | immed 8-bit data
  Register/Memory by CL                       00001111 | 10101101 | mod reg r/m
AND = And
  Register to Register                        001000dw | mod reg r/m


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format


LOGIC (Continued)
  Register to Memory                          0010000w | mod reg r/m
  Memory to Register                          0010001w | mod reg r/m
  Immediate to Register/Memory                1000000w | mod 100 r/m | immediate data
  Immediate to Accumulator (Short Form)       0010010w | immediate data
TEST = And Function to Flags, No Result
  Register/Memory and Register                1000010w | mod reg r/m
  Immediate Data and Register/Memory          1111011w | mod 000 r/m | immediate data
  Immediate Data and Accumulator
  (Short Form)                                1010100w | immediate data
OR = Or
  Register to Register                        000010dw | mod reg r/m
  Register to Memory                          0000100w | mod reg r/m
  Memory to Register                          0000101w | mod reg r/m
  Immediate to Register/Memory                1000000w | mod 001 r/m | immediate data
  Immediate to Accumulator (Short Form)       0000110w | immediate data
XOR = Exclusive Or
  Register to Register                        001100dw | mod reg r/m
  Register to Memory                          0011000w | mod reg r/m
  Memory to Register                          0011001w | mod reg r/m
  Immediate to Register/Memory                1000000w | mod 110 r/m | immediate data
  Immediate to Accumulator (Short Form)       0011010w | immediate data
NOT = Invert Register/Memory                  1111011w | mod 010 r/m

STRING MANIPULATION
CMPS = Compare Byte Word                      1010011w
INS = Input Byte/Word from DX Port            0110110w
LODS = Load Byte/Word to AL/AX/EAX            1010110w
MOVS = Move Byte Word                         1010010w
OUTS = Output Byte/Word to DX Port            0110111w
SCAS = Scan Byte Word                         1010111w
STOS = Store Byte/Word from AL/AX/EAX         1010101w
XLAT = Translate String                       11010111

REPEATED STRING MANIPULATION
Repeated by Count in CX or ECX
REPE CMPS = Compare String
  (Find Non-Match)                            11110011 | 1010011w


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format                                                  Clock Counts


REPEATED STRING MANIPULATION (Continued)
REPNE CMPS = Compare String
  (Find Match)                                11110010 | 1010011w                                     5 + 9n**
REP INS = Input String                        11110011 | 0110110w                                     7 + 6n*
                                                                                                      27 + 6n*
REP LODS = Load String                        11110011 | 1010110w                                     5 + 6n*
REP MOVS = Move String                        11110011 | 1010010w                                     7 + 4n**
REP OUTS = Output String                      11110011 | 0110111w                                     6 + 5n*
                                                                                                      26 + 5n*
REPE SCAS = Scan String
  (Find Non-AL/AX/EAX)                        11110011 | 1010111w                                     5 + 8n*
REPNE SCAS = Scan String
  (Find AL/AX/EAX)                            11110010 | 1010111w
REP STOS = Store String                       11110011 | 1010101w

BIT MANIPULATION
BSF = Scan Bit Forward                        00001111 | 10111100 | mod reg r/m                       10 + 3n**
BSR = Scan Bit Reverse                        00001111 | 10111101 | mod reg r/m                       10 + 3n**
BT = Test Bit
  Register/Memory, Immediate                  00001111 | 10111010 | mod 100 r/m | immed 8-bit data
  Register/Memory, Register                   00001111 | 10100011 | mod reg r/m
BTC = Test Bit and Complement
  Register/Memory, Immediate                  00001111 | 10111010 | mod 111 r/m | immed 8-bit data
  Register/Memory, Register                   00001111 | 10111011 | mod reg r/m
BTR = Test Bit and Reset
  Register/Memory, Immediate                  00001111 | 10111010 | mod 110 r/m | immed 8-bit data
  Register/Memory, Register                   00001111 | 10110011 | mod reg r/m
BTS = Test Bit and Set
  Register/Memory, Immediate                  00001111 | 10111010 | mod 101 r/m | immed 8-bit data
  Register/Memory, Register                   00001111 | 10101011 | mod reg r/m

CONTROL TRANSFER
CALL = Call
  Direct within Segment                       11101000 | full displacement
  Register/Memory
  Indirect within Segment                     11111111 | mod 010 r/m                                  9+m/12+m
  Direct Intersegment                         10011010 | unsigned full offset, selector               42+m


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format                                      Clock Counts


CONTROL TRANSFER (Continued)
(Direct Intersegment)
  Via Call Gate to Same Privilege Level                                                   64 + m
  Via Call Gate to Different Privilege Level,
  (No Parameters)                                                                         98 + m
  Via Call Gate to Different Privilege Level,
  (x Parameters)                                                                          106 + 8x + m
  From 386 Task to 386 TSS                                                                392
Indirect Intersegment                         11111111 | mod 011 r/m
  Via Call Gate to Same Privilege Level                                                   68 + m
  Via Call Gate to Different Privilege Level,
  (No Parameters)                                                                         102 + m
  Via Call Gate to Different Privilege Level,
  (x Parameters)                                                                          110 + 8x + m
  From 386 Task to 386 TSS                                                                399

JMP = Unconditional Jump
  Short                                       11101011 | 8-bit displacement               7+m
  Direct within Segment                       11101001 | full displacement                7+m
  Register/Memory Indirect within Segment     11111111 | mod 100 r/m                      9+m/14+m
  Direct Intersegment                         11101010 | unsigned full offset, selector   37+m
    Via Call Gate to Same Privilege Level
    From 386 Task to 386 TSS
  Indirect Intersegment                       11111111 | mod 101 r/m
    Via Call Gate to Same Privilege Level
    From 386 Task to 386 TSS
a,c,d,j
a,c,d,j 
a,c,d,j 


a,c,d,j 


a,c,d,j 


a,c,d,j 


a,c,d,j 


a,c,d,j 


a,c,d,j 


c,d,j


a,c,d,j 


a,c,d,j 


a,c,d,j 


a,c,d,j 
a,c,d,j 


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format                                      Clock Counts   Notes


CONTROL TRANSFER (Continued)
RET = Return from CALL:
  Within Segment                              11000011                                    12+m           a,j,p
  Within Segment Adding Immediate to SP       11000010 | 16-bit displ                     12+m           a,j,p
  Intersegment                                11001011                                    36+m           a,c,d,j,p
  Intersegment Adding Immediate to SP         11001010 | 16-bit displ                     36+m           a,c,d,j,p
  to Different Privilege Level
    Intersegment                                                                          80             c,d,j,p
    Intersegment Adding Immediate to SP                                                   80             c,d,j,p

CONDITIONAL JUMPS
NOTE: Times Are Jump “Taken or Not Taken”
JO = Jump on Overflow
  8-Bit Displacement                          01110000 | 8-bit displ                      7+m or 3
  Full Displacement                           00001111 | 10000000 | full displacement     7+m or 3
JNO = Jump on Not Overflow
  8-Bit Displacement                          01110001 | 8-bit displ                      7+m or 3
  Full Displacement                           00001111 | 10000001 | full displacement     7+m or 3
JB/JNAE = Jump on Below/Not Above or Equal
  8-Bit Displacement                          01110010 | 8-bit displ                      7+m or 3
  Full Displacement                           00001111 | 10000010 | full displacement     7+m or 3
JNB/JAE = Jump on Not Below/Above or Equal
  8-Bit Displacement                          01110011 | 8-bit displ                      7+m or 3
  Full Displacement                           00001111 | 10000011 | full displacement     7+m or 3
JE/JZ = Jump on Equal/Zero
  8-Bit Displacement                          01110100 | 8-bit displ                      7+m or 3
  Full Displacement                           00001111 | 10000100 | full displacement     7+m or 3
JNE/JNZ = Jump on Not Equal/Not Zero
  8-Bit Displacement                          01110101 | 8-bit displ                      7+m or 3
  Full Displacement                           00001111 | 10000101 | full displacement     7+m or 3
JBE/JNA = Jump on Below or Equal/Not Above
  8-Bit Displacement                          01110110 | 8-bit displ                      7+m or 3
  Full Displacement                           00001111 | 10000110 | full displacement     7+m or 3
JNBE/JA = Jump on Not Below or Equal/Above
  8-Bit Displacement                          01110111 | 8-bit displ                      7+m or 3
  Full Displacement                           00001111 | 10000111 | full displacement     7+m or 3
JS = Jump on Sign
  8-Bit Displacement                          01111000 | 8-bit displ                      7+m or 3
  Full Displacement                           00001111 | 10001000 | full displacement     7+m or 3


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format


CONDITIONAL JUMPS (Continued)
JNS = Jump on Not Sign
  8-Bit Displacement                          01111001 | 8-bit displ
  Full Displacement                           00001111 | 10001001 | full displacement
JP/JPE = Jump on Parity/Parity Even
  8-Bit Displacement                          01111010 | 8-bit displ
  Full Displacement                           00001111 | 10001010 | full displacement
JNP/JPO = Jump on Not Parity/Parity Odd
  8-Bit Displacement                          01111011 | 8-bit displ
  Full Displacement                           00001111 | 10001011 | full displacement
JL/JNGE = Jump on Less/Not Greater or Equal
  8-Bit Displacement                          01111100 | 8-bit displ
  Full Displacement                           00001111 | 10001100 | full displacement
JNL/JGE = Jump on Not Less/Greater or Equal
  8-Bit Displacement                          01111101 | 8-bit displ
  Full Displacement                           00001111 | 10001101 | full displacement
JLE/JNG = Jump on Less or Equal/Not Greater
  8-Bit Displacement                          01111110 | 8-bit displ
  Full Displacement                           00001111 | 10001110 | full displacement
JNLE/JG = Jump on Not Less or Equal/Greater
  8-Bit Displacement                          01111111 | 8-bit displ
  Full Displacement                           00001111 | 10001111 | full displacement
JECXZ = Jump on ECX Zero                      11100011 | 8-bit displ
  (Address Size Prefix Differentiates JCXZ from JECXZ)
LOOP = Loop ECX Times                         11100010 | 8-bit displ
LOOPZ/LOOPE = Loop with
  Zero/Equal                                  11100001 | 8-bit displ
LOOPNZ/LOOPNE = Loop While
  Not Zero                                    11100000 | 8-bit displ

CONDITIONAL BYTE SET
NOTE: Times Are Register/Memory
SETO = Set Byte on Overflow
  To Register/Memory                          00001111 | 10010000 | mod 000 r/m
SETNO = Set Byte on Not Overflow
  To Register/Memory                          00001111 | 10010001 | mod 000 r/m
SETB/SETNAE = Set Byte on Below/Not Above or Equal
  To Register/Memory                          00001111 | 10010010 | mod 000 r/m


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format                                            Clock Counts


CONDITIONAL BYTE SET (Continued)
SETNB = Set Byte on Not Below/Above or Equal
  To Register/Memory                          00001111 | 10010011 | mod 000 r/m
SETE/SETZ = Set Byte on Equal/Zero
  To Register/Memory                          00001111 | 10010100 | mod 000 r/m
SETNE/SETNZ = Set Byte on Not Equal/Not Zero
  To Register/Memory                          00001111 | 10010101 | mod 000 r/m
SETBE/SETNA = Set Byte on Below or Equal/Not Above
  To Register/Memory                          00001111 | 10010110 | mod 000 r/m
SETNBE/SETA = Set Byte on Not Below or Equal/Above
  To Register/Memory                          00001111 | 10010111 | mod 000 r/m
SETS = Set Byte on Sign
  To Register/Memory                          00001111 | 10011000 | mod 000 r/m
SETNS = Set Byte on Not Sign
  To Register/Memory                          00001111 | 10011001 | mod 000 r/m
SETP/SETPE = Set Byte on Parity/Parity Even
  To Register/Memory                          00001111 | 10011010 | mod 000 r/m
SETNP/SETPO = Set Byte on Not Parity/Parity Odd
  To Register/Memory                          00001111 | 10011011 | mod 000 r/m
SETL/SETNGE = Set Byte on Less/Not Greater or Equal
  To Register/Memory                          00001111 | 10011100 | mod 000 r/m
SETNL/SETGE = Set Byte on Not Less/Greater or Equal
  To Register/Memory                          00001111 | 10011101 | mod 000 r/m
SETLE/SETNG = Set Byte on Less or Equal/Not Greater
  To Register/Memory                          00001111 | 10011110 | mod 000 r/m
SETNLE/SETG = Set Byte on Not Less or Equal/Greater
  To Register/Memory                          00001111 | 10011111 | mod 000 r/m

ENTER = Enter Procedure                       11001000 | 16-bit displacement, 8-bit level
  L = 0
  L = 1                                                                                         14
  L > 1                                                                                         17 + 8(n − 1)
LEAVE = Leave Procedure                       11001001                                          6


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format                  Notes


INTERRUPT INSTRUCTIONS
INT = Interrupt:
  Type Specified                              11001101 | type
    Via Interrupt or Trap Gate
      to Same Privilege Level                                         c,d,j,p
    Via Interrupt or Trap Gate
      to Different Privilege Level                                    c,d,j,p
    From 386 Task to 386 TSS via Task Gate                            c,d,j,p
  Type 3                                      11001100
    Via Interrupt or Trap Gate
      to Same Privilege Level                                         c,d,j,p
    Via Interrupt or Trap Gate
      to Different Privilege Level                                    c,d,j,p
    From 386 Task to 386 TSS via Task Gate                            c,d,j,p
INTO = Interrupt 4 if Overflow Flag Set       11001110
  If OF = 1:
  If OF = 0
    Via Interrupt or Trap Gate
      to Same Privilege Level                                         c,d,j,p
    Via Interrupt or Trap Gate
      to Different Privilege Level                                    c,d,j,p
    From 386 Task to 386 TSS via Task Gate                            c,d,j,p


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format                                Notes


INTERRUPT INSTRUCTIONS (Continued)
BOUND = Interrupt 5 if Detect Value
  Out of Range                                01100010 | mod reg r/m
  If in Range                                                                       a,c,d,j,o,p
  If Out of Range:
    Via Interrupt or Trap Gate
      to Same Privilege Level                                                       c,d,j,p
    Via Interrupt or Trap Gate
      to Different Privilege Level                                                  c,d,j,p
    From 386 Task to 386 TSS via Task Gate                                          c,d,j,p

INTERRUPT RETURN
IRET = Interrupt Return                       11001111
  To the Same Privilege Level (within Task)                                         a,c,d,j,p
  To Different Privilege Level (within Task)                                        a,c,d,j,p
  From 386 Task to 386 TSS                                                          c,d,j,p

PROCESSOR CONTROL
HLT = HALT                                    11110100
MOV = Move to and from Control/Debug/Test Registers
  CR0 from Register                           00001111 | 00100010 | 1 1 eee reg
  Register from CR0                           00001111 | 00100000 | 1 1 eee reg
  DR0-3 from Register                         00001111 | 00100011 | 1 1 eee reg
  DR6-7 from Register                         00001111 | 00100011 | 1 1 eee reg
  Register from DR6-7                         00001111 | 00100001 | 1 1 eee reg
  Register from DR0-3                         00001111 | 00100001 | 1 1 eee reg
NOP = No Operation                            10010000
WAIT = Wait until BUSY Pin is Negated         10011011


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format                     Clock Counts             Notes


PROCESSOR EXTENSION INSTRUCTIONS
  Processor Extension Escape                  11011TTT | mod LLL r/m     See 80387SX Data Sheet   a
  TTT and LLL bits are opcode
  information for coprocessor.

PREFIX BYTES
  Address Size Prefix                         01100111                   0
  LOCK = Bus Lock Prefix                      11110000                   0                        f
  Operand Size Prefix                         01100110                   0
  Segment Override Prefix
    CS:                                       00101110                   0
    DS:                                       00111110                   0
    ES:                                       00100110                   0
    FS:                                       01100100                   0
    GS:                                       01100101                   0
    SS:                                       00110110                   0

PROTECTION CONTROL
ARPL = Adjust Requested Privilege Level
  From Register/Memory                        01100011 | mod reg r/m                    20/21**   2**   a
LAR = Load Access Rights
  From Register/Memory                        00001111 | 00000010 | mod reg r/m         17/18"    Ai    a,c,i,p
LGDT = Load Global Descriptor
  Table Register                              00001111 | 00000001 | mod 010 r/m         13**      af    a,e
LIDT = Load Interrupt Descriptor
  Table Register                              00001111 | 00000001 | mod 011 r/m         13**      3*    a,e
LLDT = Load Local Descriptor
  Table Register to
  Register/Memory                             00001111 | 00000000 | mod 010 r/m         24/28"    5"    a,c,e,p
LMSW = Load Machine Status Word
  From Register/Memory                        00001111 | 00000001 | mod 110 r/m         10/13*    4*    a,e
LSL = Load Segment Limit
  From Register/Memory                        00001111 | 00000011 | mod reg r/m
    Byte-Granular Limit                                                                 24/27*    2"    a,c,i,p
    Page-Granular Limit                                                                 29/32"    2"    a,c,i,p
LTR = Load Task Register
  From Register/Memory                        00001111 | 00000000 | mod 001 r/m         27/31"    4*    a,c,e,p
SGDT = Store Global Descriptor
  Table Register                              00001111 | 00000001 | mod 000 r/m                   3*    a
SIDT = Store Interrupt Descriptor
  Table Register                              00001111 | 00000001 | mod 001 r/m                   3*    a
SLDT = Store Local Descriptor Table Register
  To Register/Memory                          00001111 | 00000000 | mod 000 r/m                   4*    a


Table 8.1. 80376 Instruction Set Clock Count Summary (Continued)


Instruction                                   Format                                    Clock Counts   Notes


PROTECTION CONTROL (Continued)
SMSW = Store Machine
  Status Word                                 00001111 | 00000001 | mod 100 r/m
STR = Store Task Register
  To Register/Memory                          00001111 | 00000000 | mod 001 r/m
VERR = Verify Read Access
  Register/Memory                             00001111 | 00000000 | mod 100 r/m         10/11**        a,c,i,p
VERW = Verify Write Access                    00001111 | 00000000 | mod 101 r/m         15/16**        a,c,i,p


NOTES:

a. Exception 13 fault (general protection violation) will occur if the memory operand in CS, DS, ES, FS or GS cannot be used due to
either a segment limit violation or access rights violation. If a stack limit is violated, an exception 12 (stack segment limit
violation or not present) occurs.
b. For segment load operations, the CPL, RPL and DPL must agree with the privilege rules to avoid an exception 13 fault
(general protection violation). The segment’s descriptor must indicate “present” or exception 11 (CS, DS, ES, FS, GS not
present) occurs. If the SS register is loaded and a stack segment not present is detected, an exception 12 (stack segment limit
violation or not present) occurs.

c. All segment descriptor accesses in the GDT or LDT made by this instruction will automatically assert LOCK to maintain
descriptor integrity in multiprocessor systems.

d. JMP, CALL, INT, RET and IRET instructions referring to another code segment will cause an exception 13 (general
protection violation) if an applicable privilege rule is violated.

e. An exception 13 fault occurs if CPL is greater than 0.

f. An exception 13 fault occurs if CPL is greater than IOPL.

g. The IF bit of the flag register is not updated if CPL is greater than IOPL. The IOPL field of the flag register is updated only
if CPL = 0.

h. Any violation of privilege rules as applied to the selector operand does not cause a protection exception; rather, the zero
flag is cleared.

i. If the coprocessor’s memory operand violates a segment limit or segment access rights, an exception 13 fault (general
protection exception) will occur before the ESC instruction is executed. An exception 12 fault (stack segment limit violation
or not present) will occur if the stack limit is violated by the operand’s starting address.

j. The destination of a JMP, CALL, INT, RET or IRET must be in the defined limit of a code segment or an exception 13 fault
(general protection violation) will occur.

k. If CPL ≤ IOPL

l. If CPL > IOPL

m. LOCK is automatically asserted, regardless of the presence or absence of the LOCK prefix.

n. The 80376 uses an early-out multiply algorithm. The actual number of clocks depends on the position of the most signifi-
cant bit in the operand (multiplier). Clock counts given are minimum to maximum. To calculate actual clocks use the follow-
ing formula:

Actual Clock = if m <> 0 then max (⌈log2 |m|⌉, 3) + 9 clocks;
               if m = 0 then 12 clocks (where m is the multiplier)

o. An exception may occur, depending on the value of the operand.
p. LOCK is asserted during descriptor table accesses.
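
The early-out formula in note n can be evaluated as in the sketch below. It is illustrative Python, not part of the data sheet: the function name is hypothetical, the bracket in the formula is read as a ceiling, and the result corresponds to the lower (register operand) figure of each multiply range in Table 8.1.

```python
def early_out_multiply_clocks(multiplier):
    """Clock count from the note n formula.

    Uses max(ceil(log2|m|), 3) + 9 for a non-zero multiplier m and
    12 clocks for m = 0.
    """
    if multiplier == 0:
        return 12
    magnitude = abs(multiplier)
    # Exact integer ceil(log2(magnitude)); avoids floating-point rounding.
    bits = (magnitude - 1).bit_length()
    return max(bits, 3) + 9


for m in (0, 1, 5, 255, 65535, 2**32 - 1):
    print(m, early_out_multiply_clocks(m))
```

With the largest byte, word and doubleword multipliers the sketch returns 17, 25 and 41, which agrees with the upper ends of the 12-17, 12-25 and 12-41 ranges listed for the multiply instructions.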


8.2 INSTRUCTION ENCODING


Overview 


All instruction encodings are subsets of the general 
instruction format shown in Figure 8.1. Instructions 
consist of one or two primary opcode bytes, possibly 
an address specifier consisting of the “mod r/m” 
byte and “scaled index” byte, a displacement if re- 
quired, and an immediate data field if required. 


Within the primary opcode or opcodes, smaller en-
coding fields may be defined. These fields vary ac-
cording to the class of operation. The fields define
such information as direction of the operation, size
of the displacements, register encoding, or sign ex-
tension.


Almost all instructions referring to an operand in
memory have an addressing mode byte following
the primary opcode byte(s). This byte, the mod r/m
byte, specifies the address mode to be used. Certain
encodings of the mod r/m byte indicate a second
addressing byte, the scale-index-base byte, follows
the mod r/m byte to fully specify the addressing
mode.


Addressing modes can include a displacement im- 
mediately following the mod r/m byte, or scaled in- 
dex byte. If a displacement is present, the possible 
sizes are 8, 16 or 32 bits. 


If the instruction specifies an immediate operand,
the immediate operand follows any displacement
bytes. The immediate operand, if specified, is always
the last field of the instruction.


Figure 8.1 illustrates several of the fields that can 
appear in an instruction, such as the mod field and 
the r/m field, but the Figure does not show all fields. 
Several smaller fields also appear in certain instruc- 
tions, sometimes within the opcode bytes them- 
selves. Table 8.2 is a complete list of all fields ap- 
pearing in the 80376 instruction set. Further ahead, 
following Table 8.2, are detailed tables for each 
field. 


TTTTTTTT | TTTTTTTT | mod TTT r/m | ss index base | d32 | 16 | 8 | none | data32 | 16 | 8 | none
7      0   7      0   7 6 5 3 2 0   7 6 5 3 2 0

opcode                “mod r/m”     “s-i-b”         address              immediate
(one or two bytes)    byte          byte            displacement         data
(T represents an                                    (4, 2, 1 bytes       (4, 2, 1 bytes
opcode bit.)                                        or none)             or none)

                      register and address
                      mode specifier


Figure 8.1. General Instruction Format


Table 8.2. Fields within 80376 Instructions


Field Name   Description                                                              Number of Bits

w            Specifies if Data is Byte or Full Size (Full Size is either 16 or 32 Bits)   1
d            Specifies Direction of Data Operation                                     1
s            Specifies if an Immediate Data Field Must be Sign-Extended                1
reg          General Register Specifier                                                3
mod r/m      Address Mode Specifier (Effective Address can be a General Register)      2 for mod;
                                                                                       3 for r/m
ss           Scale Factor for Scaled Index Address Mode                                2
index        General Register to be used as Index Register                             3
base         General Register to be used as Base Register                              3
sreg2        Segment Register Specifier for CS, SS, DS, ES                             2
sreg3        Segment Register Specifier for CS, SS, DS, ES, FS, GS                     3
tttn         For Conditional Instructions, Specifies a Condition Asserted
             or a Condition Negated                                                    4

Note: Table 8.1 shows encoding of individual instructions.


16-Bit Extensions of the
Instruction Set


Two prefixes, the operand size prefix (66H) and the
effective address size prefix (67H), allow overriding
individually the default selection of operand size and
effective address size. These prefixes may precede
any opcode bytes and affect only the instruction
they precede. If necessary, one or both of the prefix-
es may be placed before the opcode bytes. The
presence of the operand size prefix (66H) and the
effective address prefix (67H) will allow 16-bit data opera-
tion and 16-bit effective address calculations.


For instructions with more than one prefix, the order
of prefixes is unimportant.


Unless specified otherwise, instructions with 8-bit
and 16-bit operands do not affect the contents of
the high-order bits of the extended registers.
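
The effect of the two size prefixes can be sketched as a small prefix scanner. The Python below is an illustration under stated assumptions: the function name is hypothetical, the 32-bit defaults reflect the 80376 executing only from 32-bit code segments, and other prefix bytes (LOCK, segment overrides) are deliberately left out of the scan.

```python
OPERAND_SIZE_PREFIX = 0x66
ADDRESS_SIZE_PREFIX = 0x67


def effective_sizes(instruction_bytes):
    """Return (operand_bits, address_bits, opcode_offset).

    Scans only the two size prefixes described above, in any order.
    Scanning stops at the first byte that is not one of them.
    """
    operand_bits, address_bits = 32, 32   # defaults without prefixes
    offset = 0
    for byte in instruction_bytes:
        if byte == OPERAND_SIZE_PREFIX:
            operand_bits = 16
        elif byte == ADDRESS_SIZE_PREFIX:
            address_bits = 16
        else:
            break
        offset += 1
    return operand_bits, address_bits, offset


# 66H 67H and 67H 66H select the same sizes: prefix order does not matter.
print(effective_sizes(bytes([0x66, 0x67, 0x89, 0xC8])))
print(effective_sizes(bytes([0x67, 0x66, 0x89, 0xC8])))
print(effective_sizes(bytes([0x89, 0xC8])))
```

Because each prefix affects only the instruction it precedes, a decoder built this way would reset both sizes to their defaults before scanning the next instruction.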


Encoding of Instruction Fields


Within the instruction are several fields indicating 
register selection, addressing mode and so on. 


ENCODING OF OPERAND LENGTH (w) FIELD


For any given instruction performing a data opera-
tion, the instruction will execute as a 32-bit opera-
tion. Within the constraints of the operation size, the
w field encodes the operand size as either one byte
or the full operation size, as shown in the table be-
low.


w Field    Operand Size       Normal
           with 66H Prefix    Operand Size

0          8 Bits             8 Bits
1          16 Bits            32 Bits
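
The w field rule can be expressed as a short function. This is an illustrative Python sketch with a hypothetical name, and it assumes that w = 0 always selects a byte operand while w = 1 selects 16 bits with the 66H prefix and 32 bits without it.

```python
def operand_size_bits(w, has_66h_prefix=False):
    """Operand size selected by the w bit and the 66H prefix."""
    if w not in (0, 1):
        raise ValueError("w is a single bit")
    if w == 0:
        return 8                      # byte operation regardless of prefix
    return 16 if has_66h_prefix else 32


print(operand_size_bits(0), operand_size_bits(1),
      operand_size_bits(1, has_66h_prefix=True))
```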


ENCODING OF THE GENERAL 
REGISTER (reg) FIELD 


The general register is specified by the reg field,
which may appear in the primary opcode bytes, or as
the reg field of the "mod r/m" byte, or as the r/m
field of the "mod r/m" byte.




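Encoding of reg Field When w Field
is not Present in Instruction

reg Field    Register Selected     Register Selected
             with 66H Prefix       During 32-Bit
                                   Data Operations

000          AX                    EAX
001          CX                    ECX
010          DX                    EDX
011          BX                    EBX
100          SP                    ESP
101          BP                    EBP
110          SI                    ESI
111          DI                    EDI


Encoding of reg Field When w Field
is Present in Instruction

Register Specified by reg Field
with 66H Prefix

             Function of w Field
reg          (when w = 0)    (when w = 1)

000          AL              AX
001          CL              CX
010          DL              DX
011          BL              BX
100          AH              SP
101          CH              BP
110          DH              SI
111          BH              DI


Register Specified by reg Field
without 66H Prefix

             Function of w Field
reg          (when w = 0)    (when w = 1)

000          AL              EAX
001          CL              ECX
010          DL              EDX
011          BL              EBX
100          AH              ESP
101          CH              EBP
110          DH              ESI
111          BH              EDI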
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ENCODING OF THE SEGMENT 
REGISTER (sreg) FIELD 


The sreg field in certain instructions is a 2-bit field
allowing one of the CS, DS, ES or SS segment registers
to be specified. The sreg field in other instructions
is a 3-bit field, allowing the FS and GS segment
registers to be specified also.


2-Bit sreg2 Field

2-Bit          Segment Register
sreg2 Field    Selected

00             ES
01             CS
10             SS
11             DS


3-Bit sreg3 Field

3-Bit          Segment Register
sreg3 Field    Selected

000            ES
001            CS
010            SS
011            DS
100            FS
101            GS
110            do not use
111            do not use
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ENCODING OF ADDRESS MODE 


Except for special instructions, such as PUSH or
POP, where the addressing mode is pre-determined,
the addressing mode for the current instruction is
specified by addressing bytes following the primary
opcode. The primary addressing byte is the "mod
r/m" byte, and a second byte of addressing information,
the "s-i-b" (scale-index-base) byte, can be
specified.


The s-i-b byte (scale-index-base byte) is specified
when using 32-bit addressing mode and the "mod
r/m" byte has r/m = 100 and mod = 00, 01 or 10.
When the s-i-b byte is present, the 32-bit addressing
mode is a function of the mod, ss, index, and base
fields.


The primary addressing byte, the "mod r/m" byte,
also contains three bits (shown as TTT in Figure 8.1)
sometimes used as an extension of the primary opcode.
The three bits, however, may also be used as
a register field (reg).


When calculating an effective address, either 16-bit
addressing or 32-bit addressing is used. 16-bit addressing
uses 16-bit address components to calculate
the effective address while 32-bit addressing
uses 32-bit address components to calculate the effective
address. When 16-bit addressing is used, the
"mod r/m" byte is interpreted as a 16-bit addressing
mode specifier. When 32-bit addressing is used, the
"mod r/m" byte is interpreted as a 32-bit addressing
mode specifier.
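

The field layout of the "mod r/m" byte and the rule for when an s-i-b byte follows can be sketched as below. This is an illustration rather than a complete decoder; the helper names split_modrm and has_sib are assumptions introduced here, and the layout used (mod in bits 7-6, reg/TTT in bits 5-3, r/m in bits 2-0) is the standard arrangement of this byte.

```python
def split_modrm(modrm: int):
    """Split a "mod r/m" byte into its (mod, reg, rm) fields."""
    mod = (modrm >> 6) & 0b11    # 2-bit mod field
    reg = (modrm >> 3) & 0b111   # 3-bit reg field (or opcode extension TTT)
    rm = modrm & 0b111           # 3-bit r/m field
    return mod, reg, rm


def has_sib(modrm: int, address_bits: int = 32) -> bool:
    """True when an s-i-b byte follows the "mod r/m" byte.

    Per the text: only in 32-bit addressing mode, with r/m = 100 and
    mod = 00, 01 or 10 (mod = 11 selects a register, not memory).
    """
    mod, _reg, rm = split_modrm(modrm)
    return address_bits == 32 and rm == 0b100 and mod != 0b11
```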


The following tables define all encodings of all
16-bit addressing modes and 32-bit addressing modes.
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Encoding of Normal Address Mode with “mod r/m” byte (no “s-i-b” byte present): | 


mod r/m    Effective Address       mod r/m    Effective Address

00 000     DS:[EAX]                10 000     DS:[EAX + d32]
00 001     DS:[ECX]                10 001     DS:[ECX + d32]
00 010     DS:[EDX]                10 010     DS:[EDX + d32]
00 011     DS:[EBX]                10 011     DS:[EBX + d32]
00 100     s-i-b is present        10 100     s-i-b is present
00 101     DS:d32                  10 101     SS:[EBP + d32]
00 110     DS:[ESI]                10 110     DS:[ESI + d32]
00 111     DS:[EDI]                10 111     DS:[EDI + d32]

01 000     DS:[EAX + d8]           11 000     register (see below)
01 001     DS:[ECX + d8]           11 001     register (see below)
01 010     DS:[EDX + d8]           11 010     register (see below)
01 011     DS:[EBX + d8]           11 011     register (see below)
01 100     s-i-b is present        11 100     register (see below)
01 101     SS:[EBP + d8]           11 101     register (see below)
01 110     DS:[ESI + d8]           11 110     register (see below)
01 111     DS:[EDI + d8]           11 111     register (see below)


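Register Specified by reg or r/m
during Normal Data Operations:

             function of w field
mod r/m      (when w=0)    (when w=1)

11 000       AL            EAX
11 001       CL            ECX
11 010       DL            EDX
11 011       BL            EBX
11 100       AH            ESP
11 101       CH            EBP
11 110       DH            ESI
11 111       BH            EDI


Register Specified by reg or r/m
during 16-Bit Data Operations: (66H Prefix)

             function of w field
mod r/m      (when w=0)    (when w=1)

11 000       AL            AX
11 001       CL            CX
11 010       DL            DX
11 011       BL            BX
11 100       AH            SP
11 101       CH            BP
11 110       DH            SI
11 111       BH            DI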
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Encoding of 16-bit Address Mode with “mod r/m” Byte Using 67H Prefix 


mod r/m    Effective Address       mod r/m    Effective Address

00 000     DS:[BX + SI]            10 000     DS:[BX + SI + d16]
00 001     DS:[BX + DI]            10 001     DS:[BX + DI + d16]
00 010     SS:[BP + SI]            10 010     SS:[BP + SI + d16]
00 011     SS:[BP + DI]            10 011     SS:[BP + DI + d16]
00 100     DS:[SI]                 10 100     DS:[SI + d16]
00 101     DS:[DI]                 10 101     DS:[DI + d16]
00 110     DS:d16                  10 110     SS:[BP + d16]
00 111     DS:[BX]                 10 111     DS:[BX + d16]

01 000     DS:[BX + SI + d8]       11 000     register (see below)
01 001     DS:[BX + DI + d8]       11 001     register (see below)
01 010     SS:[BP + SI + d8]       11 010     register (see below)
01 011     SS:[BP + DI + d8]       11 011     register (see below)
01 100     DS:[SI + d8]            11 100     register (see below)
01 101     DS:[DI + d8]            11 101     register (see below)
01 110     SS:[BP + d8]            11 110     register (see below)
01 111     DS:[BX + d8]            11 111     register (see below)
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~ Encoding of 32-bit Address Mode (“mod r/m” byte and “s-i-b” byte present): 


mod base    Effective Address

00 000      DS:[EAX + (scaled index)]
00 001      DS:[ECX + (scaled index)]
00 010      DS:[EDX + (scaled index)]
00 011      DS:[EBX + (scaled index)]
00 100      SS:[ESP + (scaled index)]
00 101      DS:[d32 + (scaled index)]
00 110      DS:[ESI + (scaled index)]
00 111      DS:[EDI + (scaled index)]

01 000      DS:[EAX + (scaled index) + d8]
01 001      DS:[ECX + (scaled index) + d8]
01 010      DS:[EDX + (scaled index) + d8]
01 011      DS:[EBX + (scaled index) + d8]
01 100      SS:[ESP + (scaled index) + d8]
01 101      SS:[EBP + (scaled index) + d8]
01 110      DS:[ESI + (scaled index) + d8]
01 111      DS:[EDI + (scaled index) + d8]

10 000      DS:[EAX + (scaled index) + d32]
10 001      DS:[ECX + (scaled index) + d32]
10 010      DS:[EDX + (scaled index) + d32]
10 011      DS:[EBX + (scaled index) + d32]
10 100      SS:[ESP + (scaled index) + d32]
10 101      SS:[EBP + (scaled index) + d32]
10 110      DS:[ESI + (scaled index) + d32]
10 111      DS:[EDI + (scaled index) + d32]


NOTE:
Mod field in "mod r/m" byte; ss, index, base fields in
"s-i-b" byte.


ss       Scale Factor

00       x1
01       x2
10       x4
11       x8


index    Index Register

000      EAX
001      ECX
010      EDX
011      EBX
100      no index reg**
101      EBP
110      ESI
111      EDI


**IMPORTANT NOTE: 

When index field is 100, indicating “no index register,” then 
ss field MUST equal 00. If index is 100 and ss does not 
equal 00, the effective address is undefined. 
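

The s-i-b rules can be sketched as a small offset calculator. This is a minimal illustration under stated assumptions: the names REG32, split_sib and sib_offset are hypothetical, the caller passes register values in a dictionary and supplies an already sign-extended displacement, and segment selection (DS or SS) is left out.

```python
REG32 = ["EAX", "ECX", "EDX", "EBX", "ESP", "EBP", "ESI", "EDI"]


def split_sib(sib: int):
    """Split an s-i-b byte into its (ss, index, base) fields."""
    return (sib >> 6) & 0b11, (sib >> 3) & 0b111, sib & 0b111


def sib_offset(mod: int, sib: int, regs: dict, disp: int = 0) -> int:
    """Compute the 32-bit offset selected by mod plus an s-i-b byte."""
    ss, index, base = split_sib(sib)
    if index == 0b100:
        # "no index register": ss must be 00, otherwise undefined.
        if ss != 0b00:
            raise ValueError("index = 100 with ss != 00 is undefined")
        scaled_index = 0
    else:
        scaled_index = regs[REG32[index]] << ss  # scale factor x1, x2, x4, x8
    if mod == 0b00 and base == 0b101:
        base_value = 0  # mod 00, base 101: d32 replaces the base register
    else:
        base_value = regs[REG32[base]]
    return (base_value + scaled_index + disp) & 0xFFFFFFFF
```

Raising an error for index = 100 with a non-zero ss is a choice made for this sketch; the data sheet only states that the effective address is undefined in that case.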




ENCODING OF OPERATION
DIRECTION (d) FIELD


In many two-operand instructions the d field is pres- 
ent to indicate which operand is considered the 
source and which is the destination. 


d    Direction of Operation

0    Register/Memory <-- Register
     "reg" Field Indicates Source Operand;
     "mod r/m" or "mod ss index base" Indicates
     Destination Operand

1    Register <-- Register/Memory
     "reg" Field Indicates Destination Operand;
     "mod r/m" or "mod ss index base" Indicates
     Source Operand


ENCODING OF SIGN-EXTEND (s) FIELD 


The s field occurs primarily in instructions with immediate
data fields. The s field has an effect only if
the size of the immediate data is 8 bits and is being
placed in a 16-bit or 32-bit destination.


s    Effect on Immediate Data8       Effect on Immediate Data 16|32

0    None                            None

1    Sign-Extend Data8 to Fill       None
     16-Bit or 32-Bit Destination
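

The sign extension applied when s = 1 can be sketched as follows. The function name sign_extend_data8 is an assumption made for illustration; it returns the bit pattern that would be placed in the 16-bit or 32-bit destination.

```python
def sign_extend_data8(data8: int, dest_bits: int) -> int:
    """Sign-extend an 8-bit immediate to a 16-bit or 32-bit destination."""
    if dest_bits not in (16, 32):
        raise ValueError("destination must be 16 or 32 bits")
    value = data8 & 0xFF
    if value & 0x80:       # bit 7 set: the immediate is negative
        value -= 0x100     # reinterpret as a signed quantity
    return value & ((1 << dest_bits) - 1)  # truncate to the destination width
```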


ENCODING OF CONDITIONAL 
TEST (tttn) FIELD 


For the conditional instructions (conditional jumps
and set on condition), tttn is encoded with n indicating
whether to use the condition (n = 0) or its negation (n = 1),
and ttt giving the condition to test.
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Immediate Data8 __|immediate Data 16|32 
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Below/Not Above or Equal 
Not Below/Above or Equal
Equal/Zero
Not Equal/Not Zero
Below or Equal/Not Above
Not Below or Equal/Above
Sign
Not Sign
Parity/Parity Even
Not Parity/Parity Odd
Less Than/Not Greater or Equal
Not Less Than/Greater or Equal
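

Splitting tttn into its ttt and n parts can be sketched as below. The helper name decode_tttn is hypothetical. The condition list follows the standard x86 ordering of the ttt codes; the entries for Overflow (000) and Less Than or Equal (111) are an assumption here, since those rows do not survive in the listing above.

```python
# ttt values 000 through 111, in standard x86 order (assumed, see lead-in).
CONDITIONS = [
    "Overflow",
    "Below",
    "Equal/Zero",
    "Below or Equal",
    "Sign",
    "Parity",
    "Less Than",
    "Less Than or Equal",
]


def decode_tttn(tttn: int):
    """Return (condition, negated) for a 4-bit tttn field."""
    ttt = (tttn >> 1) & 0b111   # upper three bits select the condition
    n = tttn & 0b1              # n = 0: use condition, n = 1: use its negation
    return CONDITIONS[ttt], bool(n)
```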


ENCODING OF CONTROL OR DEBUG 
REGISTER (eee) FIELD 


For the loading and storing of the Control and Debug 
registers. 


When Interpreted as Control Register Field

eee Code    Reg Name

000         CR0
010         Reserved
011         Reserved

Do not use any other encoding


When Interpreted as Debug Register Field

Do not use any other encoding
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9.0 REVISION HISTORY 


The sections significantly revised since version -003 are:


Section 1.0    Added FLT pin.

Section 4.4    Added description of FLOAT operation and ONCE Mode. Figure 4.20 is new.

Section 4.6    Added revision identifier information for change to CHMOS IV manufacturing process.

Section 5.0    Both packages now specified for 0°C-115°C case temperature operation. Thermal
               resistance values changed.

Section 6.3    Icc Max. specifications changed from 400 mA (cold) and 360 mA (hot) to 275 mA (cold, 16
               MHz) and 305 mA (cold, 20 MHz).

Section 6.4    HLDA Valid Delay, t14, min. changed from 6 ns to 4 ns. Added 20 MHz A.C. specifications in
               Table 6.5. Replaced Capacitive Derating Curves in Figures 6.8-6.10 to reflect new
               manufacturing process. Replaced Icc vs. Frequency data (Figure 6.11) to reflect new
               specifications.


The sections significantly revised since version -002 are:


Section 1.0    Modified Table 1.1 to list pins in alphabetical order.


The sections significantly revised since version -001 are:


Section 2.0    Figure 2.0 was updated to show the 16-bit registers SI, DI, BP and SP.

Section 2.1    Figure 2.2 was updated to show the correct bit polarity for bit 4 in the CR0 register.

Section 2.1    Tables 2.1 and 2.2 were updated to include additional information on the EFLAGS and CR0
               registers.

Section 2.3    Figure 2.3 was updated to more accurately reflect the addressing mechanism of the 80376.

Section 2.6    In the subsection Maskable Interrupt a paragraph was added to describe the effect of
               interrupt gates on the IF EFLAGS bit.

Section 2.8    Table 2.7 was updated to reflect the correct power up condition of the CR0 register.

Section 2.10   Figure 2.6 was updated to show the correct bit positions of the BT, BS and BD bits in the
               DR6 register.

Section 3.0    Figure 3.1 was updated to clearly show the address calculation process.

Section 3.2    The subsection DESCRIPTORS was elaborated upon to clearly define the relationship
               between the linear address space and physical address space of the 80376.

Section 3.2    Figures 3.3 and 3.4 were updated to show the AVL bit field.

Section 3.3    The last sentence in the first paragraph of subsection PROTECTION AND I/O
               PERMISSION BIT MAP was deleted. This was an incorrect statement.

Section 4.1    In the subsection ADDRESS BUS (BHE, BLE, A23-A1) the last sentence in the first paragraph
               was updated to reflect the numerics operand addresses as 8000FCH and 8000FEH.
               Because the 80376 sometimes does a double word I/O access, a second access to 8000FEH
               can be seen.

Section 4.1    The subsection Hold Latencies was updated to describe how 32-bit and unaligned
               accesses are internally locked but do not assert the LOCK signal.

Section 4.2    Table 4.6 was updated to show the correct active data bits during a BLE assertion.
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9.0 REVISION HISTORY (Continued) 


Section 4.4    This section was updated to correctly reflect the pipelining of the address and status of the
               80376 as opposed to "Address Pipelining" which occurs on processors such as the 80286.

Section 4.6    Table 4.7 was updated to show the correct Revision number, 05H.

Section 4.7    Table 4.8 was updated to show the numerics operand register 8000FEH. This address is
               seen when the 80376 does a DWORD operation to the port address 8000FCH.

Section 5.0    In the first paragraph the case temperatures were updated to reflect the 0°C-115°C for the
               ceramic package and 0°C-110°C for the plastic package.

Section 6.2    Table 6.2 was updated to reflect the Case Temperature under Bias specification of -65°C-
               120°C.

Section 6.4    Figure 6.8 vertical axis was updated to reflect "Output Valid Delay (ns)".

Section 6.4    Figure 6.11 was updated to show typical Icc vs Frequency for the 80376.

Section 8.1    The clock counts and opcodes for various instructions were updated to their correct values.

Section 8.2    The section INSTRUCTION ENCODING was appended to the data sheet.
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82370 
INTEGRATED SYSTEM PERIPHERAL 


■ High Performance 32-Bit DMA
  Controller for 16-Bit Bus
  - 16 MBytes/Sec Maximum Data
    Transfer Rate at 16 MHz
  - 8 Independently Programmable
    Channels

■ 20-Source Interrupt Controller
  - Individually Programmable Interrupt
    Vectors
  - 15 External, 5 Internal Interrupts
  - 82C59A Superset

■ Four 16-Bit Programmable Interval
  Timers
  - 82C54 Compatible

■ Software Compatible to 82380

■ Programmable Wait State Generator
  - 0 to 15 Wait States Pipelined
  - 1 to 16 Wait States Non-Pipelined

■ DRAM Refresh Controller

■ 80376 Shutdown Detect and Reset
  Control
  - Software/Hardware Reset

■ High Speed CHMOS III Technology

■ 100-Pin Plastic Quad Flat-Pack Package
  and 132-Pin Pin Grid Array Package
  (See Packaging Handbook Order #231369)

■ Optimized for Use with the 80376
  Microprocessor
  - Resides on Local Bus for Maximum
    Bus Bandwidth

The 82370 is a multi-function support peripheral that integrates system functions necessary in an 80376 
environment. It has eight channels of high performance 32-bit DMA (32-bit internal, 16-bit external) with the 
most efficient transfer rates possible on the 80376 bus. System support peripherals integrated into the 82370 
provide Interrupt Control, Timers, Wait State generation, DRAM Refresh Control, and System Reset logic. 


The 82370's DMA Controller can transfer data between devices of different data path widths using a single
channel. Each DMA channel operates independently in any of several modes. Each channel has a temporary
data storage register for handling non-aligned data without the need for external alignment logic.


[Block diagram of the 82370. Recoverable block labels: local bus; internal bus arbitration and
control; wait state control; DRAM refresh controller; 20-level interrupt controller with 5 internal
requests; CPU reset.]
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Pin Descriptions : : exist on the 82370 for interfacing the system support 
peripherals to their respective system counterparts. 

The 82370 provides all of the signals necessary to Following are the definitions of the individual pins of 
interface an 80376 host processor. It has a separate the 82370. These brief descriptions are provided as 
24-bit address and 16-bit data bus. It also has a set a reference. Each signal is further defined within the 
of control signals to support operation as a bus mas- sections which describe the associated 82370 func- — 
ter or a bus slave. Several special function signals tion. 


Name and Function


ADDRESS BUS: Outputs physical memory or port I/O addresses. See
Address Bus (2.2.3) for additional information.


BYTE ENABLES: Indicate which data bytes of the data bus take part in a bus
cycle. See Byte Enable (2.2.4) for additional information.


DATA BUS: This is the 16-bit data bus. These pins are active outputs during
interrupt acknowledges, during Slave accesses, and when the 82370 is in the
Master Mode.


PROCESSOR CLOCK: This pin must be connected to the processor's clock,
CLK2. The 82370 monitors the phase of this clock in order to remain
synchronized with the CPU. This clock drives all of the internal synchronous
circuitry.


DATA/CONTROL: D/C# is used to distinguish between CPU control cycles
and DMA or CPU data access cycles. It is active as an output only in the
Master Mode.


WRITE/READ: W/R # is used to distinguish between write and read cycles. It 
is active as an output only in the Master Mode. 


MEMORY/IO: M/IO # is used to distinguish between memory and IO 
accesses. It is active as an output only in the Master Mode. 


ADDRESS STATUS: This signal indicates presence of a valid address on the 
address bus. It is active as output only in the Master Mode. ADS # is active 
during the first T-state where addresses and control signals are valid. 


NEXT ADDRESS: Asserted by a peripheral or memory to begin a pipelined
address cycle. This pin is monitored only while the 82370 is in the Master
Mode. In the Slave Mode, pipelining is determined by the current and past
status of the ADS# and READY# signals.


HOLD REQUEST: This is an active-high signal to the Bus Master to request 
control of the system bus. When control is granted, the Bus Master activates 
the hold acknowledge signal (HLDA). 


HOLD ACKNOWLEDGE: This input signal tells the DMA controller that the 
Bus Master has relinquished control of the system bus to the DMA controller. 
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Pin Descriptions (Continued) 


Symbol           Name and Function


DREQ (0-3, 5-7)  DMA REQUEST: The DMA Request inputs monitor requests from peripherals
                 requiring DMA service. Each of the eight DMA channels has one DREQ input.
                 These active-high inputs are internally synchronized and prioritized. Upon
                 request, channel 0 has the highest priority and channel 7 the lowest.


DREQ4/IRQ9#      DMA/INTERRUPT REQUEST: This is the DMA request input for channel 4. It
                 is also connected to the interrupt controller via interrupt request 9. This
                 internal connection is available for DMA channel 4 only. The interrupt input is
                 active low and can be programmed as either edge or level triggered. Either
                 function can be masked by the appropriate mask register. Priorities of the
                 DMA channel and the interrupt request are not related but follow the rules of
                 the individual controllers.

                 Note that this pin has a weak internal pull-up. This causes the interrupt
                 request to be inactive, but the DMA request will be active if there is no
                 external connection made. Most applications will require that either one or the
                 other of these functions be used, but not both. For this reason, it is advised
                 that DMA channel 4 be used for transfers where a software request is more
                 appropriate (such as memory-to-memory transfers). In such an application,
                 DREQ4 can be masked by software, freeing IRQ9# for other purposes.


EOP#             END OF PROCESS: As an output, this signal indicates that the current
                 Requester access is the last access of the currently operating DMA channel.
                 It is activated when Terminal Count is reached. As an input, it signals the DMA
                 channel to terminate the current buffer and proceed to the next buffer, if one
                 is available. This signal may be programmed as an asynchronous or
                 synchronous input.

                 EOP# must be connected to a pull-up resistor. This will prevent erroneous
                 external requests for termination of a DMA process.


EDACK (0-2)      ENCODED DMA ACKNOWLEDGE: These signals contain the encoded
                 acknowledgment of a request for DMA service by a peripheral. The binary
                 code formed by the three signals indicates which channel is active. Channel 4
                 does not have a DMA acknowledge. The inactive state is indicated by the
                 code 100. During a Requester access, EDACK presents the code for the
                 active DMA channel. During a Target access, EDACK presents the inactive
                 code 100.


IRQ (11-23)#     INTERRUPT REQUEST: These are active low interrupt request inputs. The
                 inputs can be programmed to be edge or level sensitive. Interrupt priorities
                 are programmable as either fixed or rotating. These inputs have weak internal
                 pull-up resistors. Unused interrupt request inputs should be tied inactive
                 externally.


INT              INTERRUPT OUT: INT signals that an interrupt request is pending.


CLKIN            TIMER CLOCK INPUT: This is the clock input signal to all of the 82370's
                 programmable timers. It is independent of the system clock input (CLK2).


TOUT1/REF#       TIMER 1 OUTPUT/REFRESH: This pin is software programmable as either
                 the direct output of Timer 1, or as the indicator of a refresh cycle in progress.
                 As REF#, this signal is active during the memory read cycle which occurs
                 during refresh.
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TIMER 2 OUTPUT/INTERRUPT REQUEST: This is the inverted output of 
Timer 2. It is also connected directly to interrupt request 3. External hardware 
can use IRQ3 # if Timer 2 is programmed as OUT =0 (TOUT2# = 1). 


TOUT3#           TIMER 3 OUTPUT: This is the inverted output of Timer 3.


READY#           READY INPUT: This active-low input indicates to the 82370 that the current
                 bus cycle is complete. READY# is sampled by the 82370 both while it is in the
                 Master Mode, and while it is in the Slave Mode.


WSC (0-1)        WAIT STATE CONTROL: WSC0 and WSC1 are inputs used by the Wait-
                 State Generator to determine the number of wait states required by the
                 currently accessed memory or I/O. The binary code on these pins, combined
                 with the M/IO# signal, selects an internal register in which a wait-state count
                 is stored. The combination WSC = 11 disables the wait-state generator.


READYO#          READY OUTPUT: This is the synchronized output of the wait-state generator.
                 It is also valid during CPU accesses to the 82370 in the Slave Mode when the
                 82370 requires wait states. READYO# should feed directly the processor's
                 READY# input.


RESET            RESET: This synchronous input serves to initialize the state of the 82370 and
                 provides basis for the CPURST output. RESET must be held active for at least
                 15 CLK2 cycles in order to guarantee the state of the 82370. After Reset, the
                 82370 is in the Slave Mode with all outputs except timers and interrupts in
                 their inactive states. The state of the timers and interrupt controller must be
                 initialized through software. This input must be active for the entire time
                 required by the host processor to guarantee proper reset.


CHPSEL#          CHIP SELECT: This pin is driven active whenever the 82370 is addressed in a
                 slave bus read or write cycle. It is also active during interrupt acknowledge
                 cycles when the 82370 is driving the Data Bus. It can be used to control the
                 local bus transceivers to prevent contention with the system bus.


CPURST           CPU RESET: CPURST provides a synchronized reset signal for the CPU. It is
                 activated in the event of a software reset command, a processor shut-down
                 detect, or a hardware reset via the RESET pin. The 82370 holds CPURST
                 active for 62 clocks in response to either a software reset command or a shut-
                 down detection. Otherwise CPURST reflects the RESET input.


Table 1. Wait-State Select Inputs


Port       Wait-State Registers     Select Inputs
Address    D7-D4       D3-D0        WSC1   WSC0

           MEMORY 0    I/O 0
           MEMORY 1    I/O 1
           MEMORY 2    I/O 2
           DISABLED                 1      1
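

The register selection described for the WSC inputs can be sketched as follows. This is a rough illustration under assumptions that the surviving table does not confirm: WSC codes 00, 01 and 10 are taken to select wait-state registers 0, 1 and 2, the upper nibble (D7-D4) is used for memory cycles and the lower nibble (D3-D0) for I/O cycles, and the function name select_wait_state_field is hypothetical. Only the WSC = 11 "disabled" case is stated directly in the text.

```python
def select_wait_state_field(wait_regs, wsc: int, is_memory_cycle: bool):
    """Return the 4-bit wait-state field selected by WSC and M/IO#.

    wait_regs holds the three 8-bit wait-state register values.
    Returns None when WSC = 11 (wait-state generator disabled).
    """
    if not 0 <= wsc <= 0b11:
        raise ValueError("WSC is a 2-bit code")
    if wsc == 0b11:
        return None                 # generator disabled
    reg = wait_regs[wsc]            # assumed mapping: code n selects register n
    if is_memory_cycle:
        return (reg >> 4) & 0xF     # D7-D4: MEMORY n field
    return reg & 0xF                # D3-D0: I/O n field
```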
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READY # 
READYO # 
HOLD 
HLDA 
INT 
EOP # 
EDACK2 
EDACK1 
EDACKO 
DREQ7 
DREQ6 
DREQ5 


DREQ3 
DREQ2 
DREQ1 


1.0 FUNCTIONAL OVERVIEW 


The 82370 contains several independent functional 
modules. The following is a brief discussion of the 
components and features of the 82370. Each mod- 
ule has a corresponding detailed section later in this 
data sheet. Those sections should be referred to for 
design and programming information. 


1.1 82370 Architecture 


The 82370 is comprised of several computer system
functions that are normally found in separate LSI
and VLSI components. These include: a high-performance,
eight-channel, 32-bit Direct Memory Access
Controller; a 20-level Programmable Interrupt
Controller which is a superset of the 82C59A; four
16-bit Programmable Interval Timers which are functionally
equivalent to the 82C54 timers; a DRAM Refresh
Controller; a Programmable Wait State Generator;
and system reset logic. The interface to the
82370 is optimized for high-performance operation
with the 80376 microprocessor.


[Pin assignment table, partial]

DREQ4/IRQ9#
DREQ0

Pin    Signal

P14    Vcc
P6     IRQ23#
N6     IRQ22#
M7     IRQ21#
N7     IRQ20#
P7     IRQ19#
M8     IRQ17#
N8     IRQ16#
N9     IRQ14#
M9     IRQ13#
N10    IRQ12#
P10    IRQ11#
M5     WSC0
M6     WSC1
M13    TOUT3#
N13    TOUT2#/IRQ3#
K13    TOUT1/REF#
N11    CLKIN
A1     Vss
C1     Vss
N1     Vss
N2     Vss
A3     Vss
A13    Vss
P13    Vss
B14    Vss
L14    Vss
N14    Vss
B1     Vcc
D1     Vcc


The 82370 operates directly on the 80376 bus. In
the Slave Mode, it monitors the state of the processor
at all times and acts or idles according to the
commands of the host. It monitors the address pipeline
status and generates the programmed number
of wait states for the device being accessed. The
82370 also has logic to control the reset of the 80376 via
hardware or software reset requests and processor
shutdown status.
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After a system reset, the 82370 is in the Slave 
Mode. It appears to the system as an I/O device. It 
becomes a bus master when it is performing DMA 
transfers.


To maintain compatibility with existing software, the
registers within the 82370 are accessed as bytes. If
the internal logic of the 82370 requires a delay before
another access by the processor, wait states
are automatically inserted into the access cycle.
This allows the programmer to write initialization routines,
etc. without regard to hardware recovery
times.


Figure 1-1 shows the basic architectural components
of the 82370. The following sections briefly
discuss the architecture and function of each of the
distinct sections of the 82370.
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Figure 1-1. Architecture of the 82370 
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1.1.1 DMA CONTROLLER 


The 82370 contains a high-performance, 8-channel
DMA Controller. It provides a 32-bit internal data
path. Through its 16-bit external physical data bus, it
is capable of transferring data in any combination of
bytes, words and double-words. The addresses of
both source and destination can be independently
incremented, decremented or held constant, and
cover the entire 24-bit physical address space of the
80376. It can disassemble and assemble non-aligned
data via a 32-bit internal temporary data
storage register. Data transferred between devices
of different data path widths can also be assembled
and disassembled using the internal temporary data
storage register. The DMA Controller can also transfer
aligned data between I/O and memory on the fly,
allowing data transfer rates up to 16 megabytes per
second for an 82370 operating at 16 MHz. Figure
1-2 illustrates the functional components of the DMA
Controller.


There are twenty-four general status and command
registers in the 82370 DMA Controller. Through
these registers any of the channels may be programmed
into any of the possible modes. The operating
modes of any one channel are independent of
the operation of the other channels.


Each channel has three programmable registers
which determine the location and amount of data to
be transferred:


Byte Count Register  - Number of bytes to transfer.
                       (24-bits)

Requester Register   - Byte Address of memory
                       or peripheral which is
                       requesting DMA service.
                       (24-bits)

Target Register      - Byte Address of peripheral
                       or memory which will be
                       accessed. (24-bits)


There are also port addresses which, when accessed,
cause the 82370 to perform specific functions.
The actual data written does not matter; the act
of writing to the specific address causes the command
to be executed. The commands which operate
in this mode are: Master Clear, Clear Terminal Count
Interrupt Request, Clear Mask Register, and Clear
Byte Pointer Flip-Flop.


DMA transfers can be done between all combinations
of memory and I/O: memory-to-memory,
memory-to-I/O, I/O-to-memory, and I/O-to-I/O. DMA
service can be requested through software and/or
hardware. Hardware DMA acknowledge signals are
available for all channels (except channel 4) through
an encoded 3-bit DMA acknowledge bus
(EDACK0-2).
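

External logic that needs a per-channel acknowledge can decode the EDACK bus along the lines sketched below. The function name decode_edack is hypothetical, and the sketch assumes that EDACK2 is the most significant bit and that the binary code equals the channel number; the pin description states only that the code identifies the active channel and that 100 is the inactive code.

```python
INACTIVE_CODE = 0b100  # inactive state; channel 4 has no DMA acknowledge


def decode_edack(edack2: int, edack1: int, edack0: int):
    """Return the acknowledged DMA channel number, or None when inactive."""
    code = ((edack2 & 1) << 2) | ((edack1 & 1) << 1) | (edack0 & 1)
    if code == INACTIVE_CODE:
        return None
    return code
```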
[Figure 1-2 block labels: DREQ inputs and DMA arbitration; control/status registers (Command
Register I, Command Register II, Mode Register I, Mode Register II, Mask Register, Status Register);
channel registers for Channel 0 (base and current byte count, base and current requester address,
base and current target address); Channel 1 (same as Channel 0); temporary register.]
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Figure 1-2. 82370 DMA Controller 
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The 82370 DMA Controller transfers blocks of data 
(buffers) in three modes: Single Buffer, Buffer Auto-
Initialize, and Buffer Chaining. In the Single Buffer
Process, the 82370 DMA Controller is programmed 
to transfer one particular block of data. Successive 
transfers then require reprogramming of the DMA 
channel. Single Buffer transfers are useful in sys- 
tems where it is known at the time the transfer be- 
gins what quantity of data is to be transferred, and 
there is a contiguous block of data area available. 


The Buffer Auto-initialize Process allows the same 
data area to be used for successive DMA transfers 
without having to reprogram the channel. 


The Buffer Chaining Process allows a program to
specify a list of buffer transfers to be executed. The
82370 DMA Controller, through interrupt routines, is
reprogrammed from the list. The channel is reprogrammed
for a new buffer before the current buffer
transfer is complete. This pipelining of the channel
programming process allows the system to allocate
non-contiguous blocks of data storage space, and
transfer all of the data with one DMA process. The
buffers that make up the chain do not have to be in
contiguous locations.


Channel priority can be fixed or rotating. Fixed priority
allows the programmer to define the priority of
DMA channels based on hardware or other fixed parameters.
Rotating priority is used to provide peripherals
access to the bus on a shared basis.


With fixed priority, the programmer can set any
channel to have the current lowest priority. This allows
the user to reset or manually rotate the priority
schedule without reprogramming the command registers.
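

One way to picture setting the lowest-priority channel is as a rotation of the default order, in which channel 0 is highest and channel 7 is lowest. The sketch below is an assumption about how the resulting order can be modeled, not a description of the 82370's internal logic; the function name priority_order is hypothetical.

```python
def priority_order(lowest: int, channels: int = 8):
    """Channels listed from highest to lowest priority.

    Assumes priority wraps around: the channel after `lowest` becomes the
    highest priority, and `lowest` itself ends the list.
    """
    if not 0 <= lowest < channels:
        raise ValueError("no such channel")
    return [(lowest + 1 + i) % channels for i in range(channels)]
```

With lowest = 7 this reproduces the default order stated for the DREQ inputs (channel 0 highest, channel 7 lowest).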


1.1.2 PROGRAMMABLE INTERVAL TIMERS


Four 16-bit programmable interval timers reside
within the 82370. These timers are identical in function
to the timers in the 82C54 Programmable Interval
Timer. All four of the timers share a common
clock input which can be independent of the system
clock. The timers are capable of operating in six different
modes. In all of the modes, the current count
can be latched and read by the 80376 at any time,
making these very versatile event timers. Figure 1-3
shows the functional components of the Programmable
Interval Timers.


The outputs of the timers are directed to key system
functions, making system design simpler. Timer 0 is
routed directly to an interrupt input and is not available
externally. This timer would typically be used to
generate time-keeping interrupts.


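[Figure 1-3 block labels: internal data bus; control registers; control logic; CLKIN; status latch;
output control logic; input/output latches.]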
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_ Figure 1-3. Programmable Interval Timers—Block Diagram 
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Timers 1 and 2 have outputs which are available for 
general timer/counter purposes as well as special 
functions. Timer 1 is routed to the refresh control 
logic to provide refresh timing. Timer 2 is connected
to an interrupt request input to provide other timer
functions. Timer 3 is a general purpose timer/counter
whose output is available to external hardware. It
is also connected internally to the interrupt request
which defaults to the highest priority (IRQ0).


1.1.3 INTERRUPT CONTROLLER 


The 82370 has the equivalent of three enhanced 
82C59A Programmable Interrupt Controllers. These 
controllers can all be operated in the Master Mode, 
but the priority is always as if they were cascaded. 


There are 15 interrupt request inputs provided for
the user, all of which can be inputs from external
slave interrupt controllers. Cascading 82C59As to
these request inputs allows a possible total of 120
external interrupt requests. Figure 1-4 is a block diagram
of the 82370 Interrupt Controller.


Each of the interrupt request inputs can be
individually programmed with its own interrupt vector,
allowing more flexibility in interrupt vector mapping
than was available with the 82C59A. An interrupt is
provided to alert the system that an attempt is being
made to program the vectors in the method of the
82C59A. This provides compatibility for existing
software that used the 82C59A or 8259A with new
designs using the 82370.


In the event of an unrequested or otherwise errone- 
ous interrupt acknowledge cycle, the 82370 Interrupt 
Controller issues a default vector. This vector, pro- 
grammed by the system software, will alert the sys- 
tem of unsolicited interrupts of the 80376. 
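
As a rough illustration of the vector scheme described above, the following Python sketch models one programmable vector per request input plus a default vector that is returned when an acknowledge cycle arrives with nothing pending. The class name, method names and vector values are hypothetical; the lowest-number-first priority and the fallback for an unprogrammed input are assumptions of the sketch, not 82370 register behavior.

```python
class VectorModel:
    """Toy model: one programmable vector per request input, plus a default vector."""

    def __init__(self, default_vector):
        self.default_vector = default_vector & 0xFF
        self.vectors = {}        # request input number -> programmed vector
        self.pending = set()     # request inputs currently pending

    def program_vector(self, irq, vector):
        # Each request input can be given its own vector.
        self.vectors[irq] = vector & 0xFF

    def request(self, irq):
        self.pending.add(irq)

    def acknowledge(self):
        """Return the vector supplied for one interrupt acknowledge cycle."""
        if not self.pending:
            # Unrequested acknowledge cycle: the default vector is issued.
            return self.default_vector
        # Assumption: fixed priority, lowest input number served first.
        irq = min(self.pending)
        self.pending.discard(irq)
        # Assumption: an unprogrammed input falls back to the default vector.
        return self.vectors.get(irq, self.default_vector)
```

A handler installed at the default vector can then count or log unsolicited acknowledge cycles separately from real requests.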


The functions of the 82370 Interrupt Controller are
identical to the 82C59A, except with regard to
programming the interrupt vectors as mentioned above.
Interrupt request inputs are programmable as either 
edge or level triggered and are software maskable. 
Priority can be either fixed or rotating and interrupt 
requests can be nested. 


Enhancements are added to the 82370 for cascading
external interrupt controllers. Master to Slave
handshaking takes place on the data bus, instead of
dedicated cascade lines.


Figure 1-4. 82370 Interrupt Controller: Block Diagram


1.1.4 WAIT STATE GENERATOR


The Wait State Generator is a programmable 
READY generation circuit for the 80376 bus. A pe- 
ripheral requiring wait states can request the Wait 
State Generator to hold the processor’s READY in- 
put inactive for a predetermined number of bus 
states. Six different wait state counts can be
programmed into the Wait State Generator by software;
three for memory accesses and three for I/O
accesses. A block diagram of the 82370 Wait State
Generator is shown in Figure 1-5. 


The peripheral being accessed selects the required 
wait state count by placing a code on a 2-bit wait 
state select bus. This code along with the M/IO# 
signal from the bus master is used to select one of 
six internal 4-bit wait state registers which has been 
programmed with the desired number of wait states. 
From zero to fifteen wait states can be programmed 
into the wait state registers. The Wait State genera- 
tor tracks the state of the processor or current bus 
master at all times, regardless of which device is the 
current bus master and regardless of whether or not 
the wait state generator is currently active. 


The 82370 Wait State Generator is disabled by mak- 
ing the select inputs both high. This allows hardware 
which is intelligent enough to generate its own ready 
signal to be accessed without penalty. As previously
mentioned, deselecting the Wait State Generator
does not disable its ability to determine the proper
number of wait states due to pipeline status in
subsequent bus cycles.


The number of wait states inserted into a pipelined 
bus cycle is the value in the selected wait state
register. If the bus master is operating in the
non-pipelined mode, the Wait State Generator will
increase the number of wait states inserted into the
bus cycle by one.
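
The selection and counting rules above can be sketched in a few lines of Python. This is a minimal model under stated assumptions: the function name and argument layout are hypothetical, and the mapping of select codes 0 to 2 onto the three registers of each group is assumed for illustration. Only the "both select inputs high disables the generator" rule and the extra non-pipelined wait state come from the text.

```python
def inserted_wait_states(mem_regs, io_regs, select_code, is_memory, pipelined):
    """Return the wait states inserted, or None when the generator is deselected.

    mem_regs, io_regs: three 4-bit register values each (0..15).
    select_code: value on the 2-bit wait state select bus.
    is_memory: True when M/IO# indicates a memory cycle.
    """
    if select_code == 0b11:
        # Both select inputs high: the Wait State Generator is disabled.
        return None
    regs = mem_regs if is_memory else io_regs
    # Assumption: select codes 0..2 index the three registers of the group.
    count = regs[select_code] & 0xF
    # A non-pipelined cycle gets one wait state more than the register value.
    return count if pipelined else count + 1
```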


Pipelined        0-15 Wait States
Non-Pipelined    0-16 Wait States


Figure 1-5. 82370 Wait State Generator: Block Diagram


On reset, the Wait State Generator’s registers are
loaded with the value FFH, giving the maximum
number of wait states for any access in which the
wait state select inputs are active.


1.1.5 DRAM REFRESH CONTROLLER 


The 82370 DRAM Refresh Controller consists of a 
24-bit refresh address counter and bus arbitration 
logic. The output of Timer 1 is used to periodically 
request a refresh cycle. When the controller re- 
ceives the request, it requests access to the system 
bus through the HOLD signal. When bus control is 
acknowledged by the processor or current bus mas- 
ter, the refresh controller executes a memory read 
operation at the address currently in the Refresh
Address Register. At the same time, it activates a
refresh signal (REF#) that the memory uses to force a
refresh instead of a normal read. Control of the bus 
is transferred to the processor at the completion of 
this cycle. Typically a refresh cycle will take six clock 
cycles to execute on an 80376 bus. 


The 82370 DRAM Refresh Controller has the high- 
est priority when requesting bus access and will in- 
terrupt any active DMA process. This allows large 
blocks of data to be moved by the DMA controller 
without affecting the refresh function. Also, the DMA
controller is not required to completely relinquish the
bus; the refresh controller simply steals a bus cycle
between DMA accesses.


The amount by which the refresh address is
incremented is programmable to allow for different bus
widths and memory bank arrangements.
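
A minimal sketch of the refresh address counter, assuming it simply wraps modulo 2^24; the wrap behavior is an assumption based on the 24-bit counter width, and the function name and increment values are illustrative only.

```python
ADDRESS_MASK = (1 << 24) - 1   # 24-bit refresh address counter


def next_refresh_address(current, increment):
    """Advance the refresh address by the programmed increment.

    Assumption: the counter wraps around at the top of the 24-bit range.
    """
    return (current + increment) & ADDRESS_MASK
```

An increment of 2, for instance, would step through word addresses on a 16-bit bus, while a larger increment could skip across interleaved banks.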
1.1.6 CPU RESET FUNCTION 


The 82370 contains a special reset function which
can respond to hardware reset signals as well as a
software reset command. The circuit will hold the
80376’s RESET line active while an external hard- 
ware reset signal is present at its RESET input. It 
can also reset the 80376 processor as the result of a 
software command. The software reset command 
causes the 82370 to hold the processor’s RESET 
line active for a minimum of 62 clock cycles. The 
80376 requires that its RESET line be held active for 
a minimum of 80 clock cycles to re-initialize. For a 
more detailed explanation and solution, see Appen- 
dix D (System Notes). 


The 82370 can be programmed to sense the shut- 
down detect code on the status lines from the 
80376. If the Shutdown Detect function is enabled, 
the 82370 will automatically reset the processor. A 
diagnostic register is available which can be used to 
determine the cause of reset. 


1.1.7 REGISTER MAP RELOCATION 


After a hardware reset, the internal registers of the
82370 are located in I/O space beginning at port
address 0000H. The map of the 82370’s registers is
relocatable via a software command. The default
mapping places the 82370 between I/O addresses
0000H and 00DBH. The relocation register allows
this map to be moved to any even 256-byte boundary
in the processor’s 16-bit I/O address space or any
even 64 Kbyte boundary in the 24-bit memory
address space.
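
The effect of relocation on register addresses can be sketched as a base-plus-offset calculation. The helper below is hypothetical and checks only plain 256-byte or 64-Kbyte alignment; it does not try to interpret the word "even" in the paragraph above, and it says nothing about how the relocation register itself is programmed.

```python
IO_ALIGN = 0x100       # 256-byte boundary in the 16-bit I/O space
MEM_ALIGN = 0x10000    # 64-Kbyte boundary in the 24-bit memory space
LAST_OFFSET = 0xDB     # default map spans 0000H through 00DBH


def register_address(base, offset, in_memory=False):
    """Return the address of a register at `offset` within a relocated map."""
    if in_memory:
        align, limit = MEM_ALIGN, 1 << 24
    else:
        align, limit = IO_ALIGN, 1 << 16
    if not 0 <= base < limit or base % align:
        raise ValueError("base is not on a valid boundary")
    if not 0 <= offset <= LAST_OFFSET:
        raise ValueError("offset is outside the register map")
    return base + offset
```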


1.2 Host Interface 


The 82370 is designed to operate efficiently on the 
local bus of an 80376 microprocessor. The control 
signals of the 82370 are identical in function to 
those of the 80376. As a slave, the 82370 operates 
with all of the features available on the 80376 bus. 
When the 82370 is in the Master Mode, it looks iden- 
tical to an 80376 to the connected devices. 


The 82370 monitors the bus at all times, and deter- 
mines whether the current bus cycle is a pipelined or 
non-pipelined access. All of the status signals of the 
processor are monitored.


The control, status, and data registers within the 
82370 are located at fixed addresses relative to 
each other, but the group can be relocated to either
memory or I/O space and to different locations
within those spaces.


As a Slave device, the 82370 monitors the control/ 
status lines of the CPU. The 82370 will generate all 
of the wait states it needs whenever it is accessed. 
This allows the programmer the freedom of accessing
82370 registers without having to insert NOPs in
the program to wait for slower 82370 internal
registers.


The 82370 can determine if a current bus cycle is a 
pipelined or a non-pipelined cycle. It does this by 
monitoring the ADS#, NA# and READY# signals
and thereby keeping track of the current state of the 
80376. 
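
One way to picture this tracking is a small state holder that is fed one bus state at a time. The sketch below is a simplified model and not the 82370's internal logic: it uses only ADS# and READY#, treats a new address seen while an earlier cycle is still waiting for READY# as pipelined, and all names are hypothetical.

```python
class PipelineTracker:
    """Classifies each new address as pipelined or non-pipelined (simplified)."""

    def __init__(self):
        self.cycle_active = False   # a bus cycle is waiting for READY#
        self.next_queued = False    # address of the following cycle already seen

    def sample(self, ads, ready):
        """Feed one bus state; ads/ready are True when the signal is asserted (LOW).

        Returns "pipelined" or "non-pipelined" when a new address is first
        seen in this state, otherwise None.
        """
        result = None
        if ads and not self.cycle_active:
            # Address driven with no cycle outstanding: a T1-style start.
            self.cycle_active = True
            return "non-pipelined"      # READY# is not examined in this state
        if ads and not self.next_queued:
            # Address driven while the current cycle is still outstanding.
            self.next_queued = True
            result = "pipelined"
        if ready and self.cycle_active:
            # Current cycle ends; a queued address becomes the current cycle.
            self.cycle_active = self.next_queued
            self.next_queued = False
        return result
```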


As a bus master, the 82370 looks like an 80376 to 
the rest of the system. This gives the designer
greater flexibility in systems which include the
82370. The designer does not have to alter the inter- 
faces of any peripherals designed to operate with 
the 80376 to accommodate the 82370. The 82370 
will access any peripherals on the bus in the same 
manner as the 80376, including recognizing pipe- 
lined bus cycles. 


The 82370 is accessed as an 8-bit peripheral. The
80376 places the data of all 8-bit accesses either on
D(0-7) or D(8-15). The 82370 will only accept data
on these lines when in the Slave Mode. When in the
Master Mode, the 82370 is a full 16-bit machine,
sending and receiving data in the same manner as
the 80376.


2.0 80376 HOST INTERFACE 


The 82370 contains a set of interface signals to op- 
erate efficiently with the 80376 host processor. 
These signals were designed so that minimal hard- 
ware is needed to connect the 82370 to the 80376. 
Figure 2-1 depicts a typical system configuration 
with the 80376 processor. As shown in the diagram, 
the 82370 is designed to interface directly with the 
80376 bus. 


Since the 82370 resides on the opposite side of the 
data bus transceivers with respect to the rest of the 
system peripherals, it is important to note that the 
transceivers should be controlled so that contention 
between the data bus transceivers and the 82370 
will not occur. In order to ease the implementation of 
this, the 82370 activates the CHPSEL# signal which
indicates that the 82370 has been addressed and 
may output data. This signal should be included in 
the direction and enable control logic of the trans- 
ceiver. When any of the 82370 internal registers are 
read, the data bus transceivers should be disabled 
so that only the 82370 will drive the local bus. 


This section describes the basic bus functions of the 
82370 to show how this device interacts with the 
80376 processor. Other signals which are not direct- 
ly related to the host interface will be discussed in 
their associated functional block description.


Figure 2-1. 80376/82370 System Configuration


2.1 Master and Slave Modes


At any time, the 82370 acts as either a Slave device 
or a Master device in the system. Upon reset, the 
82370 will be in the Slave Mode. In this mode, the 
80376 processor can read/write into the 82370 in- 
ternal registers. Initialization information may be pro- 
grammed into the 82370 during Slave Mode. 


When DMA service (including DRAM Refresh Cycles 
generated by the 82370) is requested, the 82370 will 
request and subsequently get control of the 80376 
local bus. This is done through the HOLD and HLDA 
(Hold Acknowledge) signals. When the 80376 processor
responds by asserting the HLDA signal, the
82370 will switch into Master Mode and perform 
DMA transfers. In this mode, the 82370 is the bus 
master of the system. It can read/write data from/to 
memory and peripheral devices. The 82370 will re- 
turn to the Slave Mode upon completion of DMA 
transfers, or when HLDA is negated. 
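
The mode switching described above reduces to a two-state model. The function below is a hedged sketch with hypothetical names; it covers only the HOLD/HLDA handshake and the two exit conditions named in the text, and leaves out all bus timing.

```python
def next_mode(mode, hold_requested, hlda, dma_done):
    """Return the next mode ("slave" or "master") of the 82370 model.

    hold_requested: the 82370 is asserting HOLD for DMA or refresh service.
    hlda: the processor is asserting HLDA.
    dma_done: the DMA transfers have completed.
    """
    if mode == "slave":
        # The bus is taken over only after HOLD has been acknowledged.
        return "master" if hold_requested and hlda else "slave"
    # Master Mode ends when the transfers complete or HLDA is negated.
    if dma_done or not hlda:
        return "slave"
    return "master"
```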


2.2 80376 Interface Signals 


As mentioned in the Architecture section, the Bus
Interface module of the 82370 (see Figure 1-1)
contains signals that are directly connected to the
80376 host processor. This module has separate
16-bit Data and 24-bit Address busses. Also, it has
additional control signals to support different bus op- 
erations on the system. By residing on the 80376 
local bus, the 82370 shares the same address, data 
and control lines with the processor. The following 
subsections discuss the signals which interface to 
the 80376 host processor. 


2.2.1 CLOCK (CLK2) 


The CLK2 input provides fundamental timing for the 
82370. It is divided by two internally to generate the 
82370 internal clock. Therefore, CLK2 should be 
driven with twice the 80376’s frequency. In order to 
maintain synchronization with the 80376 host proc- 
essor, the 82370 and the 80376 should share a 
common clock source. 

The internal clock consists of two phases: PHI1 and
PHI2. Each CLK2 period is a phase of the internal
clock. PHI2 is usually used to sample input and set
up internal signals and PHI1 is for latching internal
data. Figure 2-2 illustrates the relationship of CLK2
and the 82370 internal clock signals. The CPURST
signal generated by the 82370 guarantees that the
80376 will wake up in phase with PHI1.


2.2.2 DATA BUS (D0-D15)


This 16-bit three-state bidirectional bus provides a 
general purpose data path between the 82370 and 
the system. These pins are tied directly to the corre- 
sponding Data Bus pins of the 80376 local bus. The 
Data Bus is also used for interrupt vectors generated 
by the 82370 in the Interrupt Acknowledge cycle. 


During Slave I/O operations, the 82370 expects a
single byte to be written or read. When the 80376
host processor writes into the 82370, either D0-D7
or D8-D15 will be latched into the 82370, depending
upon whether Byte Enable bit BLE# is 0 or 1 (see
Table 2-1). When the 80376 host processor reads
from the 82370, the single byte of data will be
duplicated on both halves of the Data Bus; i.e. on
D0-D7 and D8-D15.
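
The slave-side byte handling can be sketched as two small helpers. The function names are hypothetical, and the pairing of BLE# = 0 with the low byte (D0-D7) is an assumption taken from the order in which the paragraph lists the two cases.

```python
def slave_read_data(byte):
    """Word driven on D15-D0 for a slave read: the byte on both halves."""
    byte &= 0xFF
    return (byte << 8) | byte


def slave_write_latch(data_bus, ble_n):
    """Byte latched on a slave write; ble_n is the level of BLE# (0 = asserted).

    Assumption: BLE# = 0 selects D0-D7 and BLE# = 1 selects D8-D15.
    """
    if ble_n == 0:
        return data_bus & 0xFF
    return (data_bus >> 8) & 0xFF
```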


During Master Mode, the 82370 can transfer 16-
and 8-bit data between memory (or I/O devices) and
I/O devices (or memory) via the Data Bus.


2.2.3 ADDRESS BUS (A23-A1)


These three-state bidirectional signals are connect- 
ed directly to the 80376 Address Bus. In the Slave 
Mode, they are used as input signals so that the 
processor can address the 82370 internal ports/reg- 
isters. In the Master Mode, they are used as output 
signals by the 82370 to address memory and periph- 
eral devices. The Address Bus is capable of ad- 
dressing 16 Mbytes of physical memory space 
(000000H to FFFFFFH), and 64 Kbytes of I/O
addresses.


2.2.4 BYTE ENABLE (BHE#, BLE#)


The Byte Enable pins BHE# and BLE# select the
specific byte(s) in the word addressed by A1-A23.
During Master Mode operation, they are used as
outputs by the 82370 to address memory and I/O
locations. The definition of BHE# and BLE# is further
illustrated in Table 2-1.


NOTE: 
The 82370 will activate BHE# when output in Mas- 
ter Mode. For a more detailed explanation and its 
solutions, see Appendix D (System Notes). 
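
As an illustration of the byte selection in Table 2-1, the lookup below maps the two active-low byte enables to the byte offsets accessed within the addressed word. The function name is hypothetical, and the sketch covers only which bytes are selected, not what appears on each half of the data bus.

```python
# (BHE#, BLE#) levels -> byte offsets accessed within the addressed word
BYTE_ENABLE_TABLE = {
    (0, 0): (0, 1),   # both bytes
    (0, 1): (1,),     # high byte only
    (1, 0): (0,),     # low byte only
    (1, 1): (),       # not used
}


def bytes_accessed(bhe_n, ble_n):
    """Return the byte offsets selected by the BHE#/BLE# levels (0 = asserted)."""
    return BYTE_ENABLE_TABLE[(bhe_n, ble_n)]
```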


Figure 2-2. CLK2 and 82370 Internal Clock


Table 2-1. Byte Enable Signals

BHE#   BLE#   Byte to be Accessed    Logical Byte Presented on
              Relative to A23-A1     Data Bus During WRITE Only*
                                     D15-D8      D7-D0
 0      0     0, 1
 0      1     1
 1      0     0
 1      1     (Not Used)

U = Undefined
A = Logical D0-D7
B = Logical D8-D15

*NOTE:
Actual number of bytes accessed depends upon the programmed data path width.


2.2.5 BUS CYCLE DEFINITION SIGNALS
(D/C#, W/R#, M/IO#)


These three-state bidirectional signals define the
type of bus cycle being performed. W/R# distinguishes
between write and read cycles. D/C# distinguishes
between processor data and control cycles. M/IO#
distinguishes between memory and I/O cycles.


During Slave Mode, these signals are driven by the
80376 host processor; during Master Mode, they are
driven by the 82370. In either mode, these signals
will be valid when the Address Status (ADS#) is
driven LOW. Exact bus cycle definitions are given in
Table 2-2. Note that some combinations are recognized
as inputs, but not generated as outputs. In the
Master Mode, D/C# is always HIGH.


Table 2-2. Bus Cycle Definition

M/IO#  D/C#  W/R#   As INPUTS               As OUTPUTS
  0     0     0     Interrupt Acknowledge   NOT GENERATED
  0     0     1     UNDEFINED               NOT GENERATED
  0     1     0     I/O Read                I/O Read
  0     1     1     I/O Write               I/O Write
  1     0     0     UNDEFINED               NOT GENERATED
  1     0     1     HALT if A1 = 1          NOT GENERATED
                    SHUTDOWN if A1 = 0
  1     1     0     Memory Read             Memory Read
  1     1     1     Memory Write            Memory Write


2.2.6 ADDRESS STATUS (ADS#)


This signal indicates that a valid address (A1-A23,
BHE#, BLE#) and bus cycle definition (W/R#,
D/C#, M/IO#) is being driven on the bus. In the
Master Mode, it is driven by the 82370 as an output.
In the Slave Mode, this signal is monitored as
an input by the 82370. By the current and past
status of ADS# and the READY# input, the 82370
is able to determine, during Slave Mode, if the next
bus cycle is a pipelined address cycle. ADS# is
asserted during T1 and T2P bus states (see Bus State
Definition).


NOTE:
ADS# must be qualified with the rising edge of
CLK2.
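
The bus cycle definitions of Table 2-2 lend themselves to a lookup table. In the sketch below the bit patterns follow the usual M/IO#, D/C#, W/R# ordering, which is an assumption consistent with the statement that D/C# is always HIGH for cycles the 82370 generates; the function names are hypothetical.

```python
# (M/IO#, D/C#, W/R#) -> (meaning as input, meaning as output or None)
BUS_CYCLES = {
    (0, 0, 0): ("Interrupt Acknowledge", None),
    (0, 0, 1): ("UNDEFINED", None),
    (0, 1, 0): ("I/O Read", "I/O Read"),
    (0, 1, 1): ("I/O Write", "I/O Write"),
    (1, 0, 0): ("UNDEFINED", None),
    (1, 0, 1): ("HALT/SHUTDOWN", None),
    (1, 1, 0): ("Memory Read", "Memory Read"),
    (1, 1, 1): ("Memory Write", "Memory Write"),
}


def decode_input(m_io, d_c, w_r, a1=0):
    """Name the cycle the 82370 sees as a slave; a1 splits HALT from SHUTDOWN."""
    if (m_io, d_c, w_r) == (1, 0, 1):
        return "HALT" if a1 == 1 else "SHUTDOWN"
    return BUS_CYCLES[(m_io, d_c, w_r)][0]


def generated_as_output(m_io, d_c, w_r):
    """True if the 82370 can drive this cycle type as a bus master."""
    return BUS_CYCLES[(m_io, d_c, w_r)][1] is not None
```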


2.2.7 TRANSFER ACKNOWLEDGE (READY#)


This input indicates that the current bus cycle is 
complete. In the Master Mode, assertion of this sig- 
nal indicates the end of a DMA bus cycle. In the 
Slave Mode, the 82370 monitors this input and 
ADS# to detect a pipelined address cycle. This
signal should be tied directly to the READY# input of
the 80376 host processor.


2.2.8 NEXT ADDRESS REQUEST (NA#) 


This input is used to indicate to the 82370 in the
Master Mode that the system is requesting address
pipelining. When driven LOW by either memory or
peripheral devices during Master Mode, it indicates
that the system is prepared to accept a new address
and bus cycle definition signals from the 82370
before the end of the current bus cycle. If this input is
active when sampled by the 82370, the next address
is driven onto the bus, provided a bus request is
already pending internally.


This input pin is monitored only in the Master Mode.
In the Slave Mode, the 82370 uses the ADS# and
READY# signals to determine address pipelining
cycles, and NA# will be ignored.


2.2.9 RESET (RESET, CPURST)


RESET


This synchronous input suspends any operation in
progress and places the 82370 in a known initial
state. Upon reset, the 82370 will be in the Slave
Mode waiting to be initialized by the 80376 host
processor. The 82370 is reset by asserting RESET
for 15 or more CLK2 periods. When RESET is
asserted, all other input pins are ignored, and all other
bus pins are driven to an idle bus state as shown in
Table 2-3. The 82370 will determine the phase of its
internal clock following RESET going inactive.


RESET is level-sensitive and must be synchronous
to the CLK2 signal. The RESET setup and hold time
requirements are shown in Figure 2-3.


*NOTE:
The Interrupt Controller and Programmable Interval Timer
are initialized by software commands.


CPURST 


This output signal is used to reset the 80376 host
processor. It will go active (HIGH) whenever one of
the following events occurs: a) 82370’s RESET input
is active; b) a software RESET command is issued
to the 82370; or c) when the 82370 detects a
processor Shutdown cycle and when this detection
feature is enabled (see CPU Reset and Shutdown
Detect). When activated, CPURST will be held active
for 62 clocks. The timing of CPURST is such that the
80376 processor will be in synchronization with the
82370. This timing is shown in Figure 2-4.


2.2.10 INTERRUPT OUT (INT)


This output pin is used to signal the 80376 host 
processor that one or more interrupt requests (either 
internal or external) are pending. The processor is 
expected to respond with an Interrupt Acknowledge 
cycle. This signal should be connected directly to 
the Maskable Interrupt Request (INTR) input of the 
80376 host processor. 


2.3 82370 Bus Timing 


The 82370 internally divides the CLK2 signal by two 
to generate its internal clock. Figure 2-2 showed the 
relationship of CLK2 and the internal clock which
consists of two phases: PHI1 and PHI2. Each CLK2
period is a phase of the internal clock.


In the 82370, whether it is in the Master or Slave 
Mode, the shortest time unit of bus activity is a bus 
state. A bus state, which is also referred to as a
‘T-state’, is defined as one 82370 PHI2 clock period
(i.e. two CLK2 periods). Recall from Table 2-2 that
the various types of bus cycles in the 82370 are
defined by the M/IO#, D/C# and W/R# signals. Each of
these bus cycles is composed of two or more bus
states. The length of a bus cycle depends on when the
READY# input is asserted (i.e. driven LOW).
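
The timing relationship above can be turned into a quick calculation. The helpers below are an illustrative sketch: the names are hypothetical, the 32 MHz CLK2 figure used in the example is an assumed value (corresponding to a 16 MHz processor clock), and each wait state is taken to add one bus state to the two-state minimum.

```python
def bus_state_ns(clk2_mhz):
    """Duration of one bus state (T-state): two CLK2 periods, in nanoseconds."""
    return 2 * 1000.0 / clk2_mhz


def bus_cycle_ns(clk2_mhz, wait_states=0):
    """Duration of a bus cycle: two bus states plus one per wait state."""
    return (2 + wait_states) * bus_state_ns(clk2_mhz)
```

With CLK2 at 32 MHz a bus state lasts 62.5 ns, so a zero wait state cycle takes 125 ns and each additional wait state adds another 62.5 ns.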


2.3.1 ADDRESS PIPELINING 


The 82370 supports Address Pipelining as an option 
in both the Master and Slave Mode. This feature typ- 
ically allows a memory or peripheral device to oper- 
ate with one less wait state than would otherwise be 
required. This is possible because during a pipelined 
cycle, the address and bus cycle definition of the 
next cycle will be generated by the bus master while 
waiting for the end of the current cycle to be
acknowledged. The pipelined bus is especially well
suited for an interleaved memory environment. For 
16 MHz interleaved memory designs with 100 ns ac- 
cess time DRAMs, zero wait state memory accesses 
can be achieved when pipelined addressing is se- 
lected. 


In the Master Mode, the 82370 is capable of initiat- 
ing, on a cycle-by-cycle basis, either a pipelined or 
non-pipelined access depending upon the state of
the NA# input. If a pipelined cycle is requested
(indicated by NA# being driven LOW), the 82370 will
drive the address and bus cycle definition of the next
cycle as soon as there is an internal bus request
pending.


In the Slave Mode, the 82370 is constantly monitoring
the ADS# and READY# signals on the processor
local bus to determine if the current bus cycle is
a pipelined cycle. If a pipelined cycle is detected, the
82370 will request one less wait state from the
processor if the Wait State Generator feature is selected.
On the other hand, during an 82370 internal register
access in a pipelined cycle, it will make use of the
advance address and bus cycle information. In all
cases, Address Pipelining will result in a savings of
one wait state.


2.3.2 MASTER MODE BUS TIMING 


When the 82370 is in the Master Mode, it will be in 
one of six bus states. Figure 2-5 shows the complete 
bus state diagram of the Master Mode, including 
pipelined address states. As seen in the figure, the
82370 state diagram is very similar to that of the
80376. The major difference is that in the 82370,
there is no Hold state. Also, in the 82370, the
conditions for some state transitions depend upon
whether it is the end of a DMA process.


NOTE: 
The term ‘end of a DMA process’ is loosely defined
here. It depends on the DMA modes of operation
as well as the state of the EOP# and DREQ inputs.
This is explained in detail in section 3, DMA
Controller.


The 82370 will enter the idle state, Ti, upon RESET 
and whenever the internal address is not available at 
the end of a DMA cycle or at the end of a DMA 
process. When address pipelining is not used (NA# 
is not asserted), a new bus cycle always begins with 
state T1. During T1, address and bus cycle definition 
signals will be driven on the bus. T1 is always
followed by T2.


If a bus cycle is not acknowledged (with READY#)
during T2 and NA# is negated, T2 will be repeated.
When the end of the bus cycle is acknowledged during
T2, the following state will be T1 of the next bus
cycle (if the internal address latch is loaded and if
this is not the end of the DMA process). Otherwise,
the Ti state will be entered. Therefore, if the memory
or peripheral accessed is fast enough to respond
within the first T2, the fastest non-pipelined cycle will
take one T1 and one T2 state.


Use of the address pipelining feature allows the
82370 to enter three additional bus states: T1P, T2P
and T2i. T1P is the first bus state of a pipelined bus
cycle. T2P follows T1P (or T2) if NA# is asserted
when sampled. The 82370 will drive the bus with the
address and bus cycle definition signals of the next
cycle during T2P. From the state diagram, it can be
seen that after an idle state Ti, the first bus cycle
must begin with T1, and is therefore a non-pipelined
bus cycle. The next bus cycle can be pipelined if
NA# is asserted and the previous bus cycle ended
in a T2P state. Once the 82370 is in a pipelined
cycle and provided that NA# is asserted in subsequent
cycles, the 82370 will be switching between
T1P and T2P states. If the end of the current bus
cycle is not acknowledged by the READY# input,
the 82370 will extend the cycle by adding T2P
states. The fastest pipelined cycle will consist of one
T1P and one T2P state.


The 82370 will enter state T2i when NA# is assert- 
ed and when one of the following two conditions 
occurs. The first condition is when the 82370 is in
state T2. T2i will be entered if READY# is not
asserted and there is no next address available. This
situation is similar to a wait state. The 82370 will stay 
in T2i for as long as this condition exists. The sec- 
ond condition which will cause the 82370 to enter 
T2i is when the 82370 is in state T1P. Before going 
to state T2P, the 82370 needs to wait in state T2i 
until the next address is available. Also, in both cas- 
es, if the DMA process is complete, the 82370 will 
enter the T2i state in order to finish the current DMA 
cycle. 


Figure 2-6 is a timing diagram showing non-pipelined 
bus accesses in the Master Mode. Figure 2-7 shows 
the timing of pipelined accesses in the Master Mode. 
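
The Master Mode state transitions described in this section can be summarized as a next-state function. The sketch below is reconstructed from the prose and is not a substitute for Figure 2-5: the function name is hypothetical, READY# is given precedence over NA# when both are asserted, and the exits from T1P with NA# negated and from T2i are assumptions modeled on the T2 behavior.

```python
def next_state(state, ready=False, na=False, adav=False, end_of_dma=False):
    """Return the next Master Mode bus state.

    ready, na: True when READY# / NA# is asserted (LOW).
    adav: an internal address is available for another bus cycle.
    end_of_dma: the DMA process is ending.
    """
    more = adav and not end_of_dma   # another bus cycle can be started
    if state == "Ti":
        return "T1" if more else "Ti"
    if state == "T1":
        return "T2"                  # T1 is always followed by T2
    if state == "T2":
        if ready:
            return "T1" if more else "Ti"
        if not na:
            return "T2"              # repeat T2 as a wait state
        return "T2P" if more else "T2i"
    if state == "T1P":
        if not na:
            return "T2"              # assumption: fall back to non-pipelined T2
        return "T2P" if more else "T2i"
    if state == "T2P":
        return "T1P" if ready else "T2P"
    if state == "T2i":
        if ready:
            return "T1" if more else "Ti"   # assumption: same exits as T2
        return "T2P" if more else "T2i"
    raise ValueError("unknown bus state: %r" % (state,))
```

Stepping the function from Ti with an address available gives T1, T2 and then T1 again when READY# is returned in the first T2, which matches the fastest non-pipelined cycle described above.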


Figure 2-5. Master Mode State Diagram

NOTE:
ADAV - Internal Address Available


2.3.3 SLAVE MODE BUS TIMING


Figure 2-8 shows the Slave Mode bus timing in both 
pipelined and non-pipelined cycles when the 82370 
is being accessed. Recall that during Slave Mode,
the 82370 will constantly monitor the ADS# and
READY# signals to determine if the next cycle is
pipelined. In Figure 2-8, the first cycle is
non-pipelined and the second cycle is pipelined. In the
pipelined cycle, the 82370 will start decoding the
address and bus cycle signals one bus state earlier
than in a non-pipelined cycle.


The READY# input signal is sampled by the 80376
host processor to determine the completion of a bus
cycle. This occurs during the end of every T2, T2i
and T2P state. Normally, the output of the 82370
Wait State Generator, READYO#, is directly connected
to the READY# input of the 80376 host
processor and the 82370. In such case, READYO#
and READY# will be identical (see Wait State
Generator).


Figure 2-6. Non-Pipelined Bus Cycles


Figure 2-7. Pipelined Bus Cycles


NA# is shown here only for timing reference. It is not sampled by the 82370 during Slave Mode.
When the 82370 registers are accessed, it will take one or more wait states in pipelined and two or more wait states in
non-pipelined cycle to complete the internal access.


Figure 2-8. Slave Read/Write Timing 


3.0 DMA CONTROLLER 


The 82370 DMA Controller is capable of transferring 
data between any combination of memory and/or
I/O, with any combination of data path widths. The
82370 DMA Controller can be programmed to
accommodate 8- or 16-bit devices. With its 16-bit
external data path, it can transfer data in units of a
byte or a word. Bus bandwidth is optimized through the
use of an internal temporary register which can
disassemble or assemble data to or from either an
aligned or non-aligned destination or source. Figure
3-1 is a block diagram of the 82370 DMA Controller.


The 82370 has eight channels of DMA. Each chan- 
nel operates independently of the others. Within the 
operation of the individual channels, there are many 
different modes of data transfer available. Many of 
the operating modes can be intermixed to provide a 
very versatile DMA controller. 


3.1 Functional Description 


In describing the operation of the 82370’s DMA
Controller, close attention to terminology is required.
Before entering the discussion of the function of the
82370 DMA Controller, the following explanations of
some of the terminology used herein may be of
benefit. First, a few terms for clarification:


DMA PROCESS - A DMA process is the execution
of a programmed DMA task from beginning to end.
Each DMA process requires initial programming by
the host 80376 microprocessor.


BUFFER - A contiguous block of data.


BUFFER TRANSFER - The action required by the
DMA to transfer an entire buffer.


DATA TRANSFER - The DMA action in which a
group of bytes or words are moved between devices
by the DMA Controller. A data transfer operation
may involve movement of one or many bytes.


BUS CYCLE - Access by the DMA to a single byte
or word.


Figure 3-1. 82370 DMA Controller Block Diagram


Each DMA channel consists of three major components.
These components are identified by the contents
of programmable registers which define the
memory or I/O devices being serviced by the DMA.
They are the Target, the Requester, and the Byte
Count. They will be defined generically here and in
greater detail in the DMA register definition section.


The Requester is the device which requires service 
by the 82370 DMA Controller, and makes the re- 
quest for service. All of the control signals which the 
DMA monitors or generates for specific channels 
are logically related to the Requester. Only the
Requester is considered capable of initiating or
terminating a DMA process.


The Target is the device with which the Requester 
wishes to communicate. As far as the DMA process 
is concerned, the Target is a slave which is
incapable of control over the process.


The direction of data transfer can be either from Re- 
quester to Target or from Target to Requester; i.e. 
each can be either a source or a destination. 


The Requester and Target may each be either I/O
or memory. Each has an address associated with it
that can be incremented, decremented, or held
constant. The addresses are stored in the Requester
Address Registers and Target Address Registers,
respectively. These registers have two parts: one
which contains the current address being used in the
DMA process (Current Address Register), and one
which holds the programmed base address (Base
Address Register). The contents of the Base Registers
are never changed by the 82370 DMA Controller.
The Current Registers are incremented or decremented
according to the progress of the DMA process.


The Byte Count is the component of the DMA pro- 
cess which dictates the amount of data which must 
be transferred. Current and Base Byte Count Regis- 
ters are provided. The Current Byte Count Register 
is decremented once for each byte transferred by 
the DMA process. When the register is decremented 
past zero, the Byte Count is considered ‘expired’ 
and the process is terminated or restarted,
depending on the mode of operation of the channel.
The point at which the Byte Count expires is called
‘Terminal Count’ and several status signals are
dependent on this event.
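

As a rough illustration of the Byte Count behavior described above, the following Python sketch models a Current Byte Count that is decremented once per byte and expires when it is decremented past zero. The class and function names are hypothetical. The sketch takes the sentence above literally, so a register loaded with n expires on byte n+1; whether the programmed value should equal the buffer length or the length minus one is not settled by this paragraph alone, so treat that off-by-one as an assumption.

```python
class ByteCount:
    """Hypothetical model of a channel's Base/Current Byte Count pair."""

    def __init__(self, base):
        self.base = base        # programmed value; never changed by the controller
        self.current = base     # working copy, decremented during the process
        self.expired = False    # set at Terminal Count

    def transfer_byte(self):
        """Account for one transferred byte; return True at Terminal Count."""
        if self.expired:
            raise RuntimeError("byte count already expired")
        if self.current == 0:
            # Decrementing past zero: this is the Terminal Count event.
            self.expired = True
        else:
            self.current -= 1
        return self.expired


def bytes_until_terminal_count(base):
    """Count the bytes moved before the Byte Count expires."""
    count = ByteCount(base)
    moved = 0
    while True:
        moved += 1
        if count.transfer_byte():
            return moved
```

The Base value is left untouched throughout, which is what allows an Auto-Initialize style reload from the Base Registers after Terminal Count.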


Each channel of the 82370 DMA Controller also 
contains a 32-bit Temporary Register for use in as- 
sembling and disassembling non-aligned data. The 
operation of this register is transparent to the user, 
although the contents of it may affect the timing of 
some DMA handshake sequences. Since there is 
data storage available for each channel, the DMA
Controller can be interrupted without loss of data. 


To avoid unexpected results, care should be taken
in programming the byte count correctly when
assembling and disassembling non-aligned data. For
example:


Words to Bytes: 

Transferring two words to bytes, but setting the byte
count to three, will result in three bytes transferred
and the final byte flushed.


Bytes to Words: 
Transferring six bytes to three words, but setting the
byte count to five, will result in the sixth byte
transferred being undefined.


The 82370 DMA Controller is a slave on the bus until
a request for DMA service is received via either a 
software request command or a hardware request 
signal. The host processor may access any of the 
control/status or channel registers at any time the 
82370 is a bus slave. Figure 3-2 shows the flow of 
operations that the DMA Controller performs. 


At the time a DMA service request is received, the
DMA Controller issues a bus hold request to the
host processor. The 82370 becomes the bus master
when the host relinquishes the bus by asserting a
hold acknowledge signal. The channel to be serviced
will be the one with the highest priority at the
time the DMA Controller becomes the bus master.
The DMA Controller will remain in control of the bus
until the hold acknowledge signal is removed, or
until the current DMA transfer is complete.


While the 82370 DMA Controller has control of the 
bus, it will perform the required data transfer(s). The 
type of transfer, source and destination addresses, 
and amount of data to transfer are programmed in 
the control registers of the DMA channel which re- 
ceived the request for service. 


At completion of the DMA process, the 82370 will
remove the bus hold request. At this time the 82370
becomes a slave again, and the host returns to
being a master. If there are other DMA channels with
requests pending, the controller will again assert the
hold request signal and restart the bus arbitration
and switching process.


[Flowchart: wait for DMA service request (loop while
no request) -> request pending -> assert bus hold ->
bus hold acknowledged -> arbitrate pending requests ->
execute highest priority transfer -> de-assert bus
hold request]


Figure 3-2. Flow of DMA Controller Operation
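

The flow in Figure 3-2 can be sketched as a small loop. This is only an illustration of the ordering of steps described above, not of the controller's timing. The function name and the trace strings are hypothetical, and the rule that a lower channel number wins arbitration is an assumption made to keep the sketch short.

```python
def service_requests(pending):
    """Return the sequence of steps taken to service `pending` requests.

    `pending` is an iterable of channel numbers. A lower number is treated
    as higher priority, which is an assumption made for this sketch.
    """
    waiting = set(pending)
    trace = []
    while waiting:
        trace.append("assert bus hold request")
        trace.append("hold acknowledged, 82370 is bus master")
        channel = min(waiting)          # arbitrate pending requests
        trace.append(f"execute transfer for channel {channel}")
        waiting.remove(channel)
        trace.append("remove bus hold request, 82370 is bus slave")
    return trace


for step in service_requests([5, 1]):
    print(step)
```

Each pass through the loop corresponds to one complete bus arbitration: the hold request is removed after every DMA process and asserted again if another request is still pending.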


3.2 Interface Signals


There are fourteen control signals dedicated to the
DMA process. They include eight DMA Channel
Requests (DREQn), three Encoded DMA Acknowledge
signals (EDACKn), Processor Hold and Hold
Acknowledge (HOLD, HLDA), and End-of-Process
(EOP#). The DREQn inputs and EDACK (0-2) outputs
are handshake signals to the devices requiring
DMA service. The HOLD output and HLDA input are
handshake signals to the host processor. Figure 3-3
shows these signals and how they interconnect
between the 82370 DMA Controller, and the Requester
and Target devices.


Figure 3-3. Requester, Target and DMA Controller Interconnection


3.2.1 DREQn and EDACK (0-2) 


These signals are the handshake signals between
the peripheral and the 82370. When the peripheral
requires DMA service, it asserts the DREQn signal
of the channel which is programmed to perform the
service. The 82370 arbitrates the DREQn against
other pending requests and begins the DMA process
after finishing other higher priority processes.


When the DMA service for the requested channel is 
in progress, the EDACK (0-2) signals represent the 
DMA channel which is accessing the Requester. 


The 3-bit code on the EDACK (0-2) lines indicates
the number of the channel presently being serviced.
Table 3-2 shows the encoding of these signals. Note
that Channel 4 does not have a corresponding
hardware acknowledge.


The DMA acknowledge (EDACK) signals indicate
the active channel only during DMA accesses to the
Requester. During accesses to the Target, EDACK
(0-2) has the idle code (100). EDACK (0-2) can
thus be used to select a Requester device during a
transfer.


DREQn can be programmed as either an Asynchronous
or Synchronous input. See section 3.4.1 for details
on synchronous versus asynchronous operation
of these pins.


Table 3-2. EDACK Encoding During a DMA Transfer

EDACK2  EDACK1  EDACK0  Active Channel
  0       0       0     0
  0       0       1     1
  0       1       0     2
  0       1       1     3
  1       0       0     Target Access
  1       0       1     5
  1       1       0     6
  1       1       1     7


The EDACKn signals are always active. They either
indicate ‘no acknowledge’ or they indicate a bus
access to the requester. The acknowledge code is
either 100, for an idle DMA or during a DMA access to
the Target, or ‘n’ during a Requester access, where
n is the binary value representing the channel. A
simple 3-line to 8-line decoder can be used to
provide discrete acknowledge signals for the
peripherals.
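

The decoding described above can be sketched in a few lines of Python. This is a minimal illustration, assuming EDACK2 is the most significant bit of the 3-bit code (so that the idle code 100 reads EDACK2=1, EDACK1=0, EDACK0=0). The function names are hypothetical and are not part of any Intel software.

```python
IDLE_CODE = 0b100  # idle DMA or Target access, per the text above


def decode_edack(edack2, edack1, edack0):
    """Return the channel number being acknowledged, or None for code 100."""
    for bit in (edack2, edack1, edack0):
        if bit not in (0, 1):
            raise ValueError("EDACK lines must be 0 or 1")
    code = (edack2 << 2) | (edack1 << 1) | edack0
    return None if code == IDLE_CODE else code


def discrete_acknowledges(edack2, edack1, edack0):
    """Mimic a 3-line to 8-line decoder: at most one active output.

    Output 4 corresponds to the idle code, so it is never reported active.
    """
    channel = decode_edack(edack2, edack1, edack0)
    return [channel == n for n in range(8)]
```

Because code 100 doubles as the idle code, the decoder output for position 4 carries no acknowledge, which is consistent with Channel 4 having no hardware acknowledge.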


3.2.2 HOLD AND HLDA 


The Hold Request (HOLD) and Hold Acknowledge 
(HLDA) signals are the handshake signals between 
the DMA Controller and the host processor. HOLD is 
an output from the 82370 and HLDA is an input. 
HOLD is asserted by the DMA Controller when there 
is a pending DMA request, thus requesting the
processor to give up control of the bus so the DMA
process can take place. The 80376 responds by
asserting HLDA when it is ready to relinquish control
of the bus.


The 82370 will begin operations on the bus one
clock cycle after the HLDA signal goes active. For 
this reason, other devices on the bus should be in 
the slave mode when HLDA is active. 


HOLD and HLDA should not be used to gate or se- 
lect peripherals requesting DMA service. This is be- 
cause of the use of DMA-like operations by the 
DRAM Refresh Controller. The Refresh Controller is 
arbitrated with the DMA Controller for control of the 
bus, and refresh cycles have the highest priority. A 
refresh cycle will take place between DMA cycles 
without relinquishing bus control. See section 3.4.3 
for a more detailed discussion of the interaction be- 
tween the DMA Controller and the DRAM Refresh 
Controller. 


3.2.3 EOP#


EOP# is a bi-directional signal used to indicate the
end of a DMA process. The 82370 activates this as
an output during the T2 states of the last Requester
bus cycle for which a channel is programmed to
execute. The Requester should respond by either
withdrawing its DMA request, or interrupting the host
processor to indicate that the channel needs to be
programmed with a new buffer. As an input, this
signal is used to tell the DMA Controller that the
peripheral being serviced does not require any more
data to be transferred. This indicates that the current
buffer is to be terminated.


EOP# can be programmed as either an Asynchro- 
nous or a Synchronous input. See section 3.4.1 for 
details on synchronous versus asynchronous opera- 
tion of this pin. 


3.3 Modes of Operation 


The 82370 DMA Controller has many independent
operating functions. When designing peripheral
interfaces for the 82370 DMA Controller, all of the
functions or modes must be considered. All of the
channels are independent of each other (except in
priority of operation) and can operate in any of the
modes. Many of the operating modes, though
independently programmable, affect the operation of
other modes. Because of the large number of
combinations possible, each programmable mode is
discussed here with its effects on the operation of
other modes. The entire list of possible combinations
will not be presented.


Table 3-1 shows the categories of DMA features
available in the 82370. Each of the five major
categories is independent of the others. The
sub-categories are the available modes within the
major function or mode category. The following
sections explain each mode or function and its
relation to other features.


Table 3-1. DMA Operating Modes

I. TARGET/REQUESTER DEFINITION
   a. Data Transfer Direction
   b. Device Type

II. BUFFER PROCESSES
   a. Single Buffer Process
   b. Buffer Auto-Initialize Process
   c. Buffer Chaining Process

III. DATA TRANSFER/HANDSHAKE MODES
   a. Single Transfer Mode
   b. Demand Transfer Mode
   c. Block Transfer Mode
   d. Cascade Mode

IV. PRIORITY ARBITRATION
   a. Fixed
   b. Rotating
   c. Programmable Fixed

V. BUS OPERATION
   a. Fly-By (Single-Cycle)/Two-Cycle
   b. Data Path Width
   c. Read, Write, or Verify Cycles


3.3.1 TARGET/REQUESTER DEFINITION 


All DMA transfers involve three devices: the DMA 
Controller, the Requester, and the Target. Since the 
devices to be accessed by the DMA Controller vary 
widely, the operating characteristics of the DMA 
Controller must be tailored to the Requester and 
Target devices. 


The Requester can be defined as either the source 
or the destination of the data to be transferred. This 
is done by specifying a Write or a Read transfer, 
respectively. In a Read transfer, the Target is the 
data source and the Requester is the destination for 
the data. In a Write transfer, the Requester is the 
source and the Target is the destination. 


The Requester and Target addresses can each be 
independently programmed to be incremented, dec- 
remented, or held constant. As an example, the 
82370 is capable of reversing a string of data by 
having the Requester address increment and the 
Target address decrement in a memory-to-memory 
transfer. 
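

The string-reversal example above can be made concrete with a short Python sketch. It models only the address stepping, using a bytearray as stand-in memory and a byte-wide Write transfer (Requester as source). The function name, constants, and memory layout are hypothetical choices made for illustration.

```python
INCREMENT, DECREMENT, HOLD = 1, -1, 0


def dma_copy(memory, requester_addr, target_addr, length,
             requester_step=INCREMENT, target_step=INCREMENT):
    """Move `length` bytes from the Requester address to the Target address.

    Models a byte-wide Write transfer (Requester is the source).
    """
    for _ in range(length):
        memory[target_addr] = memory[requester_addr]
        requester_addr += requester_step
        target_addr += target_step
    return memory


memory = bytearray(b"DMA82370" + bytes(8))
# Source starts at offset 0 and counts up; destination starts at the
# last byte of the 8-byte output area (offset 15) and counts down.
dma_copy(memory, requester_addr=0, target_addr=15, length=8,
         requester_step=INCREMENT, target_step=DECREMENT)
print(bytes(memory[8:16]))  # the input string reversed
```

Passing HOLD as a step keeps that address constant, which is the natural setting for a fixed I/O port.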


3.3.2 BUFFER TRANSFER PROCESSES


The 82370 DMA Controller allows three programma- 
ble Buffer Transfer Processes. These processes de- 
fine the logical way in which a buffer of data is ac- 
cessed by the DMA. 


The three Buffer Transfer Processes include the Sin- 
gle Buffer Process, the Buffer Auto-Initialize Pro- 
cess, and the Buffer Chaining Process. These pro- 
cesses require special programming considerations. 
See the DMA Programming section for more details 
on setting up the Buffer Transfer Processes. 


Single Buffer Process


The Single Buffer Process allows the DMA channel 
to transfer only one buffer of data. When the buffer 
has been completely transferred (Current Byte 
Count decremented past zero or EOP# input active),
the DMA process ends and the channel becomes
idle. In order for that channel to be used
again, it must be reprogrammed.


The Single Buffer Process is usually used when the 
amount of data to be transferred is known exactly, 
and it is also known that there is not likely to be any 
data to follow before the operating system can re- 
program the channel. 


Buffer Auto-Initialize Process 


The Buffer Auto-Initialize Process allows multiple
groups of data to be transferred to or from a single
buffer. This process does not require reprogramming.
The Current Registers are automatically reprogrammed
from the Base Registers when the current
process is terminated, either by an expired Byte
Count or by an external EOP# signal. The data
transferred will always be between the same Target
and Requester.


The auto-initialization/process-execution cycle is
repeated until the channel is either disabled or
re-programmed.


Buffer Chaining Process 


The Buffer Chaining Process is useful for transfer- 
ring large quantities of data into non-contiguous 
buffer areas. In this process, a single channel is 
used to process data from several buffers, while 
having to program the channel only once. Each new 
buffer is programmed in a pipelined operation that 
provides the new buffer information while the old 
buffer is being processed. The chain is created by 
loading new buffer information while the 82370 DMA 
Controller is processing the Current Buffer. When 
the Current Buffer expires, the 82370 DMA Controller
automatically restarts the channel using the new
buffer information.


Loading the new buffer information is done by an
interrupt routine which is requested by the 82370.
Interrupt Request 1 (IRQ1) is tied internally to the
82370 DMA Controller for this purpose. IRQ1 is
generated by the 82370 when the new buffer
information is loaded into the channel's Current
Registers, leaving the Base Registers ‘empty’. The
interrupt service routine loads new buffer information
into the Base Registers. The host processor is
required to load the information for another buffer
before the current Byte Count expires. The process
repeats until the host programs the channel back to
single buffer operation, or until the channel runs out
of buffers.


The channel runs out of buffers when the Current 
Buffer expires and the Base Registers have not yet 
been loaded with new buffer information. When this 
occurs, the channel must be reprogrammed. 


If an external EOP# is encountered while executing
a Buffer Chaining Process, the current buffer is con- 
sidered expired and the new buffer information is 
loaded into the Current Registers. If the Base Regis- 
ters are ‘empty’, the chain is terminated. 


The channel uses the Base Target Address Register
as an indicator of whether or not the Base Registers
are full. When the most significant byte of the Base
Target Register is loaded, the channel considers all
of the Base Registers loaded, and removes the
interrupt request. This requires that the other Base
Registers (Base Requester Address, Base Byte
Count) be loaded before the Base Target Address
Register. The reason for implementing the reloading
process this way is that, for most applications, the
Byte Count and the Requester will not change from
one buffer to the next, and therefore do not need to
be reprogrammed. The details of programming the
channel for the Buffer Chaining Process can be found
in the section on DMA programming.
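

The reload rule above can be sketched as a small state model. This is an illustration only: the class, its attribute names, and the whole-register `write_base` call are hypothetical, and a write to the Base Target Address stands in for loading that register's most significant byte. The sketch also assumes IRQ1 is already pending when the chain starts, which this section does not state.

```python
class ChainedChannel:
    """Hypothetical model of one channel running the Buffer Chaining Process."""

    def __init__(self, requester, byte_count, target):
        self.current = {"requester": requester, "count": byte_count,
                        "target": target}
        self.base = dict(self.current)  # values for the next buffer
        self.base_full = False          # Base Registers start out 'empty'
        self.irq1 = True                # assumed: request the next buffer
        self.active = True

    def write_base(self, name, value):
        """Host write to a Base Register (whole register at once)."""
        self.base[name] = value
        if name == "target":
            # Stands in for loading the most significant byte of the Base
            # Target Address: the channel now treats all Base Registers as
            # loaded and removes the interrupt request.
            self.base_full = True
            self.irq1 = False

    def buffer_expired(self):
        """Terminal Count or external EOP# on the current buffer."""
        if not self.base_full:
            self.active = False         # ran out of buffers; must reprogram
            return
        self.current = dict(self.base)  # restart with the new buffer
        self.base_full = False
        self.irq1 = True


ch = ChainedChannel(requester=0x300, byte_count=511, target=0x1000)
# Interrupt service routine: only the Target address changes, and it is
# written last so the channel sees a complete set of Base Registers.
ch.write_base("target", 0x8000)
ch.buffer_expired()
print(hex(ch.current["target"]), ch.current["count"], ch.irq1)
```

Writing the Target address last is what lets the Byte Count and Requester values carry over unchanged from one buffer to the next.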


3.3.3 DATA TRANSFER MODES


Three Data Transfer modes are available in the
82370 DMA Controller. They are the Single Transfer,
Block Transfer, and Demand Transfer Modes.
These transfer modes can be used in conjunction
with any one of three Buffer Transfer modes: Single
Buffer, Auto-Initialized Buffer and Buffer Chaining.
Any Data Transfer Mode can be used under any of
the Buffer Transfer Modes. These modes are
independently available for all DMA channels.


Different devices being serviced by the DMA
Controller require different handshaking sequences for
data transfers to take place. Three handshaking
modes are available on the 82370, giving the
designer the opportunity to use the DMA Controller as
efficiently as possible. The speed at which data can
be presented or read by a device can affect the way
a DMA Controller uses the host’s bus, thereby
affecting not only data throughput during the DMA
process, but also affecting the host’s performance by
limiting its access to the bus.


Single Transfer Mode 


In the Single Transfer Mode, one data transfer to or 
from the Requester is performed by the DMA Con- 
troller at a time. The DREQn input is arbitrated and 
the HOLD/HLDA sequence is executed for each 
transfer. Transfers continue in this manner until the 
Byte Count expires, or until EOP # is sampled active. 
If the DREQn input is held active continuously, the 
entire DREQ-HOLD-HLDA-DACK sequence is re- 
peated over and over until the programmed number 
of bytes has been transferred. Bus control is re- 
leased to the host between each transfer. Figure 3-4 
shows the logical flow of events which make up a 
buffer transfer using the Single Transfer Mode. Re- 
fer to section 3.4 for an explanation of the bus con- 
trol arbitration procedure. 


The Single Transfer Mode is used for devices which
require complete handshake cycles with each data
access. Data is transferred to or from the Requester
only when the Requester is ready to perform the
transfer. Each transfer requires the entire
DREQ-HOLD-HLDA-DACK handshake cycle. Figure 3-5
shows the timing of the Single Transfer Mode cycle.


[Flowchart: initialize buffer -> wait for DREQn or
software request -> execute one Requester transfer ->
end of buffer]


Figure 3-4. Buffer Transfer in Single Transfer Mode


NOTE:
The Single Transfer Mode is more efficient (15%-20%)
in the case where the source is the Target. Because
of the internal pipeline of the 82370 DMA Controller,
two idle states are added at the end of a transfer in
the case where the source is the Requester.


Figure 3-5. DMA Single Transfer Mode 


Block Transfer Mode


In the Block Transfer Mode, the DMA process is
initiated by a DMA request and continues until the
Byte Count expires, or until EOP# is activated by the
Requester. The DREQn signal need only be held active
until the first Requester access. Only a refresh cycle
will interrupt the block transfer process.


Figure 3-6 illustrates the operation of the DMA during
the Block Transfer Mode. Figure 3-7 shows the
timing of the handshake signals during Block Mode
Transfers.


[Flowchart: wait for DREQn -> transfer data until
end of buffer]


Figure 3-6. Buffer Transfer in Block Transfer Mode


Figure 3-7. Block Mode Transfers


Demand Transfer Mode


The Demand Transfer Mode provides the most
flexible handshaking procedures during the DMA
process. A Demand Transfer is initiated by a DMA
request. The process continues until the Byte Count
expires, or an external EOP# is encountered. If the
device being serviced (Requester) desires, it can
interrupt the DMA process by de-activating the
DREQn line. Action is taken on the condition of
DREQn during Requester accesses only. The access
during which DREQn is sampled inactive is the
last Requester access which will be performed
during the current transfer. Figure 3-8 shows the flow
of events during the transfer of a buffer in the Demand
Mode.


[Flowchart: transfer data until DREQn de-activated or
EOP or TC -> end of buffer]


Figure 3-8. Buffer Transfer in Demand Transfer Mode


When the DREQn line goes inactive, the DMA
Controller will complete the current transfer, including
any necessary accesses to the Target, and
relinquish control of the bus to the host. The current
process information is saved (byte count, Requester
and Target addresses, and Temporary Register).


The Requester can restart the transfer process by
reasserting DREQn. The 82370 will arbitrate the
request with other pending requests and begin the
process where it left off. Figure 3-9 shows the timing
of handshake signals during Demand Transfer Mode
operation.


Figure 3-9. Demand Mode Transfers


Using the Demand Transfer Mode allows peripherals
to access memory in small, irregular bursts without
wasting bus control time. The 82370 is designed to
give the best possible bus control latency in the
Demand Transfer Mode. Bus control latency is defined
here as the time from the last active bus cycle of the
previous bus master to the first active bus cycle of
the new bus master. The 82370 DMA Controller will
perform its first bus access cycle two bus states
after HLDA goes active. In the typical configuration,
bus control is returned to the host one bus state
after the DREQn goes inactive.


There are two cases where there may be more than 
one bus state of bus control latency at the end of a 
transfer. The first is at the end of an Auto-Initialize 
process, and the second is at the end of a process 
where the source is the Requester and Two-Cycle
transfers are used. 


When a Buffer Auto-Initialize Process is complete,
the 82370 requires seven bus states to reload the 
Current Registers from the Base Registers of the 
Auto-Initialized channel. The reloading is done while 
the 82370 is still the bus master so that it is prepared 
to service the channel immediately after relinquish- 
ing the bus, if necessary. 


In the case where the Requester is the source, and
Two-Cycle transfers are being used, there are two 
extra idle states at the end of the transfer process. 
This occurs due to the housekeeping in the DMA’s 
internal pipeline. These two idle states are present 
only after the very last Requester access, before the 
DMA Controller de-activates the HOLD signal. 


3.3.4 CHANNEL PRIORITY ARBITRATION 


DMA channel priority can be programmed into one 
of two arbitration methods: Fixed or Rotating. The 
four lower DMA channels and the four upper DMA 
channels operate as if they were two separate DMA 
controllers operating in cascade. The lower group of 
four channels (0-3) is always prioritized between
channels 7 and 4 of the upper group of channels
(4-7). Figure 3-10 shows a pictorial representation of
the priority grouping. 


The priority can thus be set up as rotating for one 
group of channels and fixed for the other, or any 
other combination. While in Fixed Priority, the pro- 
grammer can also specify which channel has the 
lowest priority. 


Figure 3-10. DMA Priority Grouping


The 82370 DMA Controller defaults to Fixed Priority.
Channel 0 has the highest priority, then 1, 2, 3, 4, 5,
6, 7. Channel 7 has the lowest priority. Any time the
DMA Controller arbitrates DMA requests, the
requesting channel with the highest priority will be
serviced next.


Fixed Priority can be entered into at any time by a 
software command. The priority levels in effect after 
the mode switch are determined by the current set- 
ting of the Programmable Priority. 


Programmable Priority is available for fixing the prior- 
ity of the DMA channels within a group to levels oth- 
er than the default. Through a software command, 
the channel to have the lowest priority in a group 
can be specified. Each of the two groups of four 
channels can have the priority fixed in this way. The 
other channels in the group will follow the natural 
Fixed Priority sequence. This mode affects only the 
priority levels while operating with Fixed Priority. 


For example, if channel 2 is programmed to have the
lowest priority in its group, channel 3 has the highest
priority. In descending order, the other channels
would have the following priority: (3,0,1,2),4,5,6,7
(channel 2 lowest, channel 3 highest). If the upper
group were programmed to have channel 5 as the
lowest priority channel, the priority would be (again,
highest to lowest): 6,7,(3,0,1,2),4,5. Figure 3-11
shows this example pictorially. The lower group is
always prioritized as a fifth channel of the upper
group (between channels 4 and 7).


Figure 3-11. Example of Programmed Priority
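

The ordering rule in this example can be checked with a short Python sketch. It treats each group as a fixed cyclic sequence that is rotated until the programmed channel is last, with the lower group occupying a slot between channels 7 and 4 of the upper group. The function names and the `LOWER` placeholder are hypothetical.

```python
LOWER = "lower"  # placeholder for the lower group inside the upper group


def rotate_to_lowest(cycle, lowest):
    """Rotate `cycle` so that `lowest` comes last (lowest priority)."""
    i = cycle.index(lowest)
    return cycle[i + 1:] + cycle[:i + 1]


def fixed_priority(lower_lowest=3, upper_lowest=7):
    """Return all eight channels from highest to lowest priority."""
    if lower_lowest not in range(0, 4) or upper_lowest not in range(4, 8):
        raise ValueError("lowest-priority channel must belong to its group")
    lower = rotate_to_lowest([0, 1, 2, 3], lower_lowest)
    # The lower group sits between channels 7 and 4 of the upper group.
    upper = rotate_to_lowest([LOWER, 4, 5, 6, 7], upper_lowest)
    order = []
    for entry in upper:
        order.extend(lower if entry == LOWER else [entry])
    return order


print(fixed_priority())        # default order
print(fixed_priority(2, 7))    # channel 2 lowest in the lower group
print(fixed_priority(2, 5))    # and channel 5 lowest in the upper group
```

With the defaults the function returns channels 0 through 7 in order, and the two programmed cases reproduce the sequences quoted in the text.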


The DMA Controller will only accept Programmable
Priority commands while the addressed group is
operating in Fixed Priority. Switching from Fixed to
Rotating Priority preserves the current priority levels.
Switching from Rotating to Fixed Priority returns the
priority levels to those which were last programmed
by use of Programmable Priority.


Rotating Priority allows the devices using DMA to 
share the system bus more evenly. An individual 
channel does not retain highest priority after being 
serviced; priority is passed to the next highest
priority channel in the group. The channel which was
most recently serviced inherits the lowest priority. 
This rotation occurs each time a channel is serviced. 
Figure 3-12 shows the sequence of events as priori- 
ty is passed between channels. Note that the lower 
group rotates within the upper group, and that serv- 
icing a channel within the lower group causes rota- 
tion within the group as well as rotation of the upper 
group. 


Priority order is listed from high (left) to low (right);
the lower group is shown in parentheses.

DREQ2 and DREQ6: process channel 2
    4 5 6 7 (3 0 1 2)    channel 2 drops to lowest
                         priority within group. Lower
                         group drops to lowest priority
                         within upper group. (Double
                         Rotation)

DREQ6 (still) and DREQ7: process channel 6
    7 (3 0 1 2) 4 5 6    channel 6 drops to lowest
                         priority within group

DREQ7 (still) and DREQ0: process channel 7
    (3 0 1 2) 4 5 6 7    channel 7 drops to lowest
                         priority within group

DREQ0 (still) and DREQ1: process channel 0
    4 5 6 7 (1 2 3 0)    channel 0 drops to lowest
                         priority within group.
                         (Double Rotation)

DREQ1 (still): process channel 1
    4 5 6 7 (2 3 0 1)    channel 1 drops to lowest
                         priority within group


Figure 3-12. Rotating Channel Priority.
Lower and upper groups are programmed
for the Rotating Priority Mode.
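

The sequence in Figure 3-12 can be reproduced with a short simulation. This is a sketch under stated assumptions: both groups start from the default order, the serviced channel drops to the bottom of its group, and servicing a lower-group channel also drops the lower group to the bottom of the upper group. The class name and the `LOWER` placeholder are hypothetical.

```python
LOWER = "lower"  # stands for the lower group (channels 0-3) as one entry


class RotatingArbiter:
    """Both groups in Rotating Priority, starting from the default order."""

    def __init__(self):
        self.lower = [0, 1, 2, 3]             # highest to lowest
        self.upper = [LOWER, 4, 5, 6, 7]      # highest to lowest

    def order(self):
        """All eight channels, highest to lowest priority."""
        result = []
        for entry in self.upper:
            result.extend(self.lower if entry == LOWER else [entry])
        return result

    @staticmethod
    def _drop_to_lowest(group, entry):
        # Rotate the group so `entry` ends up last; entries that followed it
        # move to the front, which keeps the cyclic order intact.
        i = group.index(entry)
        return group[i + 1:] + group[:i + 1]

    def service(self, requests):
        """Service the highest-priority requesting channel and rotate."""
        channel = next(ch for ch in self.order() if ch in requests)
        if channel in self.lower:
            self.lower = self._drop_to_lowest(self.lower, channel)
            self.upper = self._drop_to_lowest(self.upper, LOWER)
        else:
            self.upper = self._drop_to_lowest(self.upper, channel)
        return channel


arbiter = RotatingArbiter()
for pending in ({2, 6}, {6, 7}, {7, 0}, {0, 1}, {1}):
    served = arbiter.service(pending)
    print(served, arbiter.order())
```

Each printed line gives the channel serviced and the resulting order from highest to lowest priority, one line per step of the figure.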


3.3.5 COMBINING PRIORITY MODES 


Since the DMA Controller operates as two four- 
channel controllers in cascade, the overall priority 
scheme of all eight channels can take on a variety of 
forms. There are four possible combinations of prior- 
ity modes between the two groups of channels: 
Fixed Priority only (default), Fixed Priority upper 
group/Rotating Priority lower group, Rotating Priority 
upper group/Fixed Priority lower group, and Rotating 
Priority only. Figure 3-13 illustrates the operation of 
the two combined priority methods.


[Figure: priority order, high to low, for two cases of
combined priority modes, including Case 2: 0-3
Rotating Priority, 4-7 Fixed Priority. Rows show the
default priority and the order after servicing
individual channels.]


Figure 3-13. Combining Priority Modes


3.3.6 BUS OPERATION 


Data may be transferred by the DMA Controller using
two different bus cycle operations: Fly-By
(one-cycle) and Two-Cycle. These bus handshake
methods are selectable independently for each channel
through a command register. Device data path
widths are independently programmable for both
Target and Requester. Also selectable through
software is the direction of data transfer. All of these
parameters affect the operation of the 82370 on a
bus-cycle by bus-cycle basis.


3.3.6.1 Fly-By Transfers 


The Fly-By Transfer Mode is the fastest and most 
efficient way to use the 82370 DMA Controller to 
transfer data. In this method of transfer, the data is 
written to the destination device at the same time it 
is read from the source. Only one bus cycle is used 
to accomplish the transfer.


In the Fly-By Mode, the DMA acknowledge signal is
used to select the Requester. The DMA Controller
simultaneously places the address of the Target on
the address bus. The state of M/IO# and W/R#
during the Fly-By transfer cycle indicates the type of
Target and whether the Target is being written to or
read from. The Target’s Bus Size is used as an
incrementer for the Byte Count. The Requester
address registers are ignored during Fly-By transfers.


Note that memory-to-memory transfers cannot be 
done using the Fly-By Mode. Only one memory or
I/O address is generated by the DMA Controller at a
time during Fly-By transfers. Only one of the devices 
being accessed can be selected by an address. 
Also, the Fly-By method of data transfer limits the 
hardware to accesses of devices with the same data 
bus width. The Temporary Registers are not affect- 
ed in the Fly-By Mode. 


Fly-By transfers also require that the data paths of 
the Target and Requester be directly connected. 
This requires that successive Fly-By accesses be to
word boundaries, or that the Requester be capable 
of switching its connections to the data bus. 


3.3.6.2 Two-Cycle Transfers


Two-Cycle transfers can also be performed by the
82370 DMA Controller. These transfers require at
least two bus cycles to execute. The data being
transferred is read into the DMA Controller’s
Temporary Register during the first bus cycle(s). The
second bus cycle is used to write the data from the
Temporary Register to the destination.


If the addresses of the data being transferred are
not word aligned, the 82370 will recognize the
situation and read and write the data in groups of
bytes, placing them always at the proper destination.
This process of collecting the desired bytes and
putting them together is called “byte assembly”. The
reverse process (reading from aligned locations and
writing to non-aligned locations) is called “byte
disassembly”.


The assembly/disassembly process takes place 
transparent to the software, but can only be done 
while using the Two-Cycle transfer method. The 
82370 will always perform the assembly/disassem- 
bly process as necessary for the current data trans- 
fer. Any data path widths for either the Requester or 
Target can be used in the Two-Cycle Mode. This is 
very convenient for interfacing existing 8- and 16-bit 
peripherals to the 80376’s 16-bit bus. 


The 82370 DMA Controller always reads and writes
data within the word boundaries; i.e. if a word to be
read is crossing a word boundary, the DMA Controller
will perform two read operations, each reading
one byte, to read the 16-bit word into the Temporary
Register. Also, the 82370 DMA Controller always
attempts to fill the Temporary Register from the
source before writing any data to the destination. If
the process is terminated before the Temporary
Register is filled (TC or EOP#), the 82370 will write
the partial data to the destination. If a process is
temporarily suspended (such as when DREQn is
de-activated during a demand transfer), the contents
of a partially filled Temporary Register will be stored
within the 82370 until the process is restarted.


For example, if the source is specified as an 8-bit
device and the destination as a 32-bit device, there
will be four reads from the 8-bit source
to fill the Temporary Register. Then the 82370 will
write the 32-bit contents to the destination in two
cycles of 16 bits each. This cycle will repeat until the
process is terminated or suspended.
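

The cycle count in this example can be reproduced with a small
Python sketch. It assumes aligned addresses and assumes that the
32-bit Temporary Register is filled completely for every
combination of widths, as in the example; the function and constant
names are hypothetical.

```python
BUS_BYTES = 2        # the 82370 external data bus is 16 bits wide
TEMP_REG_BYTES = 4   # the Temporary Register holds 32 bits


def cycles_per_fill(source_bits, destination_bits):
    """Return (reads, writes) needed to move one full Temporary Register."""
    for width in (source_bits, destination_bits):
        if width not in (8, 16, 32):
            raise ValueError("data path width must be 8, 16 or 32 bits")
    # A 32-bit device is still accessed 16 bits at a time on this bus.
    read_bytes = min(source_bits // 8, BUS_BYTES)
    write_bytes = min(destination_bits // 8, BUS_BYTES)
    return TEMP_REG_BYTES // read_bytes, TEMP_REG_BYTES // write_bytes


print(cycles_per_fill(8, 32))   # (4, 2)
```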


With Two-Cycle transfers, the devices that the
82370 accesses can reside at any address within
I/O or memory space. The device must be able to
decode the byte-enables (BLE#, BHE#). Also, if the
device cannot accept data in byte quantities, the
programmer must take care not to allow the DMA
Controller to access the device on any address other
than the device boundary.


3.3.6.3 Data Path Width and Data Transfer Rate 
Considerations 


The number of bus cycles used to transfer a single 
“word” of data is affected by whether the Two-Cycle 
or the Fly-By (Single-Cycle) transfer method is used. 


The number of bus cycles used to transfer data directly
affects the data transfer rate. Inefficient use of
bus cycles will decrease the effective data transfer
rate that can be obtained. Generally, the data transfer
rate is halved by using Two-Cycle transfers instead
of Fly-By transfers.


The choice of data path widths of both Target and
Requester affects the data transfer rate also. During
each bus cycle, the largest pieces of data possible
should be transferred.


The data path width of the devices to be accessed 
must be programmed into the DMA controller. The 
82370 defaults after reset to 8-bit-to-8-bit data trans- 
fers, but the Target and Requester can have differ- 
ent data path widths, independent of each other and 
independent of the other channels. Since this is a
software-programmable function, more discussion of
the uses of this feature is found in the section on
programming.


3.3.6.4 Read, Write and Verify Cycles


Three different bus cycle types may be used in a
data transfer. They are the Read, Write and Verify
cycles. These cycle types dictate the way in which
the 82370 operates on the data to be transferred.


A Read Cycle transfers data from the Target to the
Requester. A Write Cycle transfers data from the
Requester to the Target. In a Fly-By transfer, the
address and bus status signals indicate the access
(read or write) to the Target; the access to the
Requester is assumed to be the opposite.


The Verify Cycle is used to perform a data read only. 
No write access is indicated or assumed in a Verify 
Cycle. The Verify Cycle is useful for validating block
fill operations. An external comparator must be pro- 
vided to do any comparisons on the data read. 


3.4 Bus Arbitration and Handshaking 


Figure 3-14 shows the flow of events in the DMA
request arbitration process. The arbitration sequence
starts when the Requester asserts a DREQn
(or DMA service is requested by software). Figure
3-15 shows the timing of the sequence of events
following a DMA request. This sequence is executed
for each channel that is activated. The DREQn signal
can be replaced by a software DMA channel request
with no change in the sequence.


After the Requester asserts the service request, the 
82370 will request control of the bus via the HOLD 
signal. The 82370 will always assert the HOLD sig- 
nal one bus state after the service request is assert- 
ed. The 80376 responds by asserting the HLDA sig- 
nal, thus releasing control of the bus to the 82370 
DMA Controller.


Priority of pending DMA service requests is arbitrat- 
ed during the first state after HLDA is asserted by 
the 80376. The next state will be the beginning of 
the first transfer access of the highest priority pro- 
cess. 


When the 82370 DMA Controller is finished with its
current bus activity, it returns control of the bus to
the host processor. This is done by driving the
HOLD signal inactive. The 82370 does not drive any
address or data bus signals after HOLD goes low. It
enters the Slave Mode until another DMA process is
requested. The processor acknowledges that it has
regained control of the bus by forcing the HLDA signal
inactive. Note that the 82370’s DMA Controller
will not re-request control of the bus until the entire
HOLD/HLDA handshake sequence is complete.
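

The HOLD/HLDA handshake described above can be summarized as a
small state machine. The following Python sketch is an
illustrative model only: the state names and the `next_state`
function are hypothetical, and each call stands for one bus state
without modelling exact clock timing.

```python
from enum import Enum, auto


class BusState(Enum):
    HOST_OWNS_BUS = auto()   # HOLD and HLDA both inactive
    HOLD_ASSERTED = auto()   # 82370 drives HOLD, waits for HLDA
    ARBITRATING = auto()     # first state after HLDA: pick a channel
    TRANSFERRING = auto()    # highest priority channel is serviced
    HOLD_RELEASED = auto()   # HOLD dropped, waits for HLDA to drop


def next_state(state, request_pending, hlda, transfer_done):
    """Advance the model by one bus state."""
    if state is BusState.HOST_OWNS_BUS:
        return BusState.HOLD_ASSERTED if request_pending else state
    if state is BusState.HOLD_ASSERTED:
        return BusState.ARBITRATING if hlda else state
    if state is BusState.ARBITRATING:
        return BusState.TRANSFERRING
    if state is BusState.TRANSFERRING:
        return BusState.HOLD_RELEASED if transfer_done else state
    # HOLD_RELEASED: no new request is made until HLDA goes inactive
    return state if hlda else BusState.HOST_OWNS_BUS
```

The last branch reflects the rule that the controller does not
re-request the bus until the handshake has completed, even when
another request is already pending.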


WAIT FOR DREQn OR SOFTWARE REQUEST

REQUESTER ASSERTS DREQn

82370 ASSERTS HOLD REQUEST

80376 ASSERTS HOLD ACKNOWLEDGE

82370 ARBITRATES PENDING REQUESTS

82370 PERFORMS HIGHEST PRIORITY
TRANSFER (SEE DATA TRANSFER MODES)

82370 DE-ASSERTS HOLD REQUEST


Figure 3-14. Bus Arbitration and DMA Sequence


The 82370 DMA Controller will terminate a current
DMA process for one of three reasons: expired byte
count, end-of-process command (EOP# activated)
from a peripheral, or deactivated DMA request signal.
In each case, the controller will de-assert HOLD
immediately after completing the data transfer in
progress. These three methods of process termination
are illustrated in Figures 3-16, 3-19 and 3-18,
respectively.


An expired byte count indicates that the current pro- 
cess is complete as programmed and the channel 
has no further transfers to process. The channel 
must be restarted according to the currently pro- 
grammed Buffer Transfer Mode, or reprogrammed 
completely, including a new Buffer Transfer Mode.


Figure 3-15. Beginning of a DMA Process


If the peripheral activates the EOP# signal, it is indicating
that it will not accept or deliver any more data
for the current buffer. The 82370 DMA Controller
considers this as a completion of the channel’s current
process and interprets the condition the same
way as if the byte count expired.

The action taken by the 82370 DMA Controller in
response to a de-activated DREQn signal depends
on the Data Transfer Mode of the channel. In the
Demand Mode, data transfers will take place as long
as DREQn is active and the byte count has not
expired. In the Block Mode, the controller will
complete the entire block transfer without relinquishing
the bus, even if DREQn goes inactive before the
transfer is complete. In the Single Mode, the controller
will execute single data transfers, relinquishing
the bus between each transfer, as long as DREQn is
active.
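

The three Data Transfer Modes can be compared with a short Python
sketch. It is a behavioural summary of the paragraph above, not
the controller logic; the function name and the mode strings are
hypothetical, and refresh requests and EOP# are not modelled.

```python
def continues_without_releasing_bus(mode, dreq_active, bytes_remaining):
    """Return True if the channel starts another transfer while
    still holding the bus, False if it relinquishes the bus first."""
    if bytes_remaining == 0:
        return False              # byte count expired: process ends
    if mode == "demand":
        return dreq_active        # runs only while DREQn stays active
    if mode == "block":
        return True               # DREQn is ignored until the block ends
    if mode == "single":
        return False              # bus is released between transfers
    raise ValueError("unknown Data Transfer Mode: " + mode)
```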


Normal termination of a DMA process due to expiration
of the byte count (Terminal Count, TC) is
shown in Figure 3-16. The condition of DREQn is
ignored until after the process is terminated. If the
channel is programmed to auto-initialize, HOLD will
be held active for an additional seven clock cycles
while the auto-initialization takes place.


Table 3-3 shows the DMA channel activity due to
EOP# or Byte Count expiring (Terminal Count).


Table 3-3. DMA Channel Activity Due to Terminal Count or External EOP#

Buffer Process       Single or               Auto-Initialize    Chaining-Base
                     Chaining-Base Empty                        Loaded

EVENT
Terminal Count       True
EOP#                 X

RESULTS
Current Registers
Channel Mask
EOP# Output
Terminal Count Status
Software Request


Figure 3-16. Termination of a DMA Process Due to Expiration of Current Byte Count


Figure 3-17. Switching between Active DMA Channels


The 82370 always relinquishes control of the bus 
between channel services. This allows the hardware 
designer the flexibility to externally arbitrate bus hold 
requests, if desired. If another DMA request is pend- 
ing when a higher priority channel service is com- 
pleted, the 82370 will relinquish the bus until the 
hold acknowledge is inactive. One bus state after 
the HLDA signal goes inactive, the 82370 will assert 
HOLD again. This is illustrated in Figure 3-17. 


3.4.1 SYNCHRONOUS AND ASYNCHRONOUS 
SAMPLING OF DREQn AND EOP# 


As an indicator that a DMA service is to be started,
DREQn is always sampled asynchronously. It is sampled
at the beginning of a bus state and acted upon
at the end of the state. Figure 3-15 illustrates the
start of a DMA process due to a DREQn input.


The DREQn and EOP# inputs can be programmed
to be sampled either synchronously or asynchronously
to signal the end of a transfer.


The synchronous mode affords the Requester one 
bus state of extra time to react to an access. This 
means the Requester can terminate a process on 
the current access, without losing any data. The 
asynchronous mode requires that the input signal be 
presented prior to the beginning of the last state of 
the Requester access.


The timing relationships of the DREQn and EOP#
signals to the termination of a DMA transfer are
shown in Figures 3-18 and 3-19. Figure 3-18 shows
the termination of a DMA transfer due to inactive
DREQn. Figure 3-19 shows the termination of a
DMA process due to an active EOP# input.


In the Synchronous Mode, DREQn and EOP# are 
sampled at the end of the last state of every Re- 
quester data transfer cycle. If EOP# is active or 
DREQn is inactive at this time, the 82370 recognizes 
this access to the Requester as the last transfer. At 
this point, the 82370 completes the transfer in prog- 
ress, if necessary, and returns bus control to the 
host. 


Figure 3-18. Termination of a DMA Process due to De-Asserting DREQn


In the Asynchronous Mode, the inputs are sampled
at the beginning of every state of a Requester access.
The 82370 waits until the end of the state to
act on the input.


DREQn and EOP# are sampled at the latest possi- 
ble time when the 82370 can determine if another 
transfer is required. In the Synchronous Mode, 
DREQn and EOP# are sampled on the trailing edge 
of the last bus state before another data access cy- 
cle begins. The Asynchronous Mode requires that 
the signals be valid one clock cycle earlier.


Figure 3-19. Termination of a DMA Process due to an External EOP#


While in the Pipeline Mode, if the NA# signal is sampled
active during a transfer, the end of the state
where NA# was sampled active is when the 82370
decides whether to commit to another transfer. The
device must de-assert DREQn or assert EOP# before
NA# is asserted; otherwise the 82370 will commit
to another, possibly undesired, transfer.


Synchronous DREQn and EOP# sampling allows 
the peripheral to prevent the next transfer from oc- 
curring by de-activating DREQn or asserting EOP # 
during the current Requester access, before the 
82370 DMA Controller commits itself to another
transfer. The DMA Controller will not perform the 
next transfer if it has not already begun the bus cy- 
cle. Asynchronous sampling allows less stringent 
timing requirements than the Synchronous Mode, 
but requires that the DREQn signal be valid at the 
beginning of the next to last bus state of the current 
Requester access. 


Using the Asynchronous Mode with zero wait states
can be very difficult. Since the addresses and control
signals are driven by the 82370 near half-way
through the first bus state of a transfer, and the
Asynchronous Mode requires that DREQn be inactive
before the end of the state, the peripheral being
accessed is required to present DREQn only a few
nanoseconds after the control information is available.
This means that the peripheral’s control logic
must be extremely fast (practically non-causal). An
alternative is the Synchronous Mode.


3.4.2 ARBITRATION OF CASCADED MASTER
REQUESTS


The Cascade Mode allows another DMA-type de- 
vice to share the bus by arbitrating its bus accesses 
with the 82370’s. Seven of the eight DMA channels 
(0-3 and 5-7) can be connected to a cascaded device.
The cascaded device requests bus control
through the DREQn line of the channel which is pro- 
grammed to operate in Cascade Mode. Bus hold ac- 
knowledge is signalled to the cascaded device 
through the EDACK lines. When the EDACK lines 
are active with the code for the requested cascade 
channel, the bus is available to the cascaded master 
device. 


A cascade cycle begins the same way a regular 
DMA cycle begins. The requesting bus master asserts
the DREQn line on the 82370. This bus control
request is arbitrated as any other DMA request 
would be. If any channel receives a DMA request, 
the 82370 requests control of the bus. When the 
host acknowledges that it has released bus control, 
the 82370 acknowledges to the requesting master 
that it may access the bus. The 82370 enters an idle 
state until the new master relinquishes control. 


A cascade cycle will be terminated by one of two
events: DREQn going inactive, or HLDA going inactive.
The normal way to terminate the cascade cycle
is for the cascaded master to drop the DREQn signal.
Figure 3-21 shows the two cascade cycle termination
sequences.


Figure 3-20. Cascaded Bus Master


The Refresh Controller may interrupt the cascaded 
master to perform a refresh cycle. If this occurs, the 
82370 DMA Controller will de-assert the EDACK sig- 
nal (hold acknowledge to cascaded master) and wait 
for the cascaded master to remove its hold request. 
When the 82370 regains bus control, it will perform 
the refresh cycle in its normal fashion. After the re- 
fresh cycle has been completed, and if the cascad- 
ed device has re-asserted its request, the 82370 will 
return control to the cascaded master which was in- 
terrupted. 


The 82370 assumes that it is the only device monitoring
the HLDA signal. If the system designer
wishes to place other devices on the bus as bus
masters, the HLDA from the processor must be
intercepted before presenting it to the 82370. Using
the Cascade capability of the 82370 DMA Controller
offers a much better solution.


3.4.3 ARBITRATION OF REFRESH REQUESTS 


The arbitration of refresh requests by the DRAM
Refresh Controller is slightly different from normal DMA
channel request arbitration. The 82370 DRAM Refresh
Controller always has the highest priority of
any DMA process. It also can interrupt a process in
progress. Two types of processes in progress may
be encountered: normal DMA, and bus master cascade.


In the event of a refresh request during a normal 
DMA process, the DMA Controller will complete the 
data transfer in progress and then execute the re- 
fresh cycle before continuing with the current DMA 
process. The priority of the interrupted process is 
not lost. If the data transfer cycle interrupted by the 
Refresh Controller is the last of a DMA process, the 
refresh cycle will always be executed before control 
of the bus is transferred back to the host. 


When the Refresh Controller request occurs during
a cascade cycle, the Refresh Controller must be assured
that the cascaded master device has relinquished
control of the bus before it can execute the
refresh cycle. To do this, the DMA Controller drops
the EDACK signal to the cascaded master and waits
for the corresponding DREQn input to go inactive.
By dropping the DREQn signal, the cascaded master
relinquishes the bus. The Refresh Controller then
performs the refresh cycle. Control of the bus is
returned to the cascaded master if DREQn returns to
an active state before the end of the refresh cycle;
otherwise control is passed to the processor and the
cascaded master loses its priority.


Cascade cycle termination by DREQn inactive
(signals shown: DREQn, EDACK, HOLD)


Figure 3-21. Cascade Cycle Termination


3.5 DMA Controller Register Overview


The 82370 DMA Controller contains 44 registers
which are accessible to the host processor. Twenty-four
of these registers contain the device addresses
and data counts for the individual DMA
channels (three per channel). The remaining registers
are control and status registers for initiating and
monitoring the operation of the 82370 DMA Controller.
Table 3-4 lists the DMA Controller’s registers
and their accessibility.


Table 3-4. DMA Controller Registers


Register Name                  Access


Control/Status Registers (one each per group)

Command Register I             write only
Command Register II            write only
Mode Register I                write only
Mode Register II               write only
Software Request Register      read/write
Mask Set-Reset Register        write only
Mask Read-Write Register       read/write
Status Register                read only
Bus Size Register              write only
Chaining Register              read/write

Channel Registers (one each per channel)

Base Target Address            write only
Current Target Address         read only
Base Requester Address         write only
Current Requester Address      read only
Base Byte Count                write only
Current Byte Count             read only


3.5.1 CONTROL/STATUS REGISTERS 


The following registers are available to the host
processor for programming the 82370 DMA Controller
into its various modes and for checking the operating
status of the DMA processes. Each set of four
DMA channels has one of each of these registers
associated with it.


Command Register I


Enables or disables the DMA channels as a group.
Sets the Priority Mode (Fixed or Rotating) of the
group. This write-only register is cleared by a hardware
reset, defaulting to all channels enabled and
Fixed Priority Mode.


Command Register II


Sets the sampling mode of the DREQn and EOP#
inputs. Also sets the lowest priority channel for the
group in the Fixed Priority Mode. The functions
programmed through Command Register II default after
a hardware reset to: asynchronous DREQn and
EOP#, and channels 3 and 7 lowest priority.


Mode Register I


Mode Register I is identical in function to the Mode
register of the 8237A. It programs the following
functions for an individually selected channel:


Type of Transfer: read, write, verify
Auto-Initialize: enable or disable
Target Address Count: increment or decrement
Data Transfer Mode: demand, single, block,
cascade


Mode Register I functions default to the following
after reset: verify transfer, Auto-Initialize disabled,
Increment Target address, Demand Mode.


Mode Register II


Programs the following functions for an individually
selected channel:


Target Address Hold: enable or disable
Requester Address Count: increment or decrement
Requester Address Hold: enable or disable
Target Device Type: I/O or Memory
Requester Device Type: I/O or Memory
Transfer Cycles: Two-Cycle or Fly-By


Mode Register II functions are defined as follows
after a hardware reset: Disable Target Address Hold,
Increment Requester Address, Target (and Requester)
in memory, Fly-By Transfer Cycles. Note:
Requester Device Type is ignored in Fly-By Transfers.


Software Request Register 


The DMA Controller can respond to service requests 
which are initiated by software. Each channel has an 
internal request status bit associated with it. The 
host processor can write to this register to set or 
reset the request bit of a selected channel. 


The status of a group’s software DMA service re- 
quests can be read from this register as well. Each 
status bit is cleared upon Terminal Count or external 
EOP#. 


The software DMA requests are non-maskable and
subject to priority arbitration with all other software 
and hardware requests. The entire register is 
cleared by a hardware reset. 


Mask Registers 


Each channel has associated with it a mask bit
which can be set/reset to disable/enable that channel.
Two methods are available for setting and clearing
the mask bits. The Mask Set/Reset Register is a
write-only register which allows the host to select an
individual channel and either set or reset the mask
bit for that channel only. The Mask Read/Write Register
is available for reading the mask bit status and
for writing mask bits in groups of four.


The mask bits of a group may be cleared in one step 
by executing the Clear Mask Command. See the 
DMA Programming section for details. A hardware 
reset sets all of the channel mask bits, disabling all 
channels. 


Status Register 


The Status register is a read-only register which con- 
tains the Terminal Count (TC) and Service Request 
status for a group. Four bits indicate the TC status 
and four bits indicate the hardware request status 
for the four channels in the group. The TC bits are 
set when the Byte Count expires, or when an external
EOP# is asserted. These bits are cleared by
reading from the Status Register. The Service Re- 
quest bit for a channel indicates when there is a 
hardware DMA request (DREQn) asserted for that 
channel. When the request has been removed, the 
bit is cleared. 
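

The read-to-clear behaviour of the TC bits can be modelled in a few
lines of Python. The class and method names are hypothetical, and
the bit positions (TC status in bits 0-3, hardware request status
in bits 4-7) are an assumption borrowed from the 8237A layout,
since the bit assignment is not given in this section.

```python
class GroupStatusRegister:
    """Model of the Status Register for one group of four channels."""

    def __init__(self):
        self.tc_bits = 0        # one bit per channel, set on TC or EOP#
        self.request_bits = 0   # one bit per channel, follows DREQn

    def buffer_expired(self, channel):
        """Record a Terminal Count or external EOP# for a channel (0-3)."""
        self.tc_bits |= 1 << channel

    def set_dreq(self, channel, asserted):
        """Track the hardware request line of a channel (0-3)."""
        if asserted:
            self.request_bits |= 1 << channel
        else:
            self.request_bits &= ~(1 << channel)

    def read(self):
        # Assumed layout: requests in bits 4-7, TC status in bits 0-3.
        value = (self.request_bits << 4) | self.tc_bits
        self.tc_bits = 0        # reading clears the TC bits only
        return value
```

A second read directly after the first therefore still shows an
active DREQn, but no longer shows the Terminal Count.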


Bus Size Register 


This write-only register is used to define the bus size 
of the Target and Requester of a selected channel. 
The bus sizes programmed will be used to dictate 
the sizes of the data paths accessed when the DMA 
channel is active. The values programmed into this 
register affect the operation of the Temporary Regis- 
ter. When 32-bit bus width is programmed, the 
82370 DMA Controller will access the device twice 
through its 16-bit external Data Bus to perform a 
32-bit data transfer. Any byte-assembly required to 
make the transfers using the specified data path 
widths will be done in the Temporary Register. The 
Bus Size register of the Target is used as an incre- 
ment/decrement value for the Byte Counter and 


Target Address when in the Fly-By Mode. Upon reset,
all channels default to 8-bit Targets and 8-bit
Requesters.


Chaining Register 


As a command or write register, the Chaining regis- 
ter is used to enable or disable the Chaining Mode 
for a selected channel. Chaining can either be dis- 
abled or enabled for an individual channel, indepen- 
dently of the Chaining Mode status of other chan- 
nels. After a hardware reset, all channels default to 
Chaining disabled. 


When read by the host, the Chaining Register provides
the status of the Chaining Interrupt of each of
the channels. These interrupt status bits are cleared
when the new buffer information has been loaded.


3.5.2 CHANNEL REGISTERS


Each channel has three individually programmable
registers necessary for the DMA process; they are
the Base Byte Count, Base Target Address, and
Base Requester Address registers. The 24-bit Base
Byte Count register contains the number of bytes to
be transferred by the channel. The 24-bit Base Target
Address register contains the beginning address
(memory or I/O) of the Target device. The
24-bit Base Requester Address register contains the
base address (memory or I/O) of the device which is
to request DMA service.


Three more registers for each DMA channel exist
within the DMA Controller which are directly related
to the registers mentioned above. These registers
contain the current status of the DMA process. They
are the Current Byte Count register, the Current Target
Address, and the Current Requester Address. It
is these registers which are manipulated (incremented,
decremented, or held constant) by the 82370
DMA Controller during the DMA process. The Current
registers are loaded from the Base registers at
the beginning of a DMA process.


The Base registers are loaded when the host proc- 
essor writes to the respective channel register ad- 
dresses. Depending on the mode in which the chan- 
nel is operating, the Current registers are typically 
loaded in the same operation. Reading from the 
channel register addresses yields the contents of 
the corresponding Current register. 


To maintain compatibility with software which accesses
an 8237A, a Byte Pointer Flip-Flop is used to
control access to the upper and lower bytes of some 
words of the Channel Registers. These words are 
accessed as byte pairs at single port addresses. The 
Byte Pointer Flip-Flop acts as a one-bit pointer 
which is toggled each time a qualifying Channel 
Register byte is accessed. 


It always points to the next logical byte to be ac- 
cessed of a pair of bytes. 


The Channel registers are arranged as pairs of
words, each pair with its own port address. Addressing
the port with the Byte Pointer Flip-Flop reset accesses
the least significant byte of the pair. The
most significant byte is accessed when the Byte
Pointer is set.


For compatibility with existing 8237A designs, there 
is one exception to the above statements about the 
Byte Pointer Flip-Flop. The third byte (bits 16-23) of 
the Target Address is accessed through its own port 
address. The Byte Pointer Flip-Flop is not affected 
by any accesses to this byte.


The upper eight bits of the Byte Count Register are
cleared when the least significant byte of the register
is loaded. This provides compatibility with software
which accesses an 8237A. The 8237A has
16-bit Byte Count Registers.
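

The Byte Pointer Flip-Flop behaviour can be modelled with a short
Python sketch. The class and method names are hypothetical. The
model gives each register its own flip-flop for simplicity, and it
assumes that bits 16-23 of the Byte Count have a separate port in
the same way as the third byte of the Target Address; neither
detail is stated in this section.

```python
class ChannelRegister24:
    """24-bit channel register written one byte at a time."""

    def __init__(self, clear_upper_on_low_write=False):
        self.value = 0
        self.byte_pointer = 0   # 0: next access is low byte, 1: high byte
        self.clear_upper_on_low_write = clear_upper_on_low_write

    def write_paired_port(self, byte):
        """Write through the port shared by the low and high bytes."""
        byte &= 0xFF
        if self.byte_pointer == 0:
            if self.clear_upper_on_low_write:
                self.value &= 0x00FFFF          # clear bits 16-23
            self.value = (self.value & 0xFFFF00) | byte
        else:
            self.value = (self.value & 0xFF00FF) | (byte << 8)
        self.byte_pointer ^= 1                  # toggle after each access

    def write_third_byte_port(self, byte):
        """Write bits 16-23; the Byte Pointer Flip-Flop is not affected."""
        self.value = (self.value & 0x00FFFF) | ((byte & 0xFF) << 16)
```

Creating the register with `clear_upper_on_low_write=True` gives the
Byte Count behaviour: 8237A-style software that writes only two
bytes always ends up with a 16-bit count.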


NOTE:
The 82370 is a subset of the Intel 82380 32-bit
DMA Controller with Integrated System Peripherals.


Although the 82370 has 24 address bits externally,
the programming model is actually a full 32 bits wide.
For this reason, there are some “hidden” DMA registers
in the 82370 register set. These hidden registers
correspond to what would be A24-A31 in a
32-bit system.


Think of the 82370 addresses as though they were 
32 bits wide, with only the lower 24 bits available 
externally. 


This should be of concern in two areas: 


1. Understanding the Byte Pointer Flip-Flop
2. Removing the IRQ1 Chaining Interrupt


The Byte Pointer Flip-Flop will behave as though the
hidden upper address bits were accessible.


The IRQ1 Chaining Interrupt will be removed only
when the hidden upper address bits are programmed.
Note that since the hidden upper
address bits are not available externally, the value
you program into the registers is not important. The
act of programming the hidden register is critical in
removing the IRQ1 Chaining Interrupt for a DMA
channel.


The port assignments for these hidden upper ad- 
dress bits come directly from the port assignments 
of the Intel 82380. For your convenience, those port 
definitions have been included in this data sheet in 
section 3.7. 


3.5.3 TEMPORARY REGISTERS 


Each channel has a 32-bit Temporary Register used 
for temporary data storage during two-cycle DMA 
transfers. It is this register in which any necessary 
byte assembly and disassembly of non-aligned data 
is performed. Figure 3-22 shows how a block of data
will be moved between memory locations with different
boundaries. Note that the order of the data does
not change.


If the destination is the Requester and an early process
termination has been indicated by the EOP#
signal or DREQn inactive in the Demand Mode, the
Temporary Register is not affected. If data remains
in the Temporary Register due to differences in data
path widths of the Target and Requester, it will not
be transferred or otherwise lost, but will be stored for
later transfer.


Target = source = 00000020H
Requester = destination = 00000053H
Byte Count = 000007H


Figure 3-22. Transfer of data between memory
locations with different boundaries. This will be
the result, independent of data path width.


If the destination is the Target and the EOP# signal
is sensed active during the Requester access of a
transfer, the DMA Controller will complete the transfer
by sending to the Target whatever information is
in the Temporary Register at the time of process
termination. This implies that the Target could be
accessed with partial data in two accesses. For this
reason it is advisable to have an I/O device designated
as a Requester, unless it is capable of handling
partial data transfers.


3.6 DMA Controller Programming


Programming a DMA Channel to perform a needed 
DMA function is in general a four step process. First 
the global attributes of the DMA Controller are pro- 
grammed via the two Command Registers. These 
global attributes include: priority levels, channel 
group enables, priority mode, and DREQn/EOP# input
sampling.


The second step involves setting the operating 
modes of the particular channel. The Mode Regis- 
ters are used to define the type of transfer and the 
handshaking modes. The Bus Size Register and 
Chaining Register may also need to be programmed 
in this step. 


The third step in setting up the channel is to load the 
Base Registers in accordance with the needs of the 
operating modes chosen in step two. The Current 
Registers are automatically loaded from the Base 
Registers, if required by the Buffer Transfer Mode in
effect. The information loaded and the order in
which it is loaded depends on the operating mode. A 
channel used for cascading, for example, needs no 
buffer information and this step can be skipped en- 
tirely. 


The last step is to enable the newly programmed 
channel using one of the Mask Registers. The chan- 
nel is then available to perform the desired data 
transfer. The status of the channel can be observed 
at any time through the Status Register, Mask Reg- 
ister, Chaining Register, and Software Request reg- 
ister. 


Once the channel is programmed and enabled, the 
DMA process may be initiated in one of two ways, 
either by a hardware DMA request (DREQn) or a 
software request (Software Request Register). 
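

The four programming steps can be laid out as an ordered list of
register writes. The Python sketch below is illustrative only: the
`write` callback, the `settings` keys and the fixed order inside
each step are hypothetical, and no port addresses or bit encodings
are implied. A real driver would take those from the port
definitions in this data sheet.

```python
def program_channel(write, settings):
    """Issue the register writes for one channel in the four-step order.

    `write(register_name, value)` is a hypothetical callback; a real
    driver would translate the name to a port address.
    """
    # Step 1: global attributes of the DMA Controller
    write("Command Register I", settings["command_1"])
    write("Command Register II", settings["command_2"])
    # Step 2: operating modes of the channel
    write("Mode Register I", settings["mode_1"])
    write("Mode Register II", settings["mode_2"])
    write("Bus Size Register", settings["bus_size"])
    write("Chaining Register", settings["chaining"])
    # Step 3: buffer information (not needed for a cascade channel)
    if not settings.get("cascade", False):
        write("Base Target Address", settings["target_address"])
        write("Base Requester Address", settings["requester_address"])
        write("Base Byte Count", settings["byte_count"])
    # Step 4: enable the channel through one of the Mask Registers
    write("Mask Set-Reset Register", settings["unmask"])
```

Passing a callback that only records its arguments is a convenient
way to check the order of the writes before any hardware is
involved.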


Once programmed to a particular Process/Mode
configuration, the channel will operate in that configuration
until programmed otherwise. For this reason,
restarting a channel after the current buffer expires
does not require complete reprogramming of the
channel. Only those parameters which have
changed need to be reprogrammed. The Byte Count
Register is always changed and must be reprogrammed.
A Target or Requester Address Register
which is incremented or decremented should be
reprogrammed also.


3.6.1 BUFFER PROCESSES


The Buffer Process is determined by the Auto-Initialize
bit of Mode Register I and the Chaining Register.
If Auto-Initialize is enabled, Chaining should not be
used.


3.6.1.1 Single Buffer Process 


The Single Buffer Process is programmed by disabling
Chaining via the Chaining Register and programming
Mode Register I for non-Auto-Initialize.


3.6.1.2 Buffer Auto-Initialize Process 


Setting the Auto-Initialize bit in Mode Register I is all
that is necessary to place the channel in this mode.
Buffer Auto-Initialize must not be enabled simultaneously
with the Buffer Chaining Mode, as this
will have unpredictable results.


Once the Base Registers are loaded, the channel is 
ready to be enabled. The channel will reload its Cur- 
rent Registers from the Base Registers each time 
the Current Buffer expires, either by an expired Byte 
Count or an external EOP #. 
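

The difference between a Single Buffer and an Auto-Initialize
buffer can be shown with a small Python model. The class name and
the `needs_restart` flag are hypothetical, and the count is treated
as a plain number of remaining bytes, which simplifies the actual
Byte Count register encoding.

```python
class BufferModel:
    """Current/Base Byte Count behaviour at buffer expiry."""

    def __init__(self, base_count, auto_initialize):
        self.base_count = base_count
        self.current_count = base_count
        self.auto_initialize = auto_initialize
        self.needs_restart = False

    def expire(self):
        """Called on Terminal Count or an external EOP#."""
        if self.auto_initialize:
            self.current_count = self.base_count   # reload and keep going
        else:
            self.needs_restart = True              # host must restart

    def transfer_byte(self):
        self.current_count -= 1
        if self.current_count == 0:
            self.expire()
```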


[Flowchart]

INSTALL IRQ1 INTERRUPT SERVICE ROUTINE

SET THE CHANNEL TO NON-CHAINING PROCESS

LOAD BASE REGISTERS FOR FIRST BUFFER

PROGRAM THE MODE REGISTERS

SET THE CHANNEL TO CHAINING PROCESS
(IRQ1 WILL BE ACTIVATED)

ENABLE INTERRUPT
(IRQ1 WILL NEED SERVICE - LOAD BASE REGISTERS)

ENABLE THE CHANNEL

FROM THIS POINT, THE HOST CAN PERFORM
ANOTHER TASK. THE INTERRUPT SERVICE ROUTINE
LEFT BEHIND WILL MAINTAIN THE CHANNEL.


Figure 3-23. Flow of Events in the Buffer Chaining Process


3.6.1.3 Buffer Chaining Process


The Buffer Chaining Process is entered from the
Single Buffer Process. The Mode Registers should
be programmed first, with all of the Transfer Modes
defined as if the channel were to operate in the
Single Buffer Process. The channel's Base Registers
are then loaded. When the channel has been set up
in this way, and the chaining interrupt service routine
is in place, the Chaining Process can be entered by
programming the Chaining Register. Figure 3-23
illustrates the Buffer Chaining Process.


An interrupt (IRQ1) will be generated immediately af- 
ter the Chaining Process is entered, as the channel 
then perceives the Base Registers as empty and in 
need of reloading. It is important to have the inter- 
rupt service routine in place at the time the Chaining 
Process is entered into. The interrupt request is re- 
moved when the most significant byte of the Base 
Target Address is loaded.


The interrupt will occur again when the first buffer 
expires and the Current Registers are loaded from 
the Base Registers. The cycle continues until the 
Chaining Process is disabled, or the host fails to re- 
spond to IRQ1 before the Current Buffer expires. 


Exiting the Chaining Process can be done by reset- 
ting the Chaining Mode Register. If an interrupt is 
pending for the channel when the Chaining Register 
is reset, the interrupt request will be removed. The 
Chaining Process can be temporarily disabled by 
setting the channel’s Mask bit in the Mask Register. 


The interrupt service routine for IRQ1 has the
responsibility of reloading the Base Registers as
necessary. It should check the status of the channel to
determine the cause of channel expiration, etc. It
should also have access to operating system
information regarding the channel, if any exists. The
IRQ1 service routine should be capable of
determining whether the chain should be continued or
terminated and act on that information.


3.6.2 DATA TRANSFER MODES 


The Data Transfer Modes are selected via Mode
Register I. The Demand, Single, and Block Modes
are selected by bits D6 and D7. The individual
transfer type (Fly-By vs Two-Cycle, Read-Write-Verify,
and I/O vs Memory) is programmed through both of
the Mode registers.


3.6.3 CASCADED BUS MASTERS 


The Cascade Mode is set by writing ones to D7 and
D6 of Mode Register I. When a channel is
programmed to operate in the Cascade Mode, all of the
other modes associated with Mode Registers I and II
are ignored. The priority and DREQn/EOP#
definitions of the Command Registers will have the same
effect on the channel's operation as any other
mode.


3.6.4 SOFTWARE COMMANDS 


There are five port addresses which, when written
to, command certain operations to be performed by
the 82370 DMA Controller. The data written to these
locations is not of consequence; writing to the
location is all that is necessary to command the 82370 to
perform the indicated function. Following are
descriptions of the command functions.


Clear Byte Pointer Flip-Flop—Location 000CH 


Resets the Byte Pointer Flip-Flop. This command
should be performed at the beginning of any access
to the channel registers in order to be assured of
beginning at a predictable place in the register
programming sequence.


Master Clear—Location 000DH 


All DMA functions are set to their default states. This
command is the equivalent of a hardware reset to
the DMA Controller. Functions other than those in
the DMA Controller section of the 82370 are not
affected by this command.


Clear Mask Register
Channels 0-3: Location 000EH
Channels 4-7: Location 00CEH


This command simultaneously clears the Mask Bits
of all channels in the addressed group, enabling all
of the channels in the group.


Clear TC Interrupt Request—Location 001EH 


This command resets the Terminal Count Interrupt 
Request Flip-Flop. It is provided to allow the pro- 
gram which made a software DMA request to ac- 
knowledge that it has responded to the expiration of 
the requested channel(s). 
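

The command ports above can be collected into a small helper. The following is a minimal sketch in C, not a definitive driver: the port numbers come from the descriptions in this section, while `port_write_fn`, `fake_out`, `dma_command`, and `dma_clear_mask_port` are hypothetical names introduced only for illustration. On real hardware the write hook would wrap an OUT instruction; here a stand-in records the last write so the logic can be exercised off-target.

```c
#include <assert.h>
#include <stdint.h>

/* Command port addresses as listed in Section 3.6.4. */
enum dma_cmd_port {
    DMA_CLEAR_BYTE_POINTER = 0x000C,
    DMA_MASTER_CLEAR       = 0x000D,
    DMA_CLEAR_MASK_CH0_3   = 0x000E,
    DMA_CLEAR_MASK_CH4_7   = 0x00CE,
    DMA_CLEAR_TC_IRQ       = 0x001E
};

/* Hypothetical port-write hook (an OUT instruction on the target). */
typedef void (*port_write_fn)(uint16_t port, uint8_t value);

/* Stand-in for the I/O bus: remembers the most recent write. */
static uint16_t last_port;
static uint8_t  last_value;
static unsigned write_count;

static void fake_out(uint16_t port, uint8_t value)
{
    last_port = port;
    last_value = value;
    write_count++;
}

/* Issue a software command. The data byte is of no consequence;
 * the write itself triggers the function. */
static void dma_command(port_write_fn out, enum dma_cmd_port port)
{
    assert(out != 0);
    out((uint16_t)port, 0x00);
}

/* Select the Clear Mask Register port for the group holding `channel`. */
static uint16_t dma_clear_mask_port(unsigned channel)
{
    assert(channel <= 7);
    return channel < 4 ? DMA_CLEAR_MASK_CH0_3 : DMA_CLEAR_MASK_CH4_7;
}
```

A caller would typically issue `dma_command(out, DMA_CLEAR_BYTE_POINTER)` before touching any channel register, as recommended above.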


3.7 Register Definitions 


The following diagrams outline the bit definitions and
functions of the 82370 DMA Controller's Status and
Control Registers. The function and programming of
the registers is covered in the previous section on
DMA Controller Programming. An entry of "X" as a
bit value indicates "don't care."


Channel Registers (read Current, write Base)


Register Name                    Address   Bits
                                 (hex)     Accessed

Channel 0   Target Address       00        0-7
                                           8-15
                                 87        16-23
                                 10        24-31(*)
            Byte Count           01        0-7
                                           8-15
                                 11        16-23
            Requester Address    90        0-7
                                           8-15
                                 91        16-23
                                           24-31(*)

Channels 1 through 7 each have the same three registers (Target
Address, Byte Count, Requester Address), accessed in the same bit
groups as Channel 0. The Byte Pointer column, and the Address
entries for Channels 1 through 7, are not legible in this copy.


NOTE: 
(*)These bits are not available externally. You need to be aware of their existence for chaining and Byte Pointer Flip-Flop 
operations. Please see section 3.5.2 for further details. 


Command Register I (write only)


Port Addresses: Channels 0-3: 0008H
                Channels 4-7: 00C8H


D7 D6 D5 D4 D3 D2 D1 D0


GROUP MASK
0 = ENABLE CHANNELS
1 = DISABLE CHANNELS


PRIORITY
0 = FIXED PRIORITY
1 = ROTATING PRIORITY

Command Register II (write only)


Port Addresses: Channels 0-3: 001AH
                Channels 4-7: 00DAH


D7 D6 D5 D4 D3 D2 D1 D0


DREQn SAMPLING
EOP# SAMPLING
0 = ASYNCHRONOUS
1 = SYNCHRONOUS


LOW PRIORITY LEVEL SET
00 = CHANNEL 0(4) LOWEST
01 = 1(5)
10 = 2(6)
11 = 3(7)


Mode Register I (write only)


Port Addresses: Channels 0-3: 000BH
                Channels 4-7: 00CBH


D7 D6 D5 D4 D3 D2 D1 D0


CHANNEL SELECT
00 = CHANNEL 0(4)
01 = 1(5)
10 = 2(6)
11 = 3(7)


TRANSFER TYPE
00 = VERIFY
01 = WRITE
10 = READ
11 = ILLEGAL
XX IF IN CASCADE MODE


AUTO-INITIALIZE
0 = DISABLE, 1 = ENABLE


TARGET INCREMENT/DECREMENT
0 = INCREMENT TARGET
1 = DECREMENT TARGET *
X IF TARGET HOLD ENABLED


DATA TRANSFER MODE
00 = DEMAND MODE
01 = SINGLE TRANSFER MODE
10 = BLOCK MODE
11 = CASCADE MODE


*Target and Requester DECREMENT is allowed only for byte transfers.
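

The Mode Register I fields can be packed into a byte with a few shifts. The sketch below is illustrative only. Section 3.6.2 places the Data Transfer Mode in D7 and D6; the positions used for the remaining fields (Target Decrement in D5, Auto-Initialize in D4, Transfer Type in D3-D2, Channel Select in D1-D0) are an assumption taken from the order of the fields in the diagram, which matches the 8237A mode register layout. The names `mode1_encode` and `mode1_port` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum xfer_mode { MODE_DEMAND = 0, MODE_SINGLE = 1, MODE_BLOCK = 2, MODE_CASCADE = 3 };
enum xfer_type { TYPE_VERIFY = 0, TYPE_WRITE = 1, TYPE_READ = 2 }; /* 3 is illegal */

/* Build a Mode Register I byte for `channel` (0-7).
 * Bit positions below D6 are assumed, see the lead-in. */
static uint8_t mode1_encode(unsigned channel, enum xfer_type type,
                            int auto_init, int target_decrement,
                            enum xfer_mode mode)
{
    assert(channel <= 7);
    assert((unsigned)type <= 2u);           /* 11 = ILLEGAL */
    assert((unsigned)mode <= 3u);
    return (uint8_t)(((unsigned)mode << 6) |            /* D7-D6 */
                     ((target_decrement ? 1u : 0u) << 5) |
                     ((auto_init ? 1u : 0u) << 4) |
                     ((unsigned)type << 2) |
                     (channel & 3u));       /* channel within its group */
}

/* Port address of Mode Register I for the group holding `channel`. */
static uint16_t mode1_port(unsigned channel)
{
    assert(channel <= 7);
    return channel < 4 ? 0x000B : 0x00CB;
}
```

For example, channel 5 in Single Transfer Mode, write transfer, with Auto-Initialize enabled would be written to port 00CBH with channel select 01, since channel 5 is the second channel of the 4-7 group.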


Mode Register II (write only)


Port Addresses: Channels 0-3: 001BH
                Channels 4-7: 00DBH


D7 D6 D5 D4 D3 D2 D1 D0


CHANNEL SELECT
SEE MODE REGISTER I


TARGET HOLD
0 = INCREMENT/DECREMENT
1 = HOLD


REQUESTER INCREMENT/DECREMENT
0 = INCREMENT
1 = DECREMENT *
X IF REQUESTER HOLD ENABLED


REQUESTER HOLD
0 = INCREMENT/DECREMENT
1 = HOLD


TARGET DEVICE TYPE
REQUESTER DEVICE TYPE
0 = MEMORY
1 = INPUT/OUTPUT


TRANSFER CYCLES
0 = ONE-CYCLE (FLY-BY)
1 = TWO-CYCLE


*Target and Requester DECREMENT is allowed only for byte transfers.


Software Request Register (read/write)


Port Addresses: Channels 0-3: 0009H
                Channels 4-7: 00C9H


Write Format: Software DMA Service Request


CHANNEL SELECT
SEE MODE REGISTER I


REQUEST SERVICE
0 = REMOVE REQUEST
1 = ASSERT REQUEST


Read Format: Software Requests Pending


D7 D6 D5 D4 D3 D2 D1 D0


CHANNEL 0(4) REQUEST
CHANNEL 1(5) REQUEST
CHANNEL 2(6) REQUEST
CHANNEL 3(7) REQUEST
1 = REQUEST PENDING


Mask Set/Reset Register: Individual Channel Mask (write only)


Port Addresses: Channels 0-3: 000AH
                Channels 4-7: 00CAH


CHANNEL SELECT
SEE MODE REGISTER I


MASK SET BIT
0 = CLEAR MASK
1 = SET MASK


Mask Read/Write Register: Group Channel Mask (read/write)


Port Addresses: Channels 0-3: 000FH
                Channels 4-7: 00CFH


D7 D6 D5 D4 D3 D2 D1 D0


CHANNEL 0(4) MASK BIT
CHANNEL 1(5) MASK BIT
CHANNEL 2(6) MASK BIT
CHANNEL 3(7) MASK BIT
MASK BIT = 0: CHANNEL ENABLED
         = 1: CHANNEL DISABLED


Status Register: Channel Process Status (read only)


Port Addresses: Channels 0-3: 0008H
                Channels 4-7: 00C8H


D7  D6  D5  D4  D3   D2   D1   D0
R3  R2  R1  R0  TC3  TC2  TC1  TC0


TC0-TC3
CHANNEL 0(4) EXPIRED
CHANNEL 1(5) EXPIRED
CHANNEL 2(6) EXPIRED
CHANNEL 3(7) EXPIRED
1 = EXPIRED


R0-R3
CHANNEL 0(4) REQUEST
CHANNEL 1(5) REQUEST
CHANNEL 2(6) REQUEST
CHANNEL 3(7) REQUEST
1 = REQUEST PENDING
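

Decoding the Status Register is a matter of testing single bits. The following sketch assumes the layout read from the diagram above (expiration flags TC0-TC3 in D0-D3, request flags R0-R3 in D4-D7); the function names are hypothetical and the register value is passed in rather than read from a port, so the code can run anywhere.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Port address of the Status Register for the group holding `channel`. */
static uint16_t status_port(unsigned channel)
{
    assert(channel <= 7);
    return channel < 4 ? 0x0008 : 0x00C8;
}

/* True if the channel's buffer has expired (TCn = 1).
 * `ch_in_group` is 0-3: channel number modulo 4. */
static bool channel_expired(uint8_t status, unsigned ch_in_group)
{
    assert(ch_in_group < 4);
    return ((status >> ch_in_group) & 1u) != 0;
}

/* True if a request is pending on the channel (Rn = 1). */
static bool channel_request_pending(uint8_t status, unsigned ch_in_group)
{
    assert(ch_in_group < 4);
    return ((status >> (4u + ch_in_group)) & 1u) != 0;
}
```

An IRQ1 service routine could use `channel_expired` on the value read from `status_port(channel)` to find out which channel of a group caused the expiration.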


Bus Size Register: Set Data Path Width (write only)


Port Addresses: Channels 0-3: 0018H
                Channels 4-7: 00D8H


CHANNEL SELECT
SEE MODE REGISTER I


TARGET BUS SIZE


REQUESTER BUS SIZE


Bus Size Encoding:

00 = Reserved by Intel      10 = 16-bit Bus
01 = 32-bit Bus*            11 = 8-bit Bus

*If programmed as 32-bit bus width, the corresponding device will
be accessed in two 16-bit cycles provided that the data is
aligned within word boundary.


Chaining Register (read/write)


Port Addresses: Channels 0-3: 0019H
                Channels 4-7: 00D9H


WRITE FORMAT: SET CHAINING MODE


D7 D6 D5 D4 D3 D2 D1 D0


CHANNEL SELECT
SEE MODE REGISTER I


CHAINING ENABLE BIT
0 = DISABLE CHAINING MODE
1 = ENABLE CHAINING MODE


READ FORMAT: CHANNEL INTERRUPT STATUS


D7 D6 D5 D4 D3 D2 D1 D0
X  X  X  X  D3 D2 D1 D0


CHANNEL 0(4) BASE EMPTY
CHANNEL 1(5) BASE EMPTY
CHANNEL 2(6) BASE EMPTY
CHANNEL 3(7) BASE EMPTY


3.8 8237A Compatibility


The register arrangement of the 82370 DMA
Controller is a superset of the 8237A DMA Controller.
Functionally the 82370 DMA Controller is very
different from the 8237A. Most of the functions of the
8237A are performed also by the 82370. The
following discussion points out the differences between
the 8237A and the 82370.


The 8237A is limited to transfers between I/O and
memory only (except in one special case, where two
channels can be used to perform memory-to-memory
transfers). The 82370 DMA Controller can transfer
between any combination of memory and I/O.
Several other features of the 8237A are enhanced or
expanded in the 82370 and other features are
added.


The 8237A is an 8-bit only DMA device. For
programming compatibility, all of the 8-bit registers are
preserved in the 82370. The 82370 is programmed
via 8-bit registers. The address registers in the
82370 are 24-bit registers in order to support the
80376's 24-bit bus. The Byte Count Registers are
24-bit registers, allowing support of larger data
blocks than possible with the 8237A.


All of the 8237A's operating modes are supported
by the 82370 (except the cumbersome two-channel
memory-to-memory transfer). The 82370 performs
memory-to-memory transfers using only one
channel. The 82370 has the added features of buffer
pipelining (Buffer Chaining Process) and
programmable priority levels.


The 82370 also adds the feature of address
registers for both destination and source. These
addresses may be incremented, decremented, or held
constant, as required by the application of the individual
channel. This allows any combination of destination
and source device.


Each DMA channel has associated with it a Target
and a Requester. In the 8237A, the Target is the
device which can be accessed by the address
register; the Requester is the device which is accessed
by the DMA Acknowledge signals and must be an
I/O device.


4.0 PROGRAMMABLE INTERRUPT 
CONTROLLER (PIC) 


4.1 Functional Description 


The 82370 Programmable Interrupt Controller (PIC)
consists of three enhanced 82C59A Interrupt
Controllers. These three controllers together provide 15
external and 5 internal interrupt request inputs. Each
external request input can be cascaded with an
additional 82C59A slave controller. This scheme
allows the 82370 to support a maximum of 120
(15 x 8) external interrupt request inputs.


Following one or more interrupt requests, the 82370 
PIC issues an interrupt signal to the 80376. When 
the 80376 host processor responds with an interrupt 
acknowledge signal, the PIC will arbitrate between 
the pending interrupt requests and place the inter- 
rupt vector associated with the highest priority pend- 
ing request on the data bus. 


The major enhancement in the 82370 PIC over the 
82C59A is that each of the interrupt request inputs 
can be individually programmed with its own
interrupt vector, allowing more flexibility in interrupt
vector mapping.


4.1.1 INTERNAL BLOCK DIAGRAM 


The block diagram of the 82370 Programmable
Interrupt Controller is shown in Figure 4-1. Internally,
the PIC consists of three 82C59A banks: A, B and C.
The three banks are cascaded to one another: C is
cascaded to B, B is cascaded to A. The INT output
of Bank A is used externally to interrupt the 80376.


Bank A has nine interrupt request inputs (two are
unused), and Banks B and C have eight interrupt
request inputs. Of the fifteen external interrupt
request inputs, two are shared by other functions.
Specifically, the Interrupt Request 3 input (IRQ3#) can
be used as the Timer 2 output (TOUT2#). This pin
can be used in three different ways: IRQ3# input
only, TOUT2# output only, or using TOUT2# to
generate an IRQ3# interrupt request. Also, the
Interrupt Request 9 input (IRQ9#) can be used as
DMA Request 4 input (DREQ4). Typically, only
IRQ9# or DREQ4 can be used at a time.


[Block diagram: interrupt request inputs of the three banks]

Bank C:  IRQ16#, IRQ17#, IRQ18#, IRQ19#,
         IRQ20#, IRQ21#, IRQ22#, IRQ23#

Bank B:  TOUT0# (IRQ8#), DREQ4/IRQ9#, (IRQ10#),
         IRQ11#, IRQ12#, IRQ13#, IRQ14#, IRQ15#

Bank A:  TOUT3# (IRQ0#), CHAINING (IRQ1#),
         ICW2 (IRQ1.5#), (IRQ2#), TOUT2#/IRQ3#,
         SW Req TC (IRQ4#), NOT USED, NOT USED,
         DEFAULT (IRQ7#)


4.1.2 INTERRUPT CONTROLLER BANKS 


All three banks are identical, with the exception of
the IRQ1.5 on Bank A. Therefore, only one bank will
be discussed. In the 82370 PIC, every external
request input can be cascaded into, and each interrupt
controller bank behaves like a master. As compared
to the 82C59A, the enhancements in the banks are:


— All interrupt vectors are individually
  programmable. (In the 82C59A, the vectors must be
  programmed in eight consecutive interrupt vector
  locations.)


— The cascade address is provided on the Data
  Bus (D0-D7). (In the 82C59A, three dedicated
  control signals (CAS0, CAS1, CAS2) are used for
  master/slave cascading.)


BANK C -> BANK B -> BANK A -> INT (OUTPUT)


Figure 4-1. Interrupt Controller Block Diagram


The block diagram of a bank is shown in Figure 4-2.
As can be seen from this figure, the bank consists of 
six major blocks: the Interrupt Request Register 
(IRR), the In-Service Register (ISR), the Interrupt 
Mask Register (IMR), the Priority Resolver (PR), the 
Vector Registers (VR), and the Control Logic. The 
functional description of each block is included be- 
low. 


INTERRUPT REQUEST (IRR) AND 
IN-SERVICE REGISTER (ISR) 


The interrupts at the Interrupt Request (IRQ) input 
lines are handled by two registers in cascade, the 
Interrupt Request Register (IRR) and the In-Service 
Register (ISR). The IRR is used to store all interrupt 
levels which are requesting service; and the ISR is 
used to store all interrupt levels which are being 
serviced. 


PRIORITY RESOLVER (PR)


This logic block determines the priorities of the bits
set in the IRR. The highest priority is selected and
strobed into the corresponding bit of the ISR during
an Interrupt Acknowledge cycle.




INTERRUPT MASK REGISTER (IMR) 


The IMR stores the mask bits for the interrupt
lines that are to be masked (disabled). The IMR
operates on the IRR. Masking of a higher priority input
will not affect the interrupt request lines of lower
priority.


VECTOR REGISTERS (VR) 


This block contains a set of Vector Registers, one 
for each interrupt request line, to store the pre-pro- 
grammed interrupt vector number. The correspond- 
ing vector number will be driven onto the Data Bus 
of the 82370 during the Interrupt Acknowledge cy- 
cle. 


CONTROL LOGIC 


The Control Logic coordinates the overall operations
of the other internal blocks within the same bank.
This logic will drive the Interrupt Output signal (INT)
HIGH when one or more unmasked interrupt inputs
are active (LOW). The INT output signal goes
directly to the 80376 (in bank A) or to another bank to
which this bank is cascaded (see Figure 4-1). Also,
this logic will recognize an Interrupt Acknowledge
cycle (via M/IO#, D/C# and W/R# signals). During
this bus cycle, the Control Logic will enable the
corresponding Vector Register to drive the interrupt
vector onto the Data Bus.


[Block diagram: Interrupt Mask Register, Priority Resolver and
Control Logic (driving the interrupt to the host), Data (0-7),
and the individually programmable Vector Bank, which is marked
as an enhancement over the 82C59A]


Figure 4-2. Interrupt Bank Block Diagram


In bank A, the Control Logic is also responsible for
handling the special ICW2 interrupt request input
(IRQ1.5).


4.2 Interface Signals 


4.2.1 INTERRUPT INPUTS 


There are 15 external Interrupt Request inputs and 5 
internal Interrupt Requests. The external request in- 
puts are: IRQ3#, IRQ9#, IRQ11# to IRQ23#. They 
are shown in bold arrows in Figure 4-1. All IRQ in- 
puts are active LOW and they can be programmed 
(via a control bit in the Initialization Command Word 
1 (ICW1)) to be either edge-triggered or level-trig- 
gered. In order to be recognized as a valid interrupt 
request, the interrupt input must be active (LOW) un- 
til the first INTA cycle (see Bus Functional Descrip- 
tion). Note that all 15 external Interrupt Request in- 
puts have weak internal pull-up resistors. 


As mentioned earlier, an 82C59A can be cascaded
to each external interrupt input to expand the
interrupt capacity to a maximum of 120 levels. Also, two
of the interrupt inputs are dual functions: IRQ3# can
be used as Timer 2 output (TOUT2#) and IRQ9#
can be used as DREQ4 input. IRQ3# is a
bidirectional dual function pin. This interrupt request input is
wired-OR with the output of Timer 2 (TOUT2#). If
only IRQ3# function is to be used, Timer 2 should
be programmed so that OUT2 is LOW. Note that
TOUT2# can also be used to generate an interrupt
request to IRQ3# input.


The five internal interrupt requests serve special 
system functions. They are shown in Table 4-1. The 
following paragraphs describe these interrupts. 


Table 4-1. 82370 Internal Interrupt Requests


Interrupt Request    Interrupt Source

IRQ0#                Timer 3 Output (TOUT3)
IRQ8#                Timer 0 Output (TOUT0)
IRQ1#                DMA Chaining Request
IRQ4#                DMA Terminal Count
IRQ1.5#              ICW2 Written


TIMER 0 AND TIMER 3 INTERRUPT REQUESTS 


IRQ8# and IRQ0# interrupt requests are initiated
by the output of Timers 0 and 3, respectively. Each
of these requests is generated by an edge-detector
flip-flop.


The flip-flops are activated by the following
conditions:


Set   - Rising edge of timer output (TOUT);
Clear - Interrupt acknowledge for this request; OR
        Request is masked (disabled); OR
        Hardware Reset.
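

The set and clear conditions of the edge-detector flip-flop can be modelled in a few lines. This is a behavioural sketch only, not a description of the silicon: the structure and function names are hypothetical, the model is evaluated once per sample of the timer output, and giving a clear condition precedence over a simultaneous rising edge is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical software model of one timer interrupt flip-flop. */
struct timer_irq_ff {
    bool prev_tout;  /* timer output at the previous sample */
    bool request;    /* latched interrupt request */
};

/* Advance the model by one sample.
 * tout   : current level of the timer output
 * acked  : this request was acknowledged during the sample
 * masked : the request is masked (disabled)
 * reset  : hardware reset is active */
static void timer_irq_step(struct timer_irq_ff *ff, bool tout,
                           bool acked, bool masked, bool reset)
{
    assert(ff != 0);
    if (!ff->prev_tout && tout)
        ff->request = true;           /* Set: rising edge of TOUT */
    if (acked || masked || reset)
        ff->request = false;          /* Clear (assumed to win over Set) */
    ff->prev_tout = tout;
}
```

Because the request is set by an edge, a timer output that simply stays HIGH after the acknowledge does not raise a second request; the output must go LOW and rise again.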


CHAINING AND TERMINAL COUNT INTERRUPTS 


These interrupt requests are generated by the
82370 DMA Controller. The chaining request
(IRQ1#) indicates that the DMA Base Register is
not loaded. The Terminal Count request (IRQ4#)
indicates that a software DMA request was cleared.


ICW2 INTERRUPT REQUEST 


Whenever an Initialization Control Word 2 (ICW2) is 
written to a Bank, a special ICW2 interrupt request is 
generated. The interrupt will be cleared when the 
newly programmed ICW2 Register is read. This in- 
terrupt request is in Bank A at level 1.5. This inter- 
rupt request is internally ORed with the Cascaded 
Request from Bank B and is always assigned a high- 
er priority than the Cascaded Request. 


This special interrupt is provided to support
compatibility with the original 82C59A. It is described
in detail in the Programming section.


DEFAULT INTERRUPT (IRQ7#)


During an Interrupt Acknowledge cycle, if there is no
active pending request, the PIC will automatically 
generate a default vector. This vector corresponds 
to the IRQ7# vector in bank A. 


4.2.2 INTERRUPT OUTPUT (INT) 


The INT output pin is taken directly from bank A. 
This signal should be tied to the Maskable Interrupt 
Request (INTR) of the 80376. When this signal is 
active (HIGH), it indicates that one or more internal/
external interrupt requests are pending. The 80376 
is expected to respond with an interrupt acknowl- 
edge cycle. 


4.3 Bus Functional Description 


The INT output of bank A will be activated as a result 
of any unmasked interrupt request. This may be a 
non-cascaded or cascaded request. After the PIC 
has driven the INT signal HIGH, the 80376 will
respond by performing two interrupt acknowledge
cycles. The timing diagram in Figure 4-3 shows a
typical interrupt acknowledge process between the
82370 and the 80376 CPU.


[Timing diagram: CLK, M/IO#, D/C# and the other bus signals
across the previous cycle, Interrupt Acknowledge Cycle 1
(5 wait states), Idle (4 bus states), and Interrupt
Acknowledge Cycle 2 (5 wait states)]


NOTE:
What is actually driven on the Data Bus depends on whether the
current interrupt request is a Slave Request.

                     INTA Cycle 1     INTA Cycle 2
NON-SLAVE REQUEST    00H              Vector
SLAVE REQUEST        Slave Address    High Impedance*

*Slave will place a vector at this time.


Figure 4-3. Interrupt Acknowledge Cycle


After activating the INT signal, the 82370 monitors
the status lines (M/IO#, D/C#, W/R#) and waits
for the 80376 to initiate the first interrupt
acknowledge cycle. In the 80376 environment, two
successive interrupt acknowledge cycles (INTA) marked by
M/IO# = LOW, D/C# = LOW, and W/R# = LOW
are performed. During the first INTA cycle, the PIC
will determine the highest priority request. Assuming
this interrupt input has no external Slave Controller
cascaded to it, the 82370 will drive the Data Bus
with 00H in the first INTA cycle. During the second
INTA cycle, the 82370 PIC will drive the Data Bus
with the corresponding pre-programmed interrupt
vector.


If the PIC determines (from the ICW3) that this
interrupt input has an external Slave Controller cascaded
to it, it will drive the Data Bus with the specific Slave
Cascade Address (instead of 00H) during the first
INTA cycle. This Slave Cascade Address is the
pre-programmed content in the corresponding Vector
Register. This means that no Slave Address should
be chosen to be 00H. Note that the Slave Address
and Interrupt Vector are different interpretations of
the same thing. They are both the contents of the
programmable Vector Register. During the second
INTA cycle, the Data Bus will be floated so that the
external Slave Controller can drive its interrupt
vector on the bus. Since the Slave Interrupt Controller
resides on the system bus, bus transceiver enable
and direction control logic must take this into
consideration.
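

The behaviour of the Data Bus during the two INTA cycles can be summarised as a small function. The following C sketch is a software model for illustration, with hypothetical names (`bus_drive`, `inta_data_bus`); it only restates the rules above and says nothing about timing or wait states.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* What the 82370 puts on the Data Bus in one INTA cycle. */
struct bus_drive {
    bool    driven;  /* false = bus floated (high impedance) */
    uint8_t value;   /* valid only when driven is true */
};

/* cycle          : 1 or 2 (first or second INTA cycle)
 * slave_cascaded : ICW3 marks this input as having an external slave
 * vector_reg     : contents of the input's programmable Vector Register,
 *                  interpreted as vector or as Slave Cascade Address */
static struct bus_drive inta_data_bus(int cycle, bool slave_cascaded,
                                      uint8_t vector_reg)
{
    struct bus_drive d = { false, 0 };
    assert(cycle == 1 || cycle == 2);
    /* A Slave Address of 00H would be indistinguishable from "no slave". */
    assert(!slave_cascaded || vector_reg != 0x00);

    if (cycle == 1) {
        d.driven = true;
        d.value = slave_cascaded ? vector_reg : 0x00;
    } else if (!slave_cascaded) {
        d.driven = true;
        d.value = vector_reg;          /* pre-programmed interrupt vector */
    }
    /* cycle 2 with a slave: bus floated, the slave supplies the vector */
    return d;
}
```

The second assertion reflects the restriction that no Slave Address should be chosen to be 00H.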


In order to have a successful interrupt service, the
interrupt request input must be held valid (LOW) until
the beginning of the first interrupt acknowledge
cycle. If there is no pending interrupt request when the
first INTA cycle is generated, the PIC will generate a
default vector, which is the IRQ7 vector (Bank A,
level 7).


According to the Bus Cycle definition of the 80376, 
there will be four Bus Idle States between the two 
interrupt acknowledge cycles. These idle bus cycles 
will be initiated by the 80376. Also, during each inter- 
rupt acknowledge cycle, the internal Wait State Gen- 
erator of the 82370 will automatically generate the 
required number of wait states for internal delays. 


4.4 Modes of Operation 


A variety of modes and commands are available for 
controlling the 82370 PIC. All of them are program- 
mable; that is, they may be changed dynamically un- 
der software control. In fact, each bank can be pro- 
grammed individually to operate in different modes. 
With these modes and commands, many possible 
configurations are conceivable, giving the user 
enough versatility for almost any interrupt controlled 
application. 


This section is not intended to show how the 82370 
PIC can be programmed. Rather, it describes the 
operation in different modes. 


4.4.1 END-OF-INTERRUPT


Upon completion of an interrupt service routine, the
interrupted bank needs to be notified so its ISR can
be updated. This allows the PIC to keep track of
which interrupt levels are in the process of being
serviced and their relative priorities. Three different
End-Of-Interrupt (EOI) formats are available. They
are: Non-Specific EOI Command, Specific EOI
Command, and Automatic EOI Mode. Selection of which
EOI to use is dependent upon the interrupt
operations the user wishes to perform.


If the 82370 is NOT programmed in the Automatic 
EOI Mode, an EOI command must be issued by the
80376 to the specific 82370 PIC Controller Bank. 
Also, if this controller bank is cascaded to another 
internal bank, an EOI command must also be sent to 
the bank to which this bank is cascaded. For exam- 
ple, if an interrupt request of Bank C in the 82370 
PIC is serviced, an EOI should be written into Bank 
C, Bank B and Bank A. If the request comes from an 
external interrupt controller cascaded to Bank C, 
then an EOI should be written into the external con- 
troller as well. 
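

The cascading rule for EOI commands can be written as a short loop. The sketch below is an illustration under stated assumptions: the `pic_bank` enumeration and `eoi_targets` are hypothetical names, the order (originating bank first, Bank A last) follows the example in the text, and an external controller cascaded to the originating bank would need its own EOI in addition to what is listed here.

```c
#include <assert.h>
#include <stddef.h>

/* Internal banks, numbered so that each bank cascades into the one below. */
enum pic_bank { BANK_A = 0, BANK_B = 1, BANK_C = 2 };

/* Fill `order` with the internal banks that must receive an EOI when a
 * request taken by `source` has been serviced. Returns the count. */
static size_t eoi_targets(enum pic_bank source, enum pic_bank order[3])
{
    size_t n = 0;
    assert((int)source >= (int)BANK_A && (int)source <= (int)BANK_C);
    for (int b = (int)source; b >= (int)BANK_A; --b)
        order[n++] = (enum pic_bank)b;   /* C, then B, then A */
    return n;
}
```

A request on Bank A therefore needs a single EOI, while a request on Bank C needs three.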


NON-SPECIFIC EOI COMMAND 


A Non-Specific EOI command sent from the 80376
lets the 82370 PIC bank know when a service
routine has been completed, without specification of its
exact interrupt level. The respective interrupt bank
automatically determines the interrupt level and
resets the correct bit in the ISR.


To take advantage of the Non-Specific EOI, the
interrupt bank must be in a mode of operation in which
it can predetermine its in-service routine levels. For
this reason, the Non-Specific EOI command should
only be used when the most recent level
acknowledged and serviced is always the highest priority
level (i.e. in the Fully Nested Mode structure to be
described below). When the interrupt bank receives a
Non-Specific EOI command, it simply resets the
highest priority ISR bit to indicate that the highest
priority routine in service is finished.


Special consideration should be taken when
deciding to use the Non-Specific EOI command. Here are
two operating conditions in which it is best NOT
used since the Fully Nested Mode structure will be
destroyed:


— Using the Set Priority command within an
  interrupt service routine.


— Using a Special Mask Mode. 


These conditions are covered in more detail in their 
own sections, but are listed here for reference. 


SPECIFIC EOI COMMAND


Unlike a Non-Specific EOI command which automat- 
ically resets the highest priority ISR bit, a Specific 
EOI command specifies an exact ISR bit to be reset. 
Any one of the IRQ levels of an interrupt bank can 
be specified in the command. 


The Specific EOI command is needed to reset the
ISR bit of a completed service routine whenever the
interrupt bank is not able to automatically determine
it. The Specific EOI command can be used in all
conditions of operation, including those that prohibit
Non-Specific EOI command usage mentioned
above.
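

The difference between the two EOI commands shows up clearly when the ISR is treated as an 8-bit value. The sketch below is a software model with hypothetical function names; it assumes the default priority order, in which level 0 has the highest priority and level 7 the lowest.

```c
#include <assert.h>
#include <stdint.h>

/* Non-Specific EOI: reset the highest priority ISR bit that is set.
 * Assumes the default order (level 0 highest, level 7 lowest). */
static uint8_t eoi_non_specific(uint8_t isr)
{
    for (unsigned level = 0; level < 8; ++level) {
        if (isr & (1u << level))
            return (uint8_t)(isr & ~(1u << level));
    }
    return isr;  /* nothing in service: no change */
}

/* Specific EOI: reset exactly the ISR bit named in the command. */
static uint8_t eoi_specific(uint8_t isr, unsigned level)
{
    assert(level < 8);
    return (uint8_t)(isr & ~(1u << level));
}
```

With levels 4 and 6 in service (ISR = 50H), a Non-Specific EOI clears level 4, whereas a Specific EOI can clear level 6 and leave level 4 marked as in service.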


AUTOMATIC EOI MODE 


When programmed in the Automatic EOI Mode, the 
80376 no longer needs to issue a command to notify 
the interrupt bank it has completed an interrupt
routine. The interrupt bank accomplishes this by
performing a Non-Specific EOI automatically at the end
of the second INTA cycle.


Special consideration should be taken when
deciding to use the Automatic EOI Mode because it may
disturb the Fully Nested Mode structure. In the
Automatic EOI Mode, the ISR bit of a routine in service is
reset right after it is acknowledged, thus leaving no
designation in the ISR that a service routine is being
executed. If any interrupt request within the same
bank occurs during this time and interrupts are
enabled, it will get serviced regardless of its priority.
Therefore, when using this mode, the 80376 should
keep its interrupt request input disabled during
execution of a service routine. By doing this, higher
priority interrupt levels will be serviced only after the
completion of a routine in service. This guideline
restores the Fully Nested Mode structure. However, in
this scheme, a routine in service cannot be
interrupted since the host's interrupt request input is
disabled.


4.4.2 INTERRUPT PRIORITIES


The 82370 PIC provides various methods for
arranging the interrupt priorities of the interrupt request
inputs to suit different applications. The following
subsections explain these methods in detail.


4.4.2.1 Fully Nested Mode 


The Fully Nested Mode of operation is a general pur- 
pose priority mode. This mode supports a multi-level 
interrupt structure in which all of the Interrupt Re- 
quest (IRQ) inputs within one bank are arranged 
from highest to lowest. 


Unless otherwise programmed, the Fully Nested
Mode is entered by default upon initialization. At this
time, IRQ0# is assigned the highest priority
(priority = 0) and IRQ7# the lowest (priority = 7).
This default priority can be changed, as will be
explained later in the Rotating Priority Mode.


When an interrupt is acknowledged, the highest pri- 
ority request is determined from the Interrupt Re- 
quest Register (IRR) and its vector is placed on the 
bus. In addition, the corresponding bit in the In-Serv- 
ice Register (ISR) is set to designate the routine in 
service. This ISR bit will remain set until the 80376 
issues an End Of Interrupt (EOI) command immedi- 
ately before returning from the service routine; or 
alternately, if the Automatic End Of Interrupt (AEOI) 
bit is set, the ISR bit will be reset at the end of the 
second INTA cycle. 


While the ISR bit is set, all further interrupts of the 
same or lower priority are inhibited. Higher level in- 
terrupts can still generate an interrupt, which will be 
acknowledged only if the 80376 internal interrupt en- 
able flip-flop has been reenabled (through software 
inside the current service routine). 
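

The Fully Nested Mode rule above can be sketched in a few lines of Python. This is only an illustrative model, not 82370 logic: it assumes the default priority assignment (level 0 highest), ignores masking and the 80376 interrupt enable flip-flop, and the function names are hypothetical.

```python
from typing import Optional


def highest_priority(bits: int) -> Optional[int]:
    """Return the lowest-numbered set bit (level 0 is the highest priority)."""
    for level in range(8):
        if bits & (1 << level):
            return level
    return None


def next_to_service(irr: int, isr: int) -> Optional[int]:
    """Return the level a bank would present next, or None if nothing qualifies.

    A pending request qualifies only if its priority is strictly higher
    than every level that is currently in service.
    """
    request = highest_priority(irr)
    in_service = highest_priority(isr)
    if request is None:
        return None
    if in_service is not None and request >= in_service:
        return None  # same or lower priority is inhibited while the ISR bit is set
    return request
```

With IRQ4 in service, a request on IRQ6 is held off while a request on IRQ2 is passed through, which is the behavior the paragraph above describes.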


4.4.2.2 Automatic Rotation-Equal Priority Devices


                  IS7  IS6  IS5  IS4  IS3  IS2  IS1  IS0

BEFORE COMMAND
  ISR status       0    1    0    1    0    0    0    0
  Priority         7    6    5    4    3    2    1    0
  (IS7 is the lowest priority, IS0 the highest priority)

AFTER COMMAND
  ISR status       0    1    0    0    0    0    0    0
  Priority         2    1    0    7    6    5    4    3
  (IS4 is the lowest priority, IS5 the highest priority)


Figure 4-4. Rotate On Non-Specific EOI Command


Automatic rotation of priorities serves in applications
where the interrupting devices are of equal priority
within an interrupt bank. In this kind of environment,
once a device is serviced, all other equal priority pe- 
ripherals should be given a chance to be serviced 
before the original device is serviced again. This is 
accomplished by automatically assigning a device 
the lowest priority after being serviced. Thus, in the 
worst case, the device would have to wait until all 
other peripherals connected to the same bank are 
serviced before it is serviced again. 


There are two methods of accomplishing automatic 
rotation. One is used in conjunction with the Non- 
Specific EOI command and the other is used with 
the Automatic EOI mode. These two methods are 
discussed below. 


ROTATE ON NON-SPECIFIC EOI COMMAND 


When the Rotate On Non-Specific EOI command is
issued, the highest ISR bit is reset as in a normal
Non-Specific EOI command. However, after it is re-
set, the corresponding Interrupt Request (IRQ) level
is assigned the lowest priority. Other IRQ priorities
rotate to conform to the Fully Nested Mode based
on the newly assigned low priority.


Figure 4-4 shows how the Rotate On Non-Specific
EOI command affects the interrupt priorities. As-
sume the IRQ priorities were assigned with IRQ0 the
highest and IRQ7 the lowest. IRQ6 and IRQ4 are
already in service but neither is completed. Being
the higher priority routine, IRQ4 is necessarily the
routine being executed. During the IRQ4 routine, a
Rotate On Non-Specific EOI command is executed.
When this happens, Bit 4 in the ISR is reset. IRQ4
then becomes the lowest priority and IRQ5 becomes
the highest.
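

The IRQ6/IRQ4 example can be traced with a small Python model. It is a sketch under stated assumptions, not a description of the device internals: the priority state is represented only by the level currently holding the lowest priority, and both function names are hypothetical.

```python
from typing import List, Tuple


def priority_order(lowest: int) -> List[int]:
    """Levels from highest to lowest priority when `lowest` holds the bottom slot."""
    return [(lowest + 1 + i) % 8 for i in range(8)]


def rotate_on_nonspecific_eoi(isr: int, lowest: int) -> Tuple[int, int]:
    """Reset the highest-priority ISR bit and make that level the lowest priority.

    Returns the new ISR value and the new lowest-priority level.
    """
    for level in priority_order(lowest):
        if isr & (1 << level):
            return isr & ~(1 << level), level
    return isr, lowest  # nothing in service: no change


# IRQ6 and IRQ4 in service, default priorities (IRQ7 lowest).
isr, lowest = rotate_on_nonspecific_eoi(0b01010000, 7)
order = priority_order(lowest)
```

After the call, only bit 6 remains set in `isr`, `lowest` is 4, and `order` starts at level 5, matching the "after command" state of Figure 4-4.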


ROTATE ON AUTOMATIC EOI MODE


The Rotate On Automatic EOI Mode works much
like the Rotate On Non-Specific EOI Command. The
main difference is that priority rotation is done auto-
matically after the second INTA cycle of an interrupt
request. To enter or exit this mode, a Rotate-On-Au-
tomatic-EOI Set Command and a Rotate-On-Automat-
ic-EOI Clear Command are provided. After this mode
is entered, no other commands are needed, as in the
normal Automatic EOI Mode. However, it must be
noted again that when using any form of the Auto-
matic EOI Mode, special consideration should be
taken. The guideline presented under “Automatic EOI
Mode” also applies here.


4.4.2.3 Specific Rotation-Specific Priority


Specific rotation gives the user versatile capabilities
in interrupt controlled operations. It serves in those
applications in which a specific device’s interrupt pri-
ority must be altered. As opposed to Automatic Ro-
tation which will automatically set priorities after
each interrupt request is serviced, specific rotation is
completely user controlled. That is, the user selects
which interrupt level is to receive the lowest or the
highest priority. This can be done during the main
program or within interrupt routines. Two specific ro-
tation commands are available to the user: Set Prior-
ity Command and Rotate On Specific EOI Com-
mand.


SET PRIORITY COMMAND


The Set Priority Command allows the programmer to 
assign an IRQ level the lowest priority. All other in- 
terrupt levels will conform to the Fully Nested Mode 
based on the newly assigned low priority. 


ROTATE ON SPECIFIC EOI COMMAND 


The Rotate On Specific EOI Command is literally a
combination of the Set Priority Command and the
Specific EOI Command. Like the Set Priority Com-
mand, a specified IRQ level is assigned lowest priori-
ty. Like the Specific EOI Command, a specified level
will be reset in the ISR. Thus, this command accom-
plishes both tasks in one single command.
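

A minimal Python sketch of the two specific rotation commands follows. It models only the bookkeeping described above; the helper names are hypothetical and the priority state is again reduced to "which level is lowest".

```python
from typing import Tuple


def set_priority(lowest_level: int) -> int:
    """Set Priority Command: the named level becomes the lowest priority."""
    if not 0 <= lowest_level <= 7:
        raise ValueError("interrupt level must be 0-7")
    return lowest_level


def rotate_on_specific_eoi(isr: int, level: int) -> Tuple[int, int]:
    """Reset the named ISR bit and give that level the lowest priority."""
    new_lowest = set_priority(level)
    return isr & ~(1 << level) & 0xFF, new_lowest


def priority_of(level: int, lowest: int) -> int:
    """Priority rank of a level (0 = highest, 7 = lowest) for a given rotation."""
    return (level - lowest - 1) % 8
```

For example, issuing the rotate command for level 6 while levels 6 and 4 are in service leaves only level 4 in service and makes level 7 the highest priority.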


4.4.2.4 Interrupt Priority Mode Summary

To simplify understanding of the many interrupt priority
modes, Table 4-2 summarizes their operation.


Table 4-2. Interrupt Priority Mode Summary

Fully-Nested Mode
  Operation Summary:
      IRQ0# - Highest Priority
      IRQ7# - Lowest Priority
  Effect On Priority After EOI
      Non-Specific/Automatic: No change in priority.
          Highest ISR bit is reset.
      Specific: Not Applicable.

Automatic Rotation (Equal Priority Devices)
  Operation Summary:
      Interrupt level just serviced is the lowest
      priority. Other priorities rotate to conform
      to Fully-Nested Mode.
  Effect On Priority After EOI
      Non-Specific/Automatic: Highest ISR bit is reset
          and the corresponding level becomes the
          lowest priority.
      Specific: Not Applicable.

Specific Rotation (Specific Priority Devices)
  Operation Summary:
      User specifies the lowest priority level.
      Other priorities rotate to conform to
      Fully-Nested Mode.
  Effect On Priority After EOI
      Non-Specific/Automatic: Not Applicable.
      Specific: As described under
          “Operation Summary”.


4.4.3 INTERRUPT MASKING

VIA INTERRUPT MASK REGISTER


Each bank in the 82370 PIC has an Interrupt Mask
Register (IMR) which enhances interrupt control ca-
pabilities. This IMR allows individual IRQ masking.
When an IRQ is masked, its interrupt request is dis-
abled until it is unmasked. Each bit in the 8-bit IMR
disables one interrupt channel if it is set (HIGH). Bit
0 masks IRQ0, Bit 1 masks IRQ1 and so forth.
Masking an IRQ channel will only disable the corre-
sponding channel and does not affect the others’
operations.


The IMR acts only on the output of the IRR. That is, 
if an interrupt occurs while its IMR bit is set, this 
request is not “forgotten”. Even with an IRQ input 
masked, it is still possible to set the IRR. Therefore, 
when the IMR bit is reset, an interrupt request to the 
80376 will then be generated, providing that the IRQ 
request remains active. If the IRQ request is re- 
moved before the IMR is reset, the Default Interrupt 
Vector (Bank A, level 7) will be generated during the 
interrupt acknowledge cycle. 
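

The point that the IMR gates the output of the IRR, rather than the IRR itself, can be illustrated with a small Python model. The class and method names are hypothetical, and the model covers only the masking behavior, not vector generation.

```python
class MaskedBank:
    """Toy model of one bank's IRR and IMR (illustrative only)."""

    def __init__(self) -> None:
        self.irr = 0  # pending requests, one bit per level
        self.imr = 0  # mask bits, 1 = masked

    def set_request(self, level: int, active: bool) -> None:
        """Record the state of a request; masking does not prevent this."""
        if active:
            self.irr |= 1 << level
        else:
            self.irr &= ~(1 << level)

    def int_output(self) -> bool:
        """True when at least one unmasked request is pending."""
        return bool(self.irr & ~self.imr & 0xFF)


bank = MaskedBank()
bank.imr = 0b00000100            # mask level 2
bank.set_request(2, True)        # request arrives while masked
masked_view = (bank.irr, bank.int_output())
bank.imr = 0                     # unmask: the remembered request now passes
unmasked_view = bank.int_output()
```

Here `masked_view` shows the IRR bit set with no interrupt output, and `unmasked_view` becomes true as soon as the mask bit is reset, provided the request is still active.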


SPECIAL MASK MODE 


In the Fully Nested Mode, all IRQ levels of lower
priority than the routine in service are inhibited. How-
ever, in some applications, it may be desirable to let
a lower priority interrupt request interrupt the rou-
tine in service. One method to achieve this is by
using the Special Mask Mode. Working in conjunc-
tion with the IMR, the Special Mask Mode enables
interrupts from all levels except the level in service.
This is usually done inside an interrupt service rou-
tine by masking the level that is in service and then
issuing the Special Mask Mode Command. Once the
Special Mask Mode is enabled, it remains in effect
until it is disabled.
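

The difference between normal masking and the Special Mask Mode can be sketched as follows. This Python model is an assumption-laden illustration: it uses the default priority order (level 0 highest), treats the Special Mask Mode as "only the IMR blocks a request", and the function name is hypothetical.

```python
from typing import List


def eligible_levels(irr: int, isr: int, imr: int, special_mask: bool) -> List[int]:
    """Return the levels that are allowed to interrupt right now."""
    pending = irr & ~imr & 0xFF
    result = []
    for level in range(8):
        if not pending & (1 << level):
            continue
        if special_mask:
            blocked = False  # only the IMR inhibits a level in this mode
        else:
            # Fully nested: any in-service level of same or higher priority blocks
            blocked = any(isr & (1 << s) for s in range(level + 1))
        if not blocked:
            result.append(level)
    return result


# Level 2 is in service and a lower priority request (level 5) is pending.
normal = eligible_levels(0b00100000, 0b00000100, 0, special_mask=False)
special = eligible_levels(0b00100100, 0b00000100, 0b00000100, special_mask=True)
```

In the normal case level 5 is inhibited; once the routine masks its own level and enables the Special Mask Mode, level 5 is allowed through while level 2 stays blocked by its mask bit.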


4.4.4 EDGE OR LEVEL INTERRUPT
TRIGGERING


Each bank in the 82370 PIC can be programmed
independently for either edge or level sensing for the
interrupt request signals. Recall that all IRQ inputs
are active LOW. Therefore, in the edge triggered
mode, an active edge is defined as an input tran-
sition from an inactive (HIGH) to active (LOW) state.
The interrupt input may remain active without gener-
ating another interrupt. During level triggered mode,
an interrupt request will be recognized by an active
(LOW) input, and there is no need for edge detec-
tion. However, the interrupt request must be re-
moved before the EOI Command is issued, or the
80376 interrupt must be disabled, to prevent a second false
interrupt from occurring.


In either mode, the interrupt request input must be
active (LOW) during the first INTA cycle in order to
be recognized. Otherwise, the Default Interrupt Vec-
tor will be generated at level 7 of Bank A.
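

The two sensing modes can be contrasted with a short Python sketch. It works on a list of sampled pin states (True = HIGH/inactive, False = LOW/active); the sampling model and the function names are assumptions made for illustration only.

```python
from typing import Iterable


def count_edge_requests(pin_samples: Iterable[bool]) -> int:
    """Count HIGH-to-LOW transitions, i.e. requests seen in edge triggered mode."""
    count = 0
    previous = True  # assume the pin idles HIGH (inactive)
    for state in pin_samples:
        if previous and not state:
            count += 1
        previous = state
    return count


def level_request_active(pin_state: bool) -> bool:
    """In level triggered mode a LOW input is a request for as long as it lasts."""
    return not pin_state


samples = [True, False, False, False, True, False]
edges = count_edge_requests(samples)
```

The pin stays LOW for three samples but produces a single edge request; only the second HIGH-to-LOW transition produces another. In level triggered mode the same held-LOW input would still read as an active request after an EOI, which is why the request must be removed first.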


4.4.5 INTERRUPT CASCADING


As mentioned previously, the 82370 allows for exter-
nal Slave interrupt controllers to be cascaded to any
of its external interrupt request pins. The 82370 PIC
indicates that an external Slave Controller is to be
serviced by putting the contents of the Vector Regis-
ter associated with the particular request on the
80376 Data Bus during the first INTA cycle (instead
of 00H during a non-slave service). The external log-
ic should latch the vector on the Data Bus using the
INTA status signals and use it to select the external
Slave Controller to be serviced (see Figure 4-5). The
selected Slave will then respond to the second INTA
cycle and place its vector on the Data Bus. This
method requires that if external Slave Controllers
are used in the system, no vector should be pro-
grammed to 00H.


Since the external Slave Cascade Address is provid-
ed on the Data Bus during INTA cycle 1, an external
latch is required to capture this address for the Slave
Controller. A simple scheme is depicted in Figure
4-5 below.


[Figure 4-5 schematic. Surviving labels: DATA BUS; INTA# (FROM BUS
CONTROLLER); POSITIVE EDGE MASTER/SLAVE FLIP-FLOP; latch with IN, CLK
and OUT; CAS(0-7) TO SLAVE 8259's; "LATCH HERE" marked on the waveform.]


Figure 4-5. Slave Cascade Address Capturing


4.4.5.1 Special Fully Nested Mode


This mode will be used where cascading is em-
ployed and the priority is to be conserved within
each Slave Controller. The Special Fully Nested
Mode is similar to the “regular” Fully Nested Mode
with the following exceptions:


— When an interrupt request from a Slave Control-
ler is in service, this Slave Controller is not
locked out from the Master’s priority logic. Fur-
ther interrupt requests from the higher priority
logic within the Slave Controller will be recog-
nized by the 82370 PIC and will initiate interrupts
to the 80376. By comparison, in the “regular” Fully
Nested Mode the Slave Controller is masked out
when its request is in service and no higher re-
quests from the same Slave Controller can be
serviced.


— Before exiting the interrupt service routine, the
software has to check whether the interrupt serv-
iced was the only request from the Slave Con-
troller. This is done by sending a Non-Specific
EOI Command to the Slave Controller and then
reading its In Service Register. If there are no
requests in the Slave Controller, a Non-Specific
EOI can be sent to the corresponding 82370 PIC
bank also. Otherwise, no EOI should be sent.
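

The exit check described in the second item can be written as a small decision function. The Python below is a sketch only: it models the Slave's In Service Register as an integer, assumes the Slave uses the default priority order, and uses a hypothetical function name.

```python
from typing import Tuple


def finish_slave_interrupt(slave_isr: int) -> Tuple[int, bool]:
    """Model the end of a Slave service routine in Special Fully Nested Mode.

    Step 1: a Non-Specific EOI to the Slave resets its highest-priority
            in-service bit.
    Step 2: the Slave ISR is read back; the 82370 bank receives an EOI only
            if no other Slave request is still in service.
    Returns the new Slave ISR and whether the bank EOI should be sent.
    """
    new_isr = slave_isr
    for level in range(8):  # level 0 is assumed to be the highest priority
        if slave_isr & (1 << level):
            new_isr = slave_isr & ~(1 << level)
            break
    return new_isr, new_isr == 0
```

With one Slave level in service the function reports that the bank EOI should follow; with two nested Slave levels it reports that the bank EOI must be withheld until the outer routine finishes.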


4.4.6 READING INTERRUPT STATUS 


The 82370 PIC provides several ways to read the
status of each interrupt bank for more flexible
interrupt control operations. These include polling
the highest priority pending interrupt request and
reading the contents of different interrupt status reg-
isters.


4.4.6.1 Poll Command 


The 82370 PIC supports status polling operations
with the Poll Command. In a Poll Command, the
pending interrupt request with the highest priority
can be determined. To use this command, the INT
output is not used, or the 80376 interrupt is disabled.
Service to devices is achieved by software using the
Poll Command.


This mode is useful if there is a routine command
common to several levels so that the INTA se-
quence is not needed. Another application is to use
the Poll Command to expand the number of priority
levels.


Notice that the ICW2 mechanism is not supported 
for the Poll Command. However, if the Poll Com- 
mand is used, the programmable Vector Registers 
are of no concern since no INTA cycle will be gener- 
ated. 
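

Software polling might decode the status byte along the following lines. This Python sketch assumes an 82C59A-style layout for the poll status (bit 7 as the pending flag, bits 2-0 as the level code); that layout is an assumption and should be checked against the Poll Command Status definition. The function name is hypothetical.

```python
from typing import Optional


def decode_poll_status(status: int) -> Optional[int]:
    """Return the requesting level, or None when no interrupt is pending.

    Assumed layout: bit 7 = pending flag, bits 2-0 = binary level code.
    """
    pending = bool(status & 0x80)
    level = status & 0x07
    return level if pending else None
```

A polling loop would issue the Poll Command, read the status register, pass the byte to a decoder like this one, and call the service routine for the returned level.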


4.4.6.2 Reading Interrupt Registers


The contents of each interrupt register (IRR, ISR,
and IMR) can be read to update the user’s program
on the present status of the 82370 PIC. This can be
a versatile tool in the decision making process of a
service routine, giving the user more control over
interrupt operations.


The reading of the IRR and ISR contents can be
performed via the Operation Control Word 3 by us-
ing a Read Status Register Command and the con-
tent of IMR can be read via a simple read operation
of the register itself.


4.5 Register Set Overview 


Each bank of the 82370 PIC consists of a set of 8-bit 
registers to control its operations. The address map 
of all the registers is shown in Table 4-3 below. 
Since all three register sets are identical in functions, 
only one set will be described. 


Functionally, each register set can be divided into
five groups. They are: the four Initialization Com-
mand Words (ICW’s), the three Operation Control
Words (OCW’s), the Poll/Interrupt Request/In-Serv-
ice Register, the Interrupt Mask Register, and the
Vector Registers. A description of each group fol-
lows.
Table 4-3. Interrupt Controller Register Address Map

Access        Register Description

Bank B
Write         Bank B ICW1, OCW2, or OCW3
Read          Bank B Poll, Request or In-Service Status Register
Write         Bank B ICW2, ICW3, ICW4, OCW1
Read          Bank B Mask Register
Read          Bank B ICW2
Read/Write    IRQ8 Vector Register
Read/Write    IRQ9 Vector Register
Read/Write    Reserved
Read/Write    IRQ11 Vector Register
Read/Write    IRQ12 Vector Register
Read/Write    IRQ13 Vector Register
Read/Write    IRQ14 Vector Register
Read/Write    IRQ15 Vector Register

Bank C
Write         Bank C ICW1, OCW2, or OCW3
Read          Bank C Poll, Request or In-Service Status Register
Write         Bank C ICW2, ICW3, ICW4, OCW1
Read          Bank C Mask Register
Read          Bank C ICW2
Read/Write    IRQ16 Vector Register
Read/Write    IRQ17 Vector Register
Read/Write    IRQ18 Vector Register
Read/Write    IRQ19 Vector Register
Read/Write    IRQ20 Vector Register
Read/Write    IRQ21 Vector Register
Read/Write    IRQ22 Vector Register
Read/Write    IRQ23 Vector Register

Bank A
Write         Bank A ICW1, OCW2, or OCW3
Read          Bank A Poll, Request or In-Service Status Register
Write         Bank A ICW2, ICW3, ICW4, OCW1
Read          Bank A Mask Register
Read          Bank A ICW2
Read/Write    IRQ0 Vector Register
Read/Write    IRQ1 Vector Register
Read/Write    IRQ1.5 Vector Register
Read/Write    IRQ3 Vector Register
Read/Write    IRQ4 Vector Register
Read/Write    Reserved
Read/Write    Reserved
Read/Write    IRQ7 Vector Register
4.5.1 INITIALIZATION COMMAND WORDS (ICW)


Before normal operation can begin, the 82370 PIC 
must be brought to a known state. There are four 
8-bit Initialization Command Words in each interrupt 
bank to setup the necessary conditions and modes 
for proper operation. Except for the second com- 
mand word (ICW2) which is a read/write register, the 
other three are write-only registers. Without going 
into detail of the bit definitions of the command 
words, the following subsections give a brief de- 
scription of what functions each command word 
controls. 


ICW1


The ICW1 has three major functions. They are: 


— To select between the two IRQ input triggering 
modes (edge- or level-triggered); 


— To designate whether or not the interrupt bank is 
to be used alone or in the cascade mode. If the 
cascade mode is desired, the interrupt bank will 
accept ICW3 for further cascade mode program- 
ming. Otherwise, no ICW3 will be accepted; 


— To determine whether or not ICW4 will be issued;
that is, if any of the ICW4 operations are to be
used.


ICW2


ICW2 is provided for compatibility with the 82C59A
only. Its contents do not affect the operation of the
interrupt bank in any way. Whenever the ICW2 of
any of the three banks is written into, an interrupt is
generated from bank A at level 1.5. The interrupt
request will be cleared after the ICW2 register has
been read by the 80376. The user is expected to
program the corresponding vector register or to use
it as an indicator that an attempt was made to alter
the contents. Note that each ICW2 register has dif-
ferent addresses for read and write operations.


ICW3 


The interrupt bank will only accept an ICW3 if pro-
grammed in the external cascade mode (as indicat-
ed in ICW1). ICW3 is used for specific programming
within the cascade mode. The bits in ICW3 indicate
which interrupt request inputs have a Slave cascad-
ed to them. This will subsequently affect the inter-
rupt vector generation during the interrupt acknowl-
edge cycles as described previously.


ICW4


The ICW4 is accepted only if it was selected in
ICW1. This command word register serves two func-
tions:


— To select either the Automatic EOI mode or soft-
ware EOI mode;


— To select if the Special Fully Nested Mode is to be
used in conjunction with the cascade mode.


4.5.2 OPERATION CONTROL WORDS (OCW) 


Once initialized by the ICW's, the interrupt banks will 
be operating in the Fully Nested Mode by default 
and they are ready to accept interrupt requests. 
However, the operations of each interrupt bank can 
be further controlled or modified by the use of 
OCW’s. Three OCW’s are available for programming 
various modes and commands. Note that all OCW’s 
are 8-bit write-only registers. 


The modes and operations controlled by the OCW’s 
are: 


— Fully Nested Mode; 

— Rotating Priority Mode; 

— Special Mask Mode; 

— Poll Mode; 

— EOI Commands;

— Read Status Commands.


OCW1


OCW1 is used solely for masking operations. It pro-
vides a direct link to the Interrupt Mask Register
(IMR). The 80376 can write to this OCW register to
enable or disable the interrupt inputs. Reading the
pre-programmed mask can be done via the Interrupt
Mask Register which will be discussed shortly.


OCW2 


OCW2 is used to select End-Of-Interrupt, Automatic
Priority Rotation, and Specific Priority Rotation oper-
ations. Associated commands and modes of these
operations are selected using the different combina-
tions of bits in OCW2.


Specifically, the OCW2 is used to: 


— Designate an interrupt level (0-7) to be used to 
reset a specific ISR bit or to set a specific priori- 
ty. This function can be enabled or disabled; 


— Select which software EOI command (if any) is to 
be executed (i.e. Non-Specific or Specific EOI); 


— Enable one of the priority rotation operations (i.e.
Rotate On Non-Specific EOI, Rotate On Auto-
matic EOI, or Rotate On Specific EOI).


OCW3


There are three main categories of operation that
OCW3 controls. They are summarized as follows:


— To select and execute the Read Status Register 
Commands, either reading the Interrupt Request 
Register (IRR) or the In-Service Register (ISR); 


— To issue the Poll Command. The Poll Command 
will override a Read Register Command if both 
functions are enabled simultaneously; 


— To set or reset the Special Mask Mode. 


4.5.3 POLL/INTERRUPT REQUEST/IN-SERVICE 
STATUS REGISTER 


As the name implies, this 8-bit read-only register has 
multiple functions. Depending on the command is- 
sued in the OCW3, the content of this register re- 
flects the result of the command executed. For a 
Poll Command, the register read contains the binary 
code of the highest priority level requesting service 
(if any). For a Read IRR Command, the register con- 
tent will show the current pending interrupt re- 
quest(s). Finally, for a Read ISR Command, this reg- 
ister will specify all interrupt levels which are being 
serviced. 


4.5.4 INTERRUPT MASK REGISTER (IMR) 


This is a read-only 8-bit register which, when read,
will specify all interrupt levels within the same bank
that are masked.


4.5.5 VECTOR REGISTERS (VR) 


Each interrupt request input has an 8-bit read/write 
programmable vector register associated with it. The 
registers should be programmed to contain the inter- 
rupt vector for the corresponding request. The con- 
tents of the Vector Register will be placed on the 
Data Bus during the INTA cycles as described previ- 
ously. 
4.6 Programming


Programming the 82370 PIC is accomplished by us- 
ing two types of command words: ICW’s and 
OCW’s. All modes and commands explained in the 
previous sections are programmable using the 
ICW’s and OCW’s. The ICW’s are issued from the 
80376 in a sequential format and are used to setup 
the banks in the 82370 PIC in an initial state of oper- 
ation. The OCW’s are issued as needed to vary and 
control the 82370 PIC’s operations. 


Both ICW’s and OCW’s are sent by the 80376 to the 
interrupt banks via the Data Bus. Each bank distin- 
guishes between the different ICW’s and OCW’s by 
the I/O address map, the sequence they are issued 
(ICW’s only), and by some dedicated bits among the 
ICW’s and OCW’s. 


An example of programming the 82370 interrupt 
controllers is given in Appendix C (Programming the 
82370 Interrupt Controllers). 


All three interrupt banks are programmed in a similar 
way. Therefore, only a single bank will be described 
in the following sections. 


4.6.1 INITIALIZATION (ICW) 


Before normal operation can begin, each bank must
be initialized by programming a sequence of two to
four bytes written into the ICW’s.


Figure 4-6 shows the initialization flow for an inter-
rupt bank. Both ICW1 and ICW2 must be issued for
any form of operation. However, ICW3 and ICW4 are
used only if designated in ICW1. Once initialized, if
any programming changes within the ICW’s are to
be made, the entire ICW sequence must be repro-
grammed, not just an individual ICW.


Note that although the ICW2’s in the 82370 PIC do
not affect the Bank’s operation, they still must be
programmed in order to preserve the compatibility
with the 82C59A. The contents programmed are not
relevant to the overall operations of the interrupt
banks. Also, whenever one of the three ICW2’s is
programmed, an interrupt level 1.5 in Bank A will be
generated. This interrupt request will be cleared
upon reading of the ICW2 registers. Since the three
ICW2’s share the same interrupt level and the sys-
tem may not know the origin of the interrupt, all three
ICW2’s must be read.
[Figure 4-6 flowchart, steps in order:]

  DISABLE INTERRUPT
  PROGRAM VECTOR(S) *
  ICW1
  ICW2 (ICW2 INTERRUPT GENERATED)
  ICW3 (only if designated in ICW1)
  ICW4: YES (IC4 = 1), NO (IC4 = 0)
  ENABLE INTERRUPT (ALLOW SERVICING OF ICW2 INTERRUPT)
  READY TO ACCEPT INTERRUPT REQUESTS

* ICW2 vector address must be programmed now. Other vector addresses may be programmed via ICW2 interrupt service routine.


Figure 4-6. Initialization Sequence


Certain internal setup conditions occur automatically
within the interrupt bank after the first ICW (ICW1)
has been issued. These are:


— The edge sensitive circuit is reset, which means
that following initialization, an interrupt request
input must make a HIGH-to-LOW transition to
generate an interrupt;


— The Interrupt Mask Register (IMR) is cleared;
that is, all interrupt inputs are enabled;

— IRQ7 input of each bank is assigned priority 7
(lowest);


— Special Mask Mode is cleared and Status Read
is set to IRR;


— If no ICW4 is needed, then no Automatic-EOI is
selected.


4.6.2 VECTOR REGISTERS (VR) 


Each interrupt request input has a separate Vector 
Register. These Vector Registers are used to store 
the pre-programmed vector number corresponding 
to their interrupt sources. In order to guarantee prop- 
er interrupt handling, all Vector Registers must be 
programmed with the predefined vector numbers. 
Since an interrupt request will be generated whenev-
er an ICW2 is written during the initialization se-
quence, it is important that the Vector Register of
IRQ1.5 in Bank A be initialized and the inter-
rupt service routine for this vector be set up before the
ICW’s are written.
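

The ordering of the initialization writes can be sketched as a function that builds the list of writes for one bank. This Python example rests on several assumptions that are not confirmed by the text above: the ICW1 bit positions are taken to be 82C59A-style (D4 = 1, D3 = LTIM, D1 = no-cascade, D0 = IC4), and "CMD" and "DATA" are symbolic stand-ins for the bank's two write ports in Table 4-3, not real addresses. Interrupts are assumed to be disabled and the IRQ1.5 Vector Register already programmed before these writes are issued.

```python
from typing import List, Optional, Tuple


def icw1_byte(level_triggered: bool, external_cascade: bool, need_icw4: bool) -> int:
    """Build ICW1 using assumed 82C59A-style bit positions."""
    value = 0x10                  # assumed: D4 = 1 identifies ICW1
    if level_triggered:
        value |= 0x08             # assumed: D3 = LTIM (1 = level triggered)
    if not external_cascade:
        value |= 0x02             # assumed: D1 = 1 means no external cascade
    if need_icw4:
        value |= 0x01             # assumed: D0 = IC4 (1 = ICW4 needed)
    return value


def init_writes(level_triggered: bool = False,
                icw2: int = 0x00,
                icw3: Optional[int] = None,
                icw4: Optional[int] = None) -> List[Tuple[str, int]]:
    """Return the ordered (port, value) writes for one bank's ICW sequence."""
    writes = [
        ("CMD", icw1_byte(level_triggered, icw3 is not None, icw4 is not None)),
        ("DATA", icw2),           # always required; raises the IRQ1.5 request
    ]
    if icw3 is not None:
        writes.append(("DATA", icw3))   # only when cascade was selected in ICW1
    if icw4 is not None:
        writes.append(("DATA", icw4))   # only when IC4 = 1 in ICW1
    return writes
```

Calling `init_writes()` with no arguments yields the two-byte minimum sequence; supplying `icw3` and `icw4` yields the full four-byte sequence.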
4.6.3 OPERATION CONTROL WORDS (OCW)


After the ICW’s are programmed, the operations of
each interrupt controller bank can be changed by
writing into the OCW’s as explained before. There is
no special programming sequence required for the
OCW’s. Any OCW may be written at any time in or-
der to change the mode of or to perform certain op-
erations on the interrupt banks.


4.6.3.1 Read Status and Poll Commands (OCW3) 


Since the reading of IRR and ISR status as well as
the result of a Poll Command are available on the
same read-only Status Register, a special Read
Status/Poll Command must be issued before the
Poll/Interrupt Request/In-Service Status Register is
read. This command can be specified by writing the
required control word into OCW3. As mentioned ear-
lier, if both the Poll Command and the Status Read
Command are enabled simultaneously, the Poll
Command will override the Status Read. That is, af-
ter the command execution, the Status Register will
contain the result of the Poll Command.


Note that for reading IRR and ISR, there is no need
to issue a Read Status Command to the OCW3 ev-
ery time the IRR or ISR is to be read. Once a Read
Status Command is received by the interrupt bank, it
“remembers” which register is selected. However,
this is not true when the Poll Command is used.


In the Poll Command, after the OCW3 is written, the
82370 PIC treats the next read to the Status Regis-
ter as an interrupt acknowledge. This will set the ap-
propriate IS bit if there is a request and read the
priority level. Interrupt Request input status remains
unchanged from the Poll Command to the Status
Read.


In addition to the above read commands, the Inter-
rupt Mask Register (IMR) can also be read. When
read, this register reflects the contents of the pre-
programmed OCW1 which contains information on
which interrupt request(s) is(are) currently disabled.


4.7 Register Bit Definition

INITIALIZATION COMMAND WORD 1 (ICW1)

  D7   D6   D5   D4   D3   D2   D1   D0

  Trigger mode: 0 = EDGE TRIGGERED
                1 = LEVEL TRIGGERED

  Cascade:      0 = EXTERNAL CASCADE
                    (ICW3 NEEDED)
                1 = NO EXTERNAL CASCADE
                    (ICW3 NOT NEEDED)

  ICW4:         0 = NO ICW4 NEEDED
                1 = ICW4 NEEDED


INITIALIZATION COMMAND WORD 2 (ICW2)

  D7   D6   D5   D4   D3   D2   D1   D0

CONTENT IS NOT RELEVANT TO THE ACTUAL
OPERATION OF THE BANK BUT CAN BE READ
BY THE INTERRUPT SERVICE ROUTINE TO
DETERMINE WHERE THE INTERRUPT VECTORS
OF EACH BANK START.
INITIALIZATION COMMAND WORD 3 (ICW3)

ICW3 for Bank A:

  D7   D6   D5   D4   D3   D2   D1   D0
   0    0    0    0   S3    0    0    0

  0 = NO SLAVE CASCADED TO BANK A
  1 = THERE IS A SLAVE CASCADED
      TO TOUT2#/IRQ3# PIN


ICW3 for Bank B:

  D7   D6   D5   D4   D3   D2   D1   D0
  S15  S14  S13  S12  S11   X   S9    0

  0 = NO CASCADED REQUEST TO IRQN
  1 = THERE IS A CASCADED REQUEST
      CONNECTED TO IRQN (I.E. THE
      CORRESPONDING INTERRUPT
      REQUEST INPUTS)


ICW3 for Bank C:

  D7   D6   D5   D4   D3   D2   D1   D0
  S23  S22  S21  S20  S19  S18  S17  S16

  0 = NO CASCADED REQUEST TO IRQN
  1 = THERE IS A CASCADED REQUEST
      CONNECTED TO IRQN

INITIALIZATION COMMAND WORD 4 (ICW4)

  D7   D6   D5   D4    D3   D2   D1    D0
   0    0    0   SFNM   X    X   AEOI   X

  AEOI: 0 = NORMAL EOI
        1 = AUTOMATIC EOI

  SFNM: 0 = NOT SPECIAL FULLY NESTED MODE
        1 = SPECIAL FULLY NESTED MODE
OPERATION CONTROL WORD 1 (OCW1)

  D7   D6   D5   D4   D3   D2   D1   D0
  M7   M6   M5   M4   M3   M2   M1   M0

  Mi = 1  MASK SET (INTERRUPT DISABLED)
  Mi = 0  MASK RESET (INTERRUPT ENABLED)


OPERATION CONTROL WORD 2 (OCW2)

  L2-L0: INTERRUPT LEVEL TO BE ACTED UPON

  R   SL  EOI
  0   0   1    NON-SPECIFIC EOI COMMAND
  0   1   1    SPECIFIC EOI COMMAND
  1   0   1    ROTATE ON NON-SPECIFIC EOI
  1   0   0    ROTATE ON AUTO-EOI MODE (SET)
  0   0   0    ROTATE ON AUTO-EOI MODE (CLEAR)
  1   1   1    ROTATE ON SPECIFIC EOI (L2-L0 USED)
  1   1   0    SET PRIORITY (L2-L0 USED)
  0   1   0    NO OPERATION
OPERATION CONTROL WORD 3 (OCW3)

  D7   D6   D5   D4   D3   D2   D1   D0

  ESMM  SMM
   0     0    NO ACTION
   0     1    NO ACTION
   1     0    RESET SPECIAL MASK
   1     1    SET SPECIAL MASK

  Poll:  1 = POLL COMMAND
         0 = NO POLL COMMAND

  RR   RIS
   0    0    NO ACTION
   0    1    NO ACTION
   1    0    READ IR REG.
   1    1    READ IS REG.

ESMM — Enable Special Mask Mode. When this bit is set to 1, it enables the SMM bit to set or reset the 
Special Mask Mode. When this bit is set to 0, SMM bit becomes don’t care. 


SMM — Special Mask Mode. If ESMM=1 and SMM = 1, the interrupt controller bank will enter Special Mask 
Mode. If ESMM = 1 and SMM = 0, the bank will revert to normal mask mode. When ESMM = 0, SMM
has no effect.
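

The OCW2 command table lends itself to a lookup-based encoder. The Python sketch below takes the R, SL and EOI combinations from that table; the bit positions (R = D7, SL = D6, EOI = D5, D4-D3 = 0, L2-L0 = D2-D0) are an assumption based on 82C59A compatibility, and the dictionary keys and function name are hypothetical.

```python
# (R, SL, EOI) combinations as listed in the OCW2 definition
OCW2_COMMANDS = {
    "non_specific_eoi": (0, 0, 1),
    "specific_eoi": (0, 1, 1),
    "rotate_on_non_specific_eoi": (1, 0, 1),
    "rotate_on_auto_eoi_set": (1, 0, 0),
    "rotate_on_auto_eoi_clear": (0, 0, 0),
    "rotate_on_specific_eoi": (1, 1, 1),
    "set_priority": (1, 1, 0),
    "no_operation": (0, 1, 0),
}


def ocw2_byte(command: str, level: int = 0) -> int:
    """Encode an OCW2 value; the level is included only when SL = 1."""
    r, sl, eoi = OCW2_COMMANDS[command]
    value = (r << 7) | (sl << 6) | (eoi << 5)   # assumed bit positions
    if sl:
        value |= level & 0x07                    # assumed: L2-L0 in D2-D0
    return value
```

Under these assumptions a Non-Specific EOI encodes as 20H and a Specific EOI for level 4 as 64H.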
POLL/INTERRUPT REQUEST/IN-SERVICE STATUS REGISTER


Poll Command Status

  D7   D6   D5   D4   D3   D2   D1   D0

  Level code:     BINARY CODE OF THE HIGHEST PRIORITY
                  LEVEL REQUESTING SERVICE
  Interrupt flag: 0 = NO PENDING INTERRUPT
                  1 = PENDING INTERRUPT


Interrupt Request Status

  D7    D6    D5    D4    D3    D2    D1    D0
  IRQ7  IRQ6  IRQ5  IRQ4  IRQ3  IRQ2  IRQ1  IRQ0

  IF IRQ BIT IS: 0 = NO REQUEST
                 1 = REQUEST PENDING


NOTE:
Although all Interrupt Request inputs are active LOW, the internal logic will invert the state of the pins so that when there
is a pending interrupt request at the input, the corresponding IRQ bit will be set to HIGH in the Interrupt Request Status
register.


In-Service Status

  D7   D6   D5   D4   D3   D2   D1   D0
  IS7  IS6  IS5  IS4  IS3  IS2  IS1  IS0

  IF IS BIT IS: 0 = NOT IN-SERVICE
                1 = REQUEST IS IN-SERVICE


VECTOR REGISTER (VR)

  D7   D6   D5   D4   D3   D2   D1   D0
  8-BIT VECTOR NUMBER

4.8 Register Operational Summary


For ease of reference, Table 4-4 gives a summary of
the different operating modes and commands with
their corresponding registers.


Table 4-4. Register Operational Summary

Operational Description               Command Words   Bits

Fully Nested Mode                     OCW-Default
Non-Specific EOI Command              OCW2            EOI
Specific EOI Command                  OCW2            SL, EOI, L0-L2
Automatic EOI Mode                    ICW1, ICW4      IC4, AEOI
Rotate On Non-Specific EOI Command    OCW2            EOI
Rotate On Automatic EOI Mode          OCW2            R, SL, EOI
Set Priority Command                  OCW2            L0-L2
Rotate On Specific EOI Command        OCW2            R, SL, EOI
Interrupt Mask Register               OCW1            M0-M7
Special Mask Mode                     OCW3            ESMM, SMM
Level Triggered Mode                  ICW1            LTIM
Edge Triggered Mode                   ICW1            LTIM
Read Register Command, IRR            OCW3            RR, RIS
Read Register Command, ISR            OCW3            RR, RIS
Read IMR                              IMR             M0-M7
Poll Command                          OCW3            P
Special Fully Nested Mode             ICW1, ICW4      IC4, SFNM


5.0 PROGRAMMABLE INTERVAL
TIMER


5.1 Functional Description


The 82370 contains four independently programma-
ble interval timers: Timer 0-3. All four timers are
functionally compatible with the Intel 82C54. The first
three timers (Timer 0-2) have specific functions.
The fourth timer, Timer 3, is a general purpose timer.
Table 5-1 depicts the functions of each timer. A brief
description of each timer’s function follows.


Table 5-1. Programmable
Interval Timer Functions

Timer   Output         Function

0                      Event Based IRQ8 Generator
1       TOUT1/REF#     Gen. Purpose/DRAM Refresh Req.
2       TOUT2/IRQ3#    Gen. Purpose/Speaker Out/IRQ3#
3                      Gen. Purpose/IRQ0 Generator


TIMER 0—Event Based Interrupt Request 8 
Generator 


Timer 0 is intended to be used as an Event Counter. 
The output of this timer will generate an Interrupt 
Request 8 (IRQ8) upon a rising edge of the timer 
output (TOUTO). Normally, this timer is used to im- 
plement a time-of-day clock or system tick. The Tim- 
er 0 output is not available as an external signal. 


TIMER 1—General Purpose/DRAM Refresh 
Request 


The output of Timer 1, TOUT1, can be used as a 
general purpose timer or as a DRAM Refresh Re- 
quest signal. The rising edge of this output creates a 
DRAM refresh request to the 82370 DRAM Refresh 
Controller. Upon reset, the Refresh Request function
is disabled, and the output pin is the Timer 1
output.


TIMER 2—General Purpose/Speaker Out/IRQ3 # 


The Timer 2 output, TOUT2#, can be used to support
tone generation to an external speaker. This pin
is a bidirectional signal. When used as an input, a
logic LOW asserted at this pin will generate an
Interrupt Request 3 (IRQ3#) (see Programmable
Interrupt Controller).

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Figure 5-1. Block Diagram of Programmable Interval Timer 


TIMER 3-General Purpose/Interrupt Request 0
Generator


The output of Timer 3 is fed to an edge detector and
generates an Interrupt Request 0 (IRQ0) in the
82370. The inverted output of this timer (TOUT3#)
is also available as an external signal for general
purpose use.


5.1.1 INTERNAL ARCHITECTURE 


The functional block diagram of the Programmable 
Interval Timer section is shown in Figure 5-1. Follow- 
ing is a description of each block. 


DATA BUFFER & READ/WRITE LOGIC


This part of the Programmable Interval Timer is used 
to interface the four timers to the 82370 internal bus. 
The Data Buffer is for transferring commands and 
data between the 8-bit internal bus and the timers. 


The Read/Write Logic accepts inputs from the inter- 
nal bus and generates signals to control other func- 
tional blocks within the timer section. 


CONTROL WORD REGISTERS I & II


The Control Word Registers are write-only registers.
They are used to control the operating modes of the
timers. Control Word Register I controls Timers 0, 1
and 2, and Control Word Register II controls Timer
3. A detailed description of the Control Word
Registers is given in the Register Set Overview
section.


COUNTER 0, COUNTER 1, COUNTER 2, 
COUNTER 3 


Counters 0, 1, 2, and 3 are the major parts of Timers
0, 1, 2, and 3, respectively. These four functional
blocks are identical in operation, so only a single
counter will be described. The internal block
diagram of one counter is shown in Figure 5-2.
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Figure 5-2. Internal Block Diagram of a Counter 


The four counters share a common clock input 
(CLKIN), but otherwise are fully independent. Each 
counter is programmable to operate in a different 
mode. 


Although the Control Word Register is shown in the
figure, it is not part of the counter itself. Its
programmed contents are used to control the
operations of the counters.


The Status Register, when latched, contains the cur- 
rent contents of the Control Word Register and 
status of the output and Null Count Flag (see Read 
Back Command). 


The Counting Element (CE) is the actual counter. It 
is a 16-bit presettable synchronous down counter. 


The Output Latches (OL) contain two 8-bit latches 
(OLM and OLL). Normally, these latches “follow” 
the content of the CE. OLM contains the most signif- 
icant byte of the counter and OLL contains the least 
significant byte. If the Counter Latch Command is 
sent to the counter, OL will latch the present count 
until read by the 80376 and then return to follow the 
CE. One latch at a time is enabled by the timer’s 
Control Logic to drive the internal bus. This is how 
the 16-bit Counter communicates over the 8-bit in- 
ternal bus. Note that CE cannot be read. Whenever 
the count is read, it is one of the OL’s that is being 
read. 


When a new count is written into the counter, the 
value will be stored in the Count Registers (CR), and 
transferred to CE. The transferring of the contents 
from CR’s to CE is defined as “loading” of the coun- 
ter. The Count Register contains two 8-bit registers: 
CRM (which contains the most significant byte) and 
CRL (which contains the least significant byte). Simi- 
lar to the OL’s, the Control Logic allows one register 
at a time to be loaded from the 8-bit internal bus. 
However, both bytes are transferred from the CR’s 
to the CE simultaneously. Both CR’s are cleared 
when the Counter is programmed. This way, if the 
Counter has been programmed for one byte count 
(either the most significant or the least significant 
byte only), the other byte will be zero. Note that CE 
cannot be written into directly. Whenever a count is 
written, it is the CR that is being written. 


As shown in the diagram, the Control Logic consists 
of three signals: CLKIN, GATE, and OUT. CLKIN 
and GATE will be discussed in detail in the section 
that follows. OUT is the internal output of the coun- 
ter. The external outputs of some timers (TOUT) are 
the inverted version of OUT (see TOUT1, TOUT2#, 
TOUT3#). The state of OUT depends on the mode
of operation of the timer.
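The relationship between the Count Registers (CR), the Counting Element (CE) and the Output Latches (OL) can be illustrated with a small software model. This is only a sketch under simplifying assumptions: the class `CounterModel` and its method names are hypothetical, binary counting is assumed, and mode-dependent behavior of OUT and GATE is not modeled.

```python
class CounterModel:
    """Toy model of one counter: Count Register, Counting Element, Output Latch."""

    def __init__(self):
        self.crl = 0          # Count Register, least significant byte
        self.crm = 0          # Count Register, most significant byte
        self.ce = 0           # Counting Element (16-bit down counter)
        self.latched = None   # None means OL is following CE

    def program(self):
        # Both CRs are cleared when the counter is programmed.
        self.crl = 0
        self.crm = 0

    def write_crl(self, value):
        self.crl = value & 0xFF

    def write_crm(self, value):
        self.crm = value & 0xFF

    def load(self):
        # Both bytes are transferred from CR to CE simultaneously.
        self.ce = (self.crm << 8) | self.crl

    def clk(self):
        # One decrement of the 16-bit down counter.
        self.ce = (self.ce - 1) & 0xFFFF

    def latch(self):
        # OL holds the present count until it is read.
        if self.latched is None:
            self.latched = self.ce

    def read(self):
        # Reads always come from OL (as OLL, OLM), never from CE directly.
        value = self.ce if self.latched is None else self.latched
        self.latched = None   # OL returns to following CE
        return value & 0xFF, value >> 8
```

In this model a count written as two bytes reaches CE only when `load()` is called, and a latched value survives further `clk()` calls until `read()` releases it.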
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5.2 Interface Signals 


5.2.1 CLKIN 


CLKIN is an input signal used by all four timers for
internal timing reference. This signal can be
independent of the 82370 system clock, CLK2. In the
following discussion, each "CLK Pulse" is defined
as the time period between a rising edge and a
falling edge, in that order, of CLKIN.


During the rising edge of CLKIN, the state of GATE 
is sampled. All new counts are loaded and counters 
are decremented on the falling edge of CLKIN. 


5.2.2 TOUT1, TOUT2#, TOUT3# 


TOUT1, TOUT2# and TOUT3# are the external 
output signals of Timer 1, Timer 2 and Timer 3, re- 
spectively. TOUT2# and TOUT3# are the inverted 
signals of their respective counter outputs, OUT. 
There is no external output for Timer 0. 


If Timer 2 is to be used as a tone generator for a
speaker, external buffering must be used to provide
sufficient drive capability.


The outputs of Timer 2 and 3 are dual function pins.
The output pin of Timer 2 (TOUT2#/IRQ3#), which
is a bidirectional open-collector signal, can also be
used as an interrupt request input. When the interrupt
function is enabled (through the Programmable
Interrupt Controller), a LOW on this input will generate
an Interrupt Request 3# to the 82370 Programmable
Interrupt Controller. This pin has a weak internal
pull-up resistor. To use the IRQ3# function, Timer 2
should be programmed so that OUT2 is LOW.
Additionally, OUT3 of Timer 3 is connected to an edge
detector which will generate an Interrupt Request 0
(IRQ0) to the 82370 after the rising edge of OUT3
(see Figure 5-1).


5.2.3 GATE 


GATE is not an externally controllable signal. Rather,
it can be software controlled with the Internal
Control Port. The state of GATE is always sampled 
on the rising edge of CLKIN. Depending on the 
mode of operation, GATE is used to enable/disable 
counting or trigger the start of an operation. 


For Timer 0 and 1, GATE is always enabled (HIGH).
For Timer 2 and 3, GATE is connected to Bit 0 and
6, respectively, of an Internal Control Port (at
address 61H) of the 82370. After a hardware reset, the
state of GATE of Timer 2 and 3 is disabled (LOW).
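Because the two GATE bits share the Internal Control Port with other bits, software would normally update them with a read-modify-write. The helper below is a sketch of the bit manipulation only; the name `set_gate` is hypothetical, and the actual port read and write at address 61H are left to the caller.

```python
TIMER2_GATE_BIT = 0   # bit 0 of the Internal Control Port (61H)
TIMER3_GATE_BIT = 6   # bit 6 of the Internal Control Port (61H)


def set_gate(port_value, timer, enable):
    """Return the new port 61H byte with the GATE bit of Timer 2 or 3 updated."""
    bits = {2: TIMER2_GATE_BIT, 3: TIMER3_GATE_BIT}
    if timer not in bits:
        raise ValueError("only Timer 2 and Timer 3 have a software-controlled GATE")
    mask = 1 << bits[timer]
    # All other bits of the port are preserved unchanged.
    return (port_value | mask) if enable else (port_value & ~mask & 0xFF)
```

For example, `set_gate(0x00, 3, True)` yields 40H, and `set_gate(0x41, 2, False)` yields 40H.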




5.3 Modes of Operation 


Each timer can be independently programmed to
operate in one of six different modes. Timers are
programmed by writing a Control Word into the Con- 
trol Word Register followed by an Initial Count (see 
Programming). 


The following terms are defined for use in describing
the different modes of operation.


CLK Pulse: A rising edge, then a falling edge, in
that order, of CLKIN.

Trigger: A rising edge of a timer's GATE input.

Timer/Counter Loading: The transfer of a count
from the Count Register (CR) to the Counting
Element (CE).


5.3.1 MODE 0-INTERRUPT ON TERMINAL 
COUNT 


Mode 0 is typically used for event counting. After the
Control Word is written, OUT is initially LOW, and will
remain LOW until the counter reaches zero. OUT
then goes HIGH and remains HIGH until a new
count or a new Mode 0 Control Word is written into
the counter.


In this mode, GATE=HIGH enables counting; 
GATE = LOW disables counting. However, GATE 
has no effect on OUT. 


After the Control Word and initial count are written to 
a timer, the initial count will be loaded on the next 
CLK pulse. This CLK pulse does not decrement the 
count, so for an initial count of N, OUT does not go 
HIGH until N+ 1 CLK pulses after the initial count is 
written. 


If a new count is written to the timer, it will be loaded
on the next CLK pulse and counting will continue
from the new count. If a two-byte count is written,
the following happens:


1. Writing the first byte disables counting, OUT is set
LOW immediately (i.e. no CLK pulse required).


2. Writing the second byte allows the new count to
be loaded on the next CLK pulse.


This allows the counting sequence to be synchronized
by software. Again, OUT does not go HIGH until
N+1 CLK pulses after the new count of N is written.
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| NOTES: 


The following conventions apply to all mode timing diagrams.

1. Counters are programmed for binary (not BCD) counting and for reading/writing least significant byte (LSB) only.

2. The counter is always selected (CS# always low).

3. CW stands for "Control Word"; CW = 10 means a control word of 10 Hex is written to the counter.

4. LSB stands for "least significant byte" of count.

5. Numbers below diagrams are count values. The lower number is the least significant byte. The upper number is the most significant byte. Since the counter is programmed to read/write LSB only, the most significant byte cannot be read.

N stands for an undefined count.
Vertical lines show transitions between count values.


Figure 5-3. Mode 0 


If an initial count is written while GATE is LOW, the 
counter will be loaded on the next CLK pulse. When 
GATE goes HIGH, OUT will go HIGH N CLK pulses 
later; no CLK pulse is needed to load the counter as 
this has already been done. 


5.3.2 MODE 1-GATE RETRIGGERABLE 
ONE-SHOT 


In this mode, OUT will be initially HIGH. OUT will go
LOW on the CLK pulse following a trigger to start the
one-shot operation. The OUT signal will then remain
LOW until the timer reaches zero. At this point, OUT
will go HIGH and stay HIGH until the next trigger
comes in. Because the GATE signals of Timer 0 and 1
are internally set to HIGH, these two timers never
receive a trigger (a rising edge of GATE), so the
one-shot cannot be started on them.


After writing the Control Word and initial count, the
timer is considered 'armed'. A trigger results in
loading the timer and setting OUT LOW on the next
CLK pulse. Therefore, an initial count of N will result
in a one-shot pulse width of N CLK cycles. Note
that this one-shot operation is retriggerable; i.e. OUT
will remain LOW for N CLK pulses after every trigger.
The one-shot operation can be repeated without
rewriting the same count into the timer.


Figure 5-4. Mode 1


If a new count is written to the timer during a
one-shot operation, the current one-shot pulse width will
not be affected until the timer is retriggered. This is
because loading of the new count to CE will occur
only when the one-shot is triggered.


5.3.3 MODE 2-RATE GENERATOR


This mode is a divide-by-N counter. It is typically
used to generate a Real Time Clock interrupt. OUT
will initially be HIGH. When the initial count has
decremented to 1, OUT goes LOW for one CLK pulse,
then OUT goes HIGH again. Then the timer reloads
the initial count and the process is repeated. In other
words, this mode is periodic since the same
sequence is repeated indefinitely. For an initial
count of N, the sequence repeats every N CLK
cycles.


Similar to Mode 0, GATE=HIGH enables counting,
whereas GATE=LOW disables counting. If GATE
goes LOW during an output pulse (LOW), OUT is set
HIGH immediately. A trigger (rising edge on GATE)
will reload the timer with the initial count on the next
CLK pulse. Then, OUT will go LOW (for one CLK
pulse) N CLK pulses after the new trigger. Thus,
GATE can be used to synchronize the timer.
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A GATE transition should not occur one clock prior to terminal count. 


Figure 5-5. Mode 2 


After writing a Control Word and initial count, the 
timer will be loaded on the next CLK pulse. OUT 
goes LOW (for one CLK pulse) N CLK pulses after 
the initial count is written. This is another way the 
timer may be synchronized by software. 


Writing a new count while counting does not affect
the current counting sequence because the new
count will not be loaded until the end of the current
counting cycle. If a trigger is received after writing a
new count but before the end of the current period,
the timer will be loaded with the new count on the
next CLK pulse after the trigger, and counting will
continue with the new count.
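Since Mode 2 divides CLKIN by N, the initial count for a desired output rate follows from a single division. The sketch below assumes binary counting; the function name `rate_generator_count` and the example frequencies are illustrative, and the lower bound of 2 is an assumption carried over from 82C54-compatible parts rather than a value stated in this section.

```python
def rate_generator_count(clkin_hz, rate_hz):
    """Return (N, value_to_write) for a Mode 2 output rate close to rate_hz."""
    if clkin_hz <= 0 or rate_hz <= 0:
        raise ValueError("frequencies must be positive")
    n = round(clkin_hz / rate_hz)
    # Assumed limits: 2 (see lead-in) up to 2**16, the largest binary count.
    if not 2 <= n <= 2**16:
        raise ValueError("requested rate is outside the usable count range")
    # A written count of 0 stands for 2**16 in binary counting.
    return n, n & 0xFFFF


# Example with a hypothetical 1 MHz CLKIN and a 1 kHz tick.
print(rate_generator_count(1_000_000, 1_000))
```

The actual output rate is CLKIN divided by the returned N, so any rounding error can be checked by the caller.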


5.3.4 MODE 3-SQUARE WAVE GENERATOR 


Mode 3 is typically used for Baud Rate generation.
Functionally, this mode is similar to Mode 2 except
for the duty cycle of OUT. In this mode, OUT will be
initially HIGH. When half of the initial count has
expired, OUT goes LOW for the remainder of the count.
The counting sequence will be repeated, thus this
mode is also periodic. Note that an initial count of N
results in a square wave with a period of N CLK
pulses.


The GATE input can be used to synchronize the timer.
GATE=HIGH enables counting; GATE=LOW
disables counting. If GATE goes LOW while OUT is
LOW, OUT is set HIGH immediately (i.e. no CLK
pulse is required). A trigger reloads the timer with the
initial count on the next CLK pulse.


After writing a Control Word and initial count, the 
timer will be loaded on the next CLK pulse. This al- 
lows the timer to be synchronized by software. 


Writing a new count while counting does not affect 
the current counting sequence. If a trigger is re- 
ceived after writing a new count but before the end 
of the current half-cycle of the square wave, the tim- 
er will be loaded with the new count on the next CLK 
pulse and counting will continue from the new count. 
Otherwise, the new count will be loaded at the end 
of the current half-cycle.


There is a slight difference in operation depending
on whether the initial count is EVEN or ODD. The
following description shows exactly how this
mode is implemented.


EVEN COUNTS:


OUT is initially HIGH. The initial count is loaded on
one CLK pulse and is decremented by two on
succeeding CLK pulses. When the count expires
(decremented to 2), OUT changes state (initially to
LOW) and the timer is reloaded with the initial count.
The above process is repeated indefinitely.


ODD COUNTS: 


OUT is initially HIGH. The initial count minus one 
(which is an even number) is loaded on one CLK 
pulse and is decremented by two on succeeding 
CLK pulses. One CLK pulse after the count expires 
(decremented to 2), OUT goes LOW and the timer is 
loaded with the initial count minus one again. Suc- 
ceeding CLK pulses decrement the count by two. 
When the count expires, OUT goes HIGH immedi- 
ately and the timer is reloaded with the initial count 
minus one. The above process is repeated
indefinitely. So for ODD counts, OUT will be HIGH for
(N+1)/2 counts and LOW for (N-1)/2 counts.
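The even and odd cases can be checked with a short simulation of the decrement-by-two scheme described above. This is an illustrative sketch, not a model of the silicon: the function name `mode3_levels` is hypothetical, GATE is assumed to stay HIGH, and counts below 2 are rejected because this simple loop does not model them.

```python
def mode3_levels(n, clocks):
    """Return the OUT level (1 = HIGH, 0 = LOW) for each CLK pulse in Mode 3."""
    if n < 2:
        raise ValueError("this sketch only models counts of 2 or more")
    odd = n % 2 == 1
    reload = n - 1 if odd else n     # odd counts load the initial count minus one
    count = reload
    out = 1                          # OUT is initially HIGH
    extra_high_clk = False
    levels = []
    for _ in range(clocks):
        levels.append(out)
        if extra_high_clk:
            # Odd count: OUT goes LOW one CLK pulse after the count expires.
            extra_high_clk = False
            out = 0
            count = reload
            continue
        count -= 2
        if count == 0:               # count has expired
            if odd and out == 1:
                extra_high_clk = True
            else:
                out ^= 1             # change state immediately and reload
                count = reload
    return levels


print(mode3_levels(4, 8))    # even count: equal HIGH and LOW phases
print(mode3_levels(5, 10))   # odd count: HIGH phase is one CLK longer
```

For N = 5 the simulated output is HIGH for three CLK pulses and LOW for two, matching (N+1)/2 and (N-1)/2.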
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NOTE: % 
A GATE transition should not occur one clock prior to terminal count. 


Figure 5-6. Mode 3 


5.3.5 MODE 4-INITIAL COUNT TRIGGERED
STROBE


This mode allows a strobe pulse to be generated by 
writing an initial count to the timer. Initially, OUT will 
be HIGH. When a new initial count is written into the 
timer, the counting sequence will begin. When the 
initial count expires (decremented to 1), OUT will go 
LOW for one CLK pulse and then go HIGH again. 


Again, GATE=HIGH enables counting while
GATE=LOW disables counting. GATE has no
effect on OUT.


After writing the Control Word and initial count, the
timer will be loaded on the next CLK pulse. This CLK
pulse does not decrement the count, so for an initial
count of N, OUT does not strobe LOW until N+1
CLK pulses after the initial count is written.


If a new count is written during counting, it will be
loaded on the next CLK pulse and counting will
continue from the new count.
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Figure 5-7. Mode 4 


If a two-byte count is written, the following will occur:


1. Writing the first byte has no effect on counting.


2. Writing the second byte allows the new count to
be loaded on the next CLK pulse.


OUT will strobe LOW N+1 CLK pulses after the 
new count of N is written. Therefore, when the 
strobe pulse will occur after a trigger depends on the 
value of the initial count loaded. 


5.3.6 MODE 5-GATE RETRIGGERABLE 
STROBE 


Mode 5 is very similar to Mode 4 except the count
sequence is triggered by the gate signal instead of
by writing an initial count. Initially, OUT will be HIGH.
Counting is triggered by a rising edge of GATE.
When the initial count has expired (decremented to
1), OUT will go LOW for one CLK pulse and then go
HIGH again.


After loading the Control Word and initial count, the 
Count Element will not be loaded until the CLK pulse 
after a trigger. This CLK pulse does not decrement 
the count. Therefore, for an initial count of N, OUT 
does not strobe LOW until N+ 1 CLK pulses after a 
trigger. 
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Figure 5-8. Mode 5 


The counting sequence is retriggerable. Every trig- 
ger will result in the timer being loaded with the initial 
count on the next CLK pulse. 


If the new count is written during counting, the cur- 
rent counting sequence will not be affected. If a trig- 
ger occurs after the new count is written but before 
the current count expires, the timer will be loaded 
with the new count on the next CLK pulse and a new 
count sequence will start from there.


5.3.7 OPERATION COMMON TO ALL MODES 


5.3.7.1 GATE 


The GATE input is always sampled on the rising
edge of CLKIN. In Modes 0, 2, 3 and 4, the GATE
input is level sensitive. The logic level is sampled on
the rising edge of CLKIN. In Modes 1, 2, 3 and 5, the
GATE input is rising edge sensitive. In these modes,
a rising edge of GATE (trigger) sets an edge sensitive
flip-flop in the timer. The flip-flop is reset
immediately after it is sampled. This way, a trigger will be
detected no matter when it occurs; i.e. a HIGH logic
level does not have to be maintained until the next
rising edge of CLKIN. Note that in Modes 2 and 3,
the GATE input is both edge and level sensitive.


Summary of Gate Operations


Mode  GATE LOW or Going LOW    GATE Rising             GATE HIGH

0     Disable count            No Effect               Enable count

1     No Effect                1. Initiate count       No Effect
                               2. Reset output
                                  after next clock

2     1. Disable count         Initiate count          Enable count
      2. Sets output HIGH
         immediately

3     1. Disable count         Initiate count          Enable count
      2. Sets output HIGH
         immediately

4     Disable count            No Effect               Enable count

5     No Effect                Initiate count          No Effect


5.3.7.2 Counter


New counts are loaded and counters are decremented
on the falling edge of CLKIN. The largest
possible initial count is 0. This is equivalent to 2**16
for binary counting and 10**4 for BCD counting.


Note that the counter does not stop when it reaches
zero. In Modes 0, 1, 4 and 5, the counter 'wraps
around' to the highest count: either FFFF Hex for
binary counting or 9999 for BCD counting, and
continues counting. Modes 2 and 3 are periodic. The
counter reloads itself with the initial count and
continues counting from there.


The minimum and maximum initial count in each
counter depends on the mode of operation. They
are summarized below.


5.4 Register Set Overview


The Programmable Interval Timer module of the
82370 contains a set of six registers. The port
address map of these registers is shown in Table 5-2.


Table 5-2. Timer Register Port Address Map 


Port Address  Description

40H           Counter 0 Register (read/write)
41H           Counter 1 Register (read/write)
42H           Counter 2 Register (read/write)
43H           Control Word Register I
              (Counter 0, 1 & 2) (write-only)
44H           Counter 3 Register (read/write)
45H           Reserved
46H           Reserved
47H           Control Word Register II
              (Counter 3) (write-only)


5.4.1 COUNTER 0, 1, 2, 3 REGISTERS 


These four 8-bit registers are functionally identical. 
They are used to write the initial count value into the 
respective timer. Also, they can be used to read the 
latched count value of a timer. Since they are 8-bit 
registers, reading and writing of the 16-bit initial 
count must follow the count format specified in the 
Control Word Registers; i.e. least significant byte 
only, most significant byte only, or least significant 
byte then most significant byte (see Programming). 
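Splitting an initial count into the two bytes written to a Counter Register can be sketched as follows. The function name `encode_initial_count` is hypothetical; the sketch assumes that a BCD count uses one decimal digit per four bits (four decades) and relies on the rule that a written value of 0 stands for the largest count.

```python
def encode_initial_count(n, bcd=False):
    """Return (lsb, msb) bytes to write for an initial count of n."""
    limit = 10**4 if bcd else 2**16
    if not 1 <= n <= limit:
        raise ValueError("count out of range")
    if n == limit:
        n = 0                          # 0 encodes the largest possible count
    if bcd:
        # Four decades, one decimal digit per nibble: 1234 -> 1234 Hex.
        value = int(f"{n:04d}", 16)
    else:
        value = n
    return value & 0xFF, value >> 8


print(encode_initial_count(1000))            # binary count
print(encode_initial_count(1234, bcd=True))  # BCD count
```

Which of the two bytes is actually written, and in what order, depends on the count format selected in the Control Word.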


5.4.2 CONTROL WORD REGISTER I & II


There are two Control Word Registers associated
with the Timer section. One of the two registers
(Control Word Register I) is used to control the
operations of Counters 0, 1 and 2 and the other (Control
Word Register II) is for Counter 3. The major
functions of both Control Word Registers are listed
below:
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— Select the timer to be programmed. 


— Define which mode the selected timer is to oper- 
ate in. 


— Define the count sequence; i.e. if the selected 
timer is to count as a Binary Counter or a Binary 
Coded Decimal (BCD) Counter. 


— Select the byte access sequence during timer 
read/write operations; i.e. least significant byte 
only, most significant only, or least significant 
byte first, then most significant byte. 


Also, the Control Word Registers can be pro- 
grammed to perform a Counter Latch Command or a 
Read Back Command which will be described later.


5.5 Programming 


5.5.1 INITIALIZATION 


Upon power-up or reset, the state of all timers is 
undefined. The mode, count value, and output of all 
timers are random. From this point on, how each 
timer operates is determined solely by how it is pro- 
grammed. Each timer must be programmed before it 
can be used. Since the outputs of some timers can 
generate interrupt signals to the 82370, all timers 
should be initialized to a known state. 


Counters are programmed by writing a Control Word 
into their respective Control Word Registers. Then, 
an Initial Count can be written into the corresponding
Count Register. In general, the programming
procedure is very flexible. Only two conventions need to
be remembered: 


1. For each timer, the Control Word must be written 
before the initial count is written. 


2. The 16-bit initial count must follow the count for- 
mat specified in the Control Word (least significant 
byte only, most significant byte only, or least signifi- 
cant byte first, followed by most significant byte). 


Since the two Control Word Registers and the four 
Counter Registers have separate addresses, and 
each timer can be individually selected by the appro- 
priate Control Word Register, no special instruction 
sequence is required. Any programming sequence 
that follows the conventions above is acceptable. 
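The two conventions can be sketched as a short write sequence. Everything here is illustrative: `outb(port, value)` is a hypothetical callback standing in for a port write, the Control Word B6H assumes the 82C54-compatible bit layout (Counter 2, LSB then MSB, Mode 3, binary), and port 42H for Counter 2 assumes the counters occupy consecutive addresses below Control Word Register I at 43H.

```python
def program_timer(outb, control_port, counter_port, control_word, count):
    """Write a Control Word, then a two-byte initial count (LSB first)."""
    outb(control_port, control_word)          # convention 1: Control Word first
    outb(counter_port, count & 0xFF)          # convention 2: LSB ...
    outb(counter_port, (count >> 8) & 0xFF)   # ... followed by MSB


# Example: record the writes instead of touching hardware.
writes = []
program_timer(lambda port, value: writes.append((port, value)),
              control_port=0x43, counter_port=0x42,
              control_word=0xB6, count=1000)
print([(hex(p), hex(v)) for p, v in writes])
```

The recorded sequence is one write to 43H followed by two writes to 42H (E8H, then 03H), which is the order the conventions require for a two-byte count.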


A new initial count may be written to a timer at any 
time without affecting the timer’s programmed mode 
in any way. Count sequence will be affected as de- 
scribed in the Modes of Operation section. Note that 
the new count must follow the programmed count
format.

If a timer was previously programmed to read/write
two-byte counts, the following precaution applies. A
program must not transfer control between writing
the first and second byte to another routine which
also writes into the same timer. Otherwise, the
read/write will result in an incorrect count.


Whenever a Control Word is written to a timer, all
control logic for that timer is immediately reset
(i.e. no CLK pulse is required). Also, the corresponding
output pin, TOUT#, goes to a known initial state.


5.5.2 READ OPERATION 


Three methods are available to read the current
count as well as the status of each timer. They are:
Read Counter Registers, Counter Latch Command
and Read Back Command. Below is a description of
these methods.


READ COUNTER REGISTERS 


The current count of a timer can be read by performing
a read operation on the corresponding Counter
Register. The only restriction of this read operation
is that the CLKIN of the timers must be inhibited by
using external logic. Otherwise, the count may be in
the process of changing when it is read, giving an
undefined result. Note that since all four timers
share the same CLKIN signal, inhibiting CLKIN to
read a timer will unavoidably disable the other timers
also. This may prove to be impractical. Therefore, it
is suggested that either the Counter Latch Command
or the Read Back Command be used to
read the current count of a timer.


Another alternative is to temporarily disable a timer
before reading its Counter Register by using the
GATE input. Depending on the mode of operation,
GATE=LOW will disable the counting operation.
However, this option is available on Timer 2 and 3
only, since the GATE signals of the other two timers
are internally enabled all the time.


COUNTER LATCH COMMAND


A Counter Latch Command will be executed whenever
a special Control Word is written into a Control
Word Register. Two bits written into the Control
Word Register distinguish this command from a 'regular'
Control Word (see Register Bit Definition). Also,
two other bits in the Control Word will select which
counter is to be latched.


Upon execution of this command, the selected
counter's Output Latch (OL) latches the count at the
time the Counter Latch Command is received. This
count is held in the latch until it is read by the 80376,
or until the timer is reprogrammed. The count is then
unlatched automatically and the OL returns to
"following" the Counting Element (CE). This allows
reading the contents of the counters "on the fly"
without affecting counting in progress. Multiple
Counter Latch Commands may be used to latch
more than one counter. Each latched count is held
until it is read. Counter Latch Commands do not
affect the programmed mode of the timer in any way.


If a counter is latched, and at some time later, it is
latched again before the prior latched count is read, 
the second Counter Latch Command is ignored. The 
count read will then be the count at the time the first 
command was issued. 


In any event, the latched count must be read ac- 
cording to the programmed format. Specifically, if 
the timer is programmed for two-byte counts, two 
bytes must be read. However, the two bytes do not
have to be read one right after the other. Read/write or
programming operations of other timers may be
performed between them.


Another feature of this Counter Latch Command is 
that read and write operations of the same timer 
may be interleaved. For example, if the timer is pro- 
grammed for two-byte counts, the following se- 
quence is valid. 


1. Read least significant byte. 
2. Write new least significant byte. 
3. Read most significant byte. 
4. Write new most significant byte. 


If a timer is programmed to read/write two-byte
counts, the following precaution applies. A program
must not transfer control between reading the first
and second byte to another routine which also reads
from that same timer. Otherwise, an incorrect count
will be read.
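A latch-and-read sequence for a timer programmed for two-byte counts might look like the sketch below. The callbacks `outb(port, value)` and `inb(port)` are hypothetical stand-ins for port accesses, the command byte assumes the 82C54-compatible layout (select bits in D7-D6, read/write bits 00 in D5-D4, remaining bits written as 0), and port 41H for Counter 1 is an assumed address.

```python
def read_latched_count(outb, inb, control_port, counter_port, select):
    """Latch one counter with a Counter Latch Command, then read LSB and MSB."""
    # Select bits in D7-D6; read/write bits 00 mark a Counter Latch Command.
    outb(control_port, (select & 0x3) << 6)
    lsb = inb(counter_port)   # first read returns the least significant byte
    msb = inb(counter_port)   # second read returns the most significant byte
    return (msb << 8) | lsb


# Example with fake port callbacks that replay a latched count of 1234 Hex.
commands = []
latched_bytes = iter([0x34, 0x12])
value = read_latched_count(lambda port, v: commands.append((port, v)),
                           lambda port: next(latched_bytes),
                           control_port=0x43, counter_port=0x41, select=1)
print(hex(value))
```

Both reads must complete before another routine reads the same timer, in line with the precaution above.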


READ BACK COMMAND 


The Read Back Command is another special Command
Word operation which allows the user to read
the current count value and/or the status of the
selected timer(s). Like the Counter Latch Command,
two bits in the Command Word identify this as a
Read Back Command (see Register Bit Definition).


The Read Back Command may be used to latch
multiple counter Output Latches (OL's) by selecting
more than one timer within a Command Word. This
single command is functionally equivalent to several
Counter Latch Commands, one for each counter to
be latched. Each counter's latched count will be
held until it is read by the 80376 or until the timer is
reprogrammed. The counter is automatically
unlatched when read, but other counters remain
latched until they are read. If multiple Read Back
commands are issued to the same timer without
reading the count, all but the first are ignored; i.e. the
count read will correspond to the very first Read
Back Command issued.


As mentioned previously, the Read Back Command 
may also be used to latch status information of the 
selected timer(s). When this function is enabled, the 
status of a timer can be read from the Counter Reg- 
ister after the Read Back Command is issued. The 
status information of a timer includes the following: 


1. Mode of timer:


This allows the user to check the mode of operation
of the timer last programmed.


2. State of TOUT pin of the timer:


This allows the user to monitor the counter's output
pin via software, possibly eliminating some
hardware from a system.


3. Null Count/Count available: 


The Null Count Bit in the status byte indicates if 
the last count written to the Count Register (CR) 
has been loaded into the Counting Element (CE). 
The exact time this happens depends on the 
mode of the timer and is described in the Pro- 
gramming section. Until the count is loaded into 
the Counting Element (CE), it cannot be read from 
the timer. If the count is latched or read before 
this occurs, the count value will not reflect the 
new count just written. 


If multiple status latch operations of the timer(s) are
performed without reading the status, all but the first 
command are ignored; i.e. the status read in will cor- 
respond to the first Read Back Command issued. 


Both the current count and status of the selected 
timer(s) may be latched simultaneously by enabling 
both functions in a single Read Back Command. 
This is functionally the same as issuing two separate 
Read Back Commands at once. Once again, if multi- 
ple read commands are issued to latch both the 
count and status of a timer, all but the first command 
will be ignored.


If both count and status of a timer are latched, the
first read operation of that timer will return the
latched status, regardless of which was latched first.
The next one or two (if two count bytes are to be
read) read operations return the latched count. Note
that subsequent read operations on the Counter
Register will return the unlatched count (like the first
read method discussed).
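The Read Back Command described above can be sketched in Python. This is only an illustration: the bit positions are an assumption based on the 8254-style layout (D7-D6 = 11, D5 = latch count#, D4 = latch status#, D3-D1 = counter 2, 1, 0 select, D0 = 0), and `read_back_command` is a hypothetical helper name, not part of any Intel software.

```python
def read_back_command(counters, latch_count=True, latch_status=False):
    """Build a Read Back Command byte for Control Word Register I.

    Assumed layout (8254 style): D7-D6 = 11, D5 = 0 to latch count,
    D4 = 0 to latch status, D3/D2/D1 select counters 2/1/0, D0 = 0.
    """
    cmd = 0b11000000
    if not latch_count:
        cmd |= 1 << 5          # 1 = do not latch count
    if not latch_status:
        cmd |= 1 << 4          # 1 = do not latch status
    for c in counters:
        if c not in (0, 1, 2):
            raise ValueError("Control Word Register I covers counters 0-2")
        cmd |= 1 << (c + 1)    # counter n is selected by bit n+1
    return cmd
```

With both functions enabled, the first read of the selected counter returns the status byte and the following one or two reads return the latched count, as the text states.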




5.6 Register Bit Definitions 


COUNTER 0, 1, 2, 3 REGISTER (READ/WRITE)

Counter 0 Register (read/write)
Counter 1 Register (read/write)
Counter 2 Register (read/write)
Counter 3 Register (read/write)
Reserved
Reserved

Control Word Register I

D7   D6   D5   D4   D3  D2  D1  D0
SC1  SC0  RW1  RW0  M2  M1  M0  BCD


SELECT COUNTER: 


00 SELECT COUNTER 0 
01 SELECT COUNTER 1 
10 SELECT COUNTER 2 
11 READ BACK COMMAND 
FOR COUNTER 0-2 


0 = 16-BIT BINARY COUNTER
1 = BCD COUNTER (4 DECADES)


MODE: 
000 MODE 0 
001 MODE 1 
X10 MODE 2 
X11 MODE 3 
100 MODE 4 
101 MODE 5 
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READ/WRITE: 
00 COUNTER LATCH COMMAND 
01 READ/WRITE LSB BYTE ONLY 
10 READ/WRITE MSB BYTE ONLY 
11 READ/WRITE LSB, THEN MSB BYTE 
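
The field encodings listed above can be combined into a control word as follows. This is a minimal sketch, assuming the 8254-style bit positions (SELECT COUNTER in D7-D6, READ/WRITE in D5-D4, MODE in D3-D1, BCD in D0); `control_word` is a hypothetical helper name.

```python
def control_word(counter, rw, mode, bcd=False):
    """Pack a control word for Control Word Register I (counters 0-2).

    Assumed positions: counter select D7-D6, read/write D5-D4,
    mode D3-D1, BCD flag D0.
    """
    if counter not in (0, 1, 2):
        raise ValueError("counter must be 0, 1 or 2 (11 is the Read Back Command)")
    if rw not in (1, 2, 3):
        raise ValueError("rw: 1 = LSB only, 2 = MSB only, 3 = LSB then MSB")
    if not 0 <= mode <= 5:
        raise ValueError("mode must be 0-5")
    return (counter << 6) | (rw << 4) | (mode << 1) | int(bool(bcd))
```

For example, counter 1 with LSB-then-MSB access in Mode 2 and binary counting packs to 74H under these assumptions.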
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Note that these 8-bit registers are for writing and 
reading of one byte of the 16-bit count value, either 
the most significant or the least significant byte. 


CONTROL WORD REGISTER I & II (WRITE-ONLY)

Port Address   Description
43H            Control Word Register I (Counter 0, 1, 2) (write-only)
47H            Control Word Register II (Counter 3) (write-only)


LSB OF COUNT BYTE 


MSB OF COUNT BYTE 
290164-75 


Control Word Register II

0 = 16-BIT BINARY COUNTER
1 = BCD COUNTER (4 DECADES)


SELECT COUNTER: 


00 SELECT COUNTER 3 

01 RESERVED 

10 RESERVED 

11 READ BACK COMMAND
FOR COUNTER 3 


MODE: 
000 MODE 0 
001 MODE 1 
X10 MODE 2 
X11 MODE 3 
100 MODE 4 
101 MODE 5 


290164-77 


READ/WRITE:
00 COUNTER LATCH COMMAND 
01 READ/WRITE LSB BYTE ONLY 
10 READ/WRITE MSB BYTE ONLY 
11 READ/WRITE LSB, THEN MSB BYTE 
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COUNTER LATCH COMMAND FORMAT 


(Write to Control Word Register)


00 COUNTER 0 (OR 3) 
01 COUNTER 1 
10 COUNTER 2 


11 READ BACK COMMAND
290164-78


READ BACK COMMAND FORMAT 


(Write to Control Word Register) 


0 = LATCH COUNT
1 = DO NOT LATCH COUNT

0 = LATCH STATUS
1 = DO NOT LATCH STATUS

0 = COUNTER NOT SELECTED
1 = COUNTER IS SELECTED

290164-79


STATUS FORMAT 


(Returned from Read Back Command)


OUTPUT:      0 = OUTPUT PIN = 0
             1 = OUTPUT PIN = 1
NULL COUNT:  0 = COUNT AVAILABLE FOR READING
             1 = NULL COUNT
COUNTER MODE

290164-80
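
A status byte returned after a Read Back Command could be decoded along these lines. The bit positions are an assumption taken from the 8254-style status format (D7 = output pin, D6 = null count, D5-D4 = read/write, D3-D1 = mode, D0 = BCD); `decode_status` is a hypothetical name.

```python
def decode_status(status):
    """Split a latched status byte into its fields (assumed 8254 layout)."""
    mode_bits = (status >> 1) & 0b111
    # Modes 2 and 3 are written as X10 and X11, so D3 is a don't-care there.
    mode = mode_bits - 4 if mode_bits >= 6 else mode_bits
    return {
        "output_pin": (status >> 7) & 1,
        "null_count": bool((status >> 6) & 1),  # True: new count not yet in the CE
        "read_write": (status >> 4) & 0b11,
        "mode": mode,
        "bcd": bool(status & 1),
    }
```

A set Null Count bit means the count just written has not reached the Counting Element, so a count read at that point would not reflect the new value.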
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6.0 WAIT STATE GENERATOR 


6.1 Functional Description 


The 82370 contains a programmable Wait State 
Generator which can generate a pre-programmed 
number of wait states during both CPU and DMA 
initiated bus cycles. This Wait State Generator is capable
of generating 1 to 16 wait states in non-pipelined
mode, and 0 to 15 wait states in pipelined
mode. Depending on the bus cycle type and the two 
Wait State Control inputs (WSC 0-1), a pre-pro- 
grammed number of wait states in the selected Wait 
State Register will be generated. 


The Wait State Generator can also be disabled to 
allow the use of devices capable of generating their 
own READY # signals. Figure 6-1 is a block diagram 
of the Wait State Generator. 


6.2 Interface Signals 


The following describes the interface signals which
affect the operation of the Wait State Generator.
The READY#, WSC0 and WSC1 signals are inputs.
READYO# is the ready output signal to the host
processor.


6.2.1 READY#


READY# is an active LOW input signal which indicates
to the 82370 the completion of a bus cycle. In
the Master mode (e.g. 82370 initiated DMA transfer),
this signal is monitored to determine whether a peripheral
or memory needs wait states inserted in the
current bus cycle. In the Slave mode, it is used (together
with the ADS# signal) to trace CPU bus cycles
to determine if the current cycle is pipelined.


6.2.2 READYO#


READYO# (Ready Out#) is an active LOW output 
signal and is the output of the Wait State Generator. 
The number of wait states generated depends on 
the WSC(0-1) inputs. Note that special cases are 
handled for access to the 82370 internal registers 
and for the Refresh cycles. For 82370 internal register
access, READYO# will be delayed to take into
account the command recovery time of the register. One or
more wait states will be generated in a pipelined cy- 
cle. During refresh, the number of wait states will be 
determined by the preprogrammed value in the Re- 
fresh Wait State Register. 


In the simplest configuration, READYO# can be
connected to the READY# input of the 82370 and
the 80376 CPU. This is, however, not always the
case. If external circuitry is to control the READY#
inputs as well, additional logic will be required (see
Application Issues).


6.2.3 WSC(0-1) 


These two Wait State Control inputs, together with
the M/IO# input, select one of the three pre-programmed
8-bit Wait State Registers which determines
the number of wait states to be generated.
The most significant half of the three Wait State
Registers corresponds to memory accesses, the
least significant half to I/O accesses. The combination
WSC(0-1) = 11 disables the Wait State Generator.


[Block diagram labels: READYO#; Programmable Wait State Registers, D7-D4 / D3-D0: MEMORY 0 / I/O 0, MEMORY 2 / I/O 2, (RESERVED) / REFRESH]

Figure 6-1. Wait State Generator Block Diagram

290164-81
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6.3 Bus Function 


6.3.1 WAIT STATES IN NON-PIPELINED CYCLE 


The timing diagram of two typical non-pipelined cy- 
cles with 82370 generated wait states is shown in 
Figure 6-2. In this diagram, it is assumed that the 
internal registers of the 82370 are not addressed. 
During the first T2 state of each bus cycle, the Wait
State Control and the M/IO# inputs are sampled to
determine which Wait State Register (if any) is selected.
If the WSC inputs are active (i.e. not both are
driven HIGH), the pre-programmed number of wait
states corresponding to the selected Wait State
Register will be requested. This is done by driving
the READYO# output HIGH during the end of each
T2 state.


The WSC(0-1) inputs need only be valid during the
very first T2 state of each non-pipelined cycle. As a
general rule, the WSC inputs are sampled on the
rising edge of the next clock (82384 CLK) after the
last state when ADS# (Address Status) is asserted.


The number of wait states generated depends on 
the type of bus cycle, and the number of wait states 
requested. The various combinations are discussed 
below.


1. Access the 82370 internal registers: 2 to 5 wait 
states, depending upon the specific register ad- 
dressed. Some back-to-back sequences to the Inter- 
rupt Controller will require 7 wait states. 


[Timing diagram signals: A(1-23), M/IO#, BLE#, BHE#, WSC(0-1), ADS#, READY#, READYO#; ONE WAIT STATE]

Figure 6-2. Wait States in Non-Pipelined Cycles
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2. Interrupt Acknowledge to the 82370: 5 wait states. 


3. Refresh: As programmed in the Refresh Wait
State Register (see Register Set Overview). Note
that if WSC(0-1) = 11, READYO# will stay inactive.


4. Other bus cycles: The WSC(0-1) and M/IO# inputs
select a Wait State Register, and the number of
wait states generated equals the pre-programmed
wait state count in that register plus 1. The Wait
State Register selection is defined as follows
(Table 6-1).


Table 6-1. Wait State Register Selection

WSC(0-1)    Register Selected

WAIT REG 0 (I/O half)
WAIT REG 1 (I/O half)
WAIT REG 2 (I/O half)
WAIT REG 0 (MEM half)
WAIT REG 1 (MEM half)
WAIT REG 2 (MEM half)
Wait State Gen. Disabled


The Wait State Control signals, WSC(0-1), can be
generated with the address decode and the Read/
Write control signals as shown in Figure 6-3.


[Block diagram: Address Decode and W/R# feed a Logic block that produces WSC(0-1).]

290164-83

Figure 6-3. WSC(0-1) Generation


Note that during HALT and SHUTDOWN, the num- 
ber of wait states will depend on the WSC (0-1) 
inputs, which will select the memory half of one of 
the Wait State Registers (see CPU Reset and Shut- 
down Detect). 


6.3.2 WAIT STATES IN PIPELINED CYCLES 


The timing diagram of two typical pipelined cycles
with 82370 generated wait states is shown in Figure
6-4. Again, in this diagram, it is assumed that the
82370 internal registers are not addressed. As defined
in the timing of the 80376 processor, the Address
(A1-23), Byte Enable (BHE#, BLE#), and
other control signals (M/IO#, ADS#) are asserted
one T-state earlier than in a non-pipelined cycle; i.e.
they are asserted at T2P. Similar to the non-pipelined
case, the Wait State Control (WSC) inputs are
sampled in the middle of the state after the last state
the ADS# signal is asserted. Therefore, the WSC
inputs should be asserted during the T1P state of
each pipelined cycle (which is one T-state earlier
than in the non-pipelined cycle).


[Timing diagram signals: CLK2, CLK, A(1-23), M/IO#, BLE#, BHE#, WSC(0-1), ADS#, READY#, READYO#; ONE WAIT STATE]

Figure 6-4. Wait States in Pipelined Cycles
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The number of wait states generated in a pipelined 
cycle is selected in a similar manner as in the non- 
pipelined case discussed in the previous section. 
The only difference here is that the actual number of 
wait states generated will be one less than that of 
the non-pipelined cycle. This is done automatically 
by the Wait State Generator. 


6.3.3 EXTENDING AND EARLY TERMINATING 
BUS CYCLE 


The 82370 allows external logic to either add wait 
states or cause early termination of a bus cycle by 
controlling the READY # input to the 82370 and the 
host processor. A possible configuration is shown in 
Figure 6-5. 


[Logic diagram inputs: EXTERNAL READY# (early termination) and EXTERNAL NOT READY (cycle extension)]

290164-85

Figure 6-5. External ‘READY’ Control Logic
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The EXT. RDY # (External Ready) signal of Figure 6- 
5 allows external devices to cause early termination 
of a bus cycle. When this signal is asserted LOW, 
the output of the circuit will also go LOW (even
though the READYO# of the 82370 may still be
HIGH). This output is fed to the READY# input of
the 80376 and the 82370 to indicate the completion 
of the current bus cycle. 


Similarly, the EXT. NOT READY (External Not
Ready) signal is used to delay the READY# input of
the processor and the 82370. As long as this signal
is driven HIGH, the output of the circuit will drive the
READY# input HIGH. This will effectively extend the
duration of a bus cycle. However, it is important to
note that if the two-level logic is not fast enough to
satisfy the READY# setup time, the OR gate should
be eliminated. Instead, the 82370 Wait State Generator
can be disabled by driving both WSC(0-1)
HIGH. In this case, the addressed memory or I/O
device should activate the external READY# input
whenever it is ready to terminate the current bus
cycle.


Figures 6-6 and 6-7 show the timing relationships of
the ready signals for the early termination and extension
of the bus cycles. Section 6.7, Application Issues,
contains a detailed timing analysis of the external
circuit.


290164-86

Figure 6-6. Early Termination of Bus Cycle by ‘READY#’
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Figure 6-7. Extending Bus Cycle by ‘READY #’ 
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Due to the following implications, it should be noted 
that early termination of bus cycles in which 82370 
internal registers are accessed is not recommended. 


1. Erroneous data may be read from or written into 
the addressed register. 


2. The 82370 must be allowed to recover either be- 
fore HLDA (Hold Acknowledge) is asserted or before 
another bus cycle into an 82370 internal register is 
initiated.


The recovery time, in clock periods, equals the re- 
maining wait states that were avoided plus 4. 


6.4 Register Set Overview 


Altogether, there are four 8-bit internal registers
associated with the Wait State Generator. The port
address map of these registers is shown below in
Table 6-2. A detailed description of each follows.

Table 6-2. Register Address Map

Port Address   Description
72H            Wait State Reg 0 (read/write)
73H            Wait State Reg 1 (read/write)
74H            Wait State Reg 2 (read/write)
75H            Ref. Wait State Reg (read/write)


WAIT STATE REGISTER 0, 1, 2 


These three 8-bit read/write registers are functionally
identical. They are used to store the pre-programmed
wait state count. One half of each register
contains the wait state count for I/O accesses while
the other half contains the count for memory accesses.
The total number of wait states generated
will depend on the type of bus cycle. For a non-pipelined
cycle, the actual number of wait states requested
is equal to the wait state count plus 1. For a
pipelined cycle, the number of wait states will be
equal to the wait state count in the selected register.
Therefore, the Wait State Generator is capable of
generating 1 to 16 wait states in non-pipelined
mode, and 0 to 15 wait states in pipelined mode.


Note that the minimum wait state count in each reg- 
ister is 0. This is equivalent to 0 wait states for a 
pipelined cycle and 1 wait state for a non-pipelined 
cycle. 
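
The relationship between a Wait State Register value and the number of wait states generated can be written out as a short Python sketch. It uses only the rules stated above (memory count in the most significant half, I/O count in the least significant half, plus 1 for non-pipelined cycles); `wait_states` is a hypothetical helper name.

```python
def wait_states(reg_value, memory_access, pipelined):
    """Wait states generated for one Wait State Register value.

    reg_value: the 8-bit register content (0x00-0xFF).
    memory_access: True selects the upper nibble, False the lower (I/O) nibble.
    pipelined: pipelined cycles use the count as is, others add 1.
    """
    if not 0 <= reg_value <= 0xFF:
        raise ValueError("Wait State Registers are 8 bits wide")
    count = (reg_value >> 4) & 0xF if memory_access else reg_value & 0xF
    return count if pipelined else count + 1
```

With the reset value FFH, a non-pipelined memory cycle therefore gets the maximum of 16 wait states.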


REFRESH WAIT STATE REGISTER

Similar to the Wait State Registers discussed above,
this 4-bit register is used to store the number of wait
states to be generated during a DRAM refresh cycle.
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Note that the Refresh Wait State Register is not se- 
lected by the WSC inputs. It will automatically be 
chosen whenever a DRAM refresh cycle occurs. If 
the Wait State Generator is disabled during the re- 
fresh cycle (WSC (0-1) = 11), READYO# will stay 
inactive and the Refresh Wait State Register is ig- 
nored. 


6.5 Programming 


Using the Wait State Generator is relatively straight- 
forward. No special programming sequence is re- 
quired. In order to ensure the expected number of 
wait states will be generated when a register is se- 
lected, the registers to be used must be pro- 
grammed after power-up by writing the appropriate 
wait state count into each register. Note that upon 
hardware reset, all Wait State Registers are initial- 
ized with the value FFH, giving the maximum num- 
ber of wait states possible. Also, each register can 
be read to check the wait state count previously 
stored in the register. 


6.6 Register Bit Definition 


WAIT STATE REGISTER 0, 1, 2 


Port Address   Description
72H            Wait State Register 0 (read/write)
73H            Wait State Register 1 (read/write)
74H            Wait State Register 2 (read/write)

D7-D4: MEMORY WAIT STATE COUNT
D3-D0: I/O WAIT STATE COUNT
290164-88


REFRESH WAIT STATE REGISTER

Port Address: 75H (Read/Write)

REFRESH WAIT STATE COUNT (4 bits)
Remaining bits: MUST BE ZERO
290164-89
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6.7 Application Issues 


6.7.1 EXTERNAL ‘READY’ CONTROL LOGIC 


As mentioned in section 6.3.3, wait state cycles generated
by the 82370 can be terminated early or extended
longer by means of additional external logic
(see Figure 6-5). In order to ensure that the
READY# input timing requirement of the 80376 and
the 82370 is satisfied, special care must be taken
when designing this external control logic. This section
addresses the design requirements.


A simplified block diagram of the external logic along
with the READY# timing diagram is shown in Figure
6-8. The purpose is to determine the maximum delay
time allowed in the external control logic in order to
satisfy the READY# setup time.

[Figure 6-8 annotations (80376-16; inputs EXT. READY# and EXT. NOT READY):
PHI1 + PHI2 = 62.5 ns
Maximum READYO# Valid Delay = 35 ns
READY# Setup Time = 20 ns]


First, it will be assumed that the 80376 is running at
16 MHz (i.e. CLK2 is 32 MHz). Therefore, one bus
state (two CLK2 periods) will be equivalent to
62.5 ns. According to the AC specifications of the
82370, the maximum delay time for valid READYO#
signal is 31 ns after the rising edge of CLK2 in the
beginning of T2 (for non-pipelined cycle) or T2P (for
pipelined cycle). Also, the minimum READY# setup
time of the 80376 and the 82370 should be 19 ns
before the rising edge of CLK2 at the beginning of
the next bus state. This limits the total delay time for
the external READY# control logic to be 12.5 ns
(62.5 - 31 - 19) in order to meet the READY# setup
timing requirement.
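
The delay budget above is a simple subtraction, sketched here in Python so it can be re-run for other clock rates or AC specifications. The function name is hypothetical; the numbers in the usage note are the ones quoted in the text and in the Figure 6-8 annotations.

```python
def ready_logic_budget_ns(clk2_mhz, readyo_valid_delay_ns, ready_setup_ns):
    """Maximum delay allowed in the external READY# control logic.

    One bus state is two CLK2 periods; the READYO# valid delay and the
    READY# setup time are subtracted from it.
    """
    bus_state_ns = 2 * 1000.0 / clk2_mhz
    return bus_state_ns - readyo_valid_delay_ns - ready_setup_ns
```

With CLK2 at 32 MHz, a 31 ns valid delay and a 19 ns setup time give 12.5 ns, while the 35 ns and 20 ns values annotated in Figure 6-8 give 7.5 ns.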


Maximum Ready Control Logic Delay = A - B - C = 7.5 ns

290164-90

Figure 6-8. ‘READY’ Timing Consideration
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7.0 DRAM REFRESH CONTROLLER 


7.1 Functional Description 


The 82370 DRAM Refresh Controller consists of a 
24-bit Refresh Address Counter and Refresh Re- 
quest logic for DRAM refresh operations (see Figure 
7-1). TIMER 1 can be used as a trigger signal to the 
DRAM Refresh Request logic. The Refresh Bus Size 
can be programmed to be 8- or 16-bit wide. Depend- 
ing on the Refresh Bus Size, the Refresh Address 
Counter will be incremented with the appropriate val- 
ue after every refresh cycle. The internal logic of the 
82370 will give the Refresh operation the highest 
priority in the bus control arbitration process. Bus 
control is not released and re-requested if the 82370 
is already a bus master. 


7.2 Interface Signals 


7.2.1 TOUT1/REF#


The dual function output pin of TIMER 1
(TOUT1/REF#) can be programmed to generate the
DRAM Refresh signal. If this feature is enabled, the
rising edge of the TIMER 1 output (TOUT1) will trigger
the DRAM Refresh Request logic. After some delay
for gaining access to the bus, the 82370 DRAM Refresh
Controller will generate a DRAM Refresh signal by driving
the REF# output LOW. This signal is cleared after
the refresh cycle has taken place, or by a hardware
reset.


[Block diagram labels: TOUT1 (internal), edge detector, DRAM Refresh Controller, 24-bit address counter, 2-to-1 MUX]
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If the DRAM Refresh feature is disabled, the 
TOUT1/REF# output pin is simply the TIMER 1 output.
Detailed information on how TIMER 1 operates
is given in section 5, Programmable Interval
Timer, and will not be repeated here.


7.3 Bus Function 


7.3.1 ARBITRATION 


In order to ensure data integrity of the DRAMs, the 
82370 gives the DRAM Refresh signal the highest 
priority in the arbitration logic. It allows DRAM Re- 
fresh to interrupt DMA in progress in order to per- 
form the DRAM Refresh cycle. The DMA service will 
be resumed after the refresh is done. 


In case of a DRAM Refresh during a DMA process, 
the cascaded device will be requested to get off the 
bus. This is done by de-asserting the EDACK signal. 
Once DREQn goes inactive, the 82370 will perform 
the refresh operation. Note that the DMA controller 
does not completely relinquish the system bus dur- 
ing refresh. The Refresh Generator simply “steals” 
a bus cycle between DMA accesses. 


Figure 7-2 shows the timing diagram of a Refresh
Cycle. Upon expiration of TIMER 1, the 82370 will try
to take control of the system bus by asserting
HOLD. As soon as the 82370 sees HLDA go active,
the DRAM Refresh Cycle will be carried out by activating
the REF# signal as well as the address and
control signals on the system bus (note that REF#
will not be active until two CLK periods after HLDA is
asserted). The address bus will contain the 24-bit address
currently in the Refresh Address Counter. The
control signals are driven the same way as in a
Memory Read cycle. This “read” operation is complete
when the READY# signal is driven LOW.
Then, the 82370 will relinquish the bus by de-asserting
HOLD. Typically, a Refresh Cycle without wait
states will take five bus states to execute. If ‘n’ wait
states are added, the Refresh Cycle will last for five
plus ‘n’ bus states.

[Block diagram labels: internal DMA handshake, DMA controller arbitration logic, 24-bit refresh address, TOUT1/REF#, REFRESH ENABLE (internal)]

Figure 7-1. DRAM Refresh Controller


How often the Refresh Generator will initiate a refresh
cycle depends on the frequency of CLKIN as
well as TIMER 1’s programmed mode of operation.
For this specific application, TIMER 1 should be programmed
to operate in Mode 2 to generate a constant
clock rate. See section 5, Programmable Interval
Timer, for more information on programming
the timer. One DRAM Refresh Cycle will be generated
each time TIMER 1 expires (when TOUT1 changes
from LOW to HIGH).


The Wait State Generator can be used to insert wait
states during a refresh cycle. The 82370 will automatically
insert the desired number of wait states as
programmed in the Refresh Wait State Register (see
Wait State Generator).


[Timing diagram signals: CLK, HOLD, HLDA, A(1-23), M/IO#, W/R#, BHE#, BLE#, D/C#, REF#, ADS#, READY#]

Figure 7-2. 82370 Refresh Cycle
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7.4 Modes of Operation 


7.4.1 WORD SIZE AND REFRESH ADDRESS
COUNTER


The 82370 supports 8- and 16-bit refresh cycles. The
bus width during a refresh cycle is programmable
(see Programming). The bus size can be programmed
via the Refresh Control Register (see Register
Overview). If the DRAM bus size is 8 or 16 bits,
the Refresh Address Counter will be incremented by
1 or 2, respectively.


The Refresh Address Counter is cleared by a hard- 
ware reset. 
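
The counter update described above can be modeled in a few lines of Python. The increment of 1 or 2 comes from the text; wrapping at 24 bits is an assumption based on the stated counter width, and `next_refresh_address` is a hypothetical name.

```python
def next_refresh_address(address, bus_size_bits):
    """Refresh Address Counter value after one refresh cycle.

    The counter steps by 1 for an 8-bit bus and by 2 for a 16-bit bus.
    Wraparound at 24 bits is assumed, not stated in the text.
    """
    step = {8: 1, 16: 2}[bus_size_bits]   # KeyError for unsupported sizes
    return (address + step) & 0xFFFFFF
```

Starting from the cleared value after a hardware reset, a 16-bit configuration would produce addresses 0, 2, 4 and so on.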


7.5 Register Set Overview 


The Refresh Generator has two internal registers to
control its operation. They are the Refresh Control
Register and the Refresh Wait State Register. Their
port address map is shown in Table 7-1 below.


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Table 7-1. Register Address Map 


1CH    Refresh Control Reg. (read/write)
75H    Ref. Wait State Reg. (read/write)

The Refresh Wait State Register is not part of the
Refresh Generator. It is only used to program the
number of wait states to be inserted during a refresh
cycle. This register is discussed in detail in section
6 (Wait State Generator) and will not be repeated
here.


REFRESH CONTROL REGISTER 


This 2-bit register serves two functions. First, it is 
used to enable/disable the DRAM Refresh function 
output. If disabled, the output of TIMER 1 is simply 
used as a general purpose timer. The second func- 
tion of this register is to program the DRAM bus size 
for the refresh operation. The programmed bus size 
also determines how the Refresh Address Counter 
will be incremented after each refresh operation. 


7.6 Programming 


Upon hardware reset, the DRAM Refresh function is
disabled (the Refresh Control Register is cleared).
The following programming steps are needed before
the Refresh Generator can be used. Since the rate
of refresh cycles depends on how TIMER 1 is programmed,
this timer must be initialized with the desired
mode of operation as well as the correct
refresh interval (see Programming Interval Timer).
Whether or not wait states are to be generated during
a refresh cycle, the Refresh Wait State Register
must also be programmed with the appropriate value.
Then, the DRAM Refresh feature must be enabled
and the DRAM bus width should be defined.
These can be done in one step by writing the appropriate
control word into the Refresh Control Register
(see Register Bit Definition). After these steps are
done, the refresh operation will automatically be invoked
by the Refresh Generator upon expiration of
Timer 1.
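
The programming steps above can be sketched as an ordered list of port writes. This is an illustration under several assumptions that are not confirmed by this copy of the text: Counter 1 is taken to sit at port 41H, the Timer 1 control word 74H follows the 8254-style layout (counter 1, LSB then MSB, Mode 2, binary), and the two Refresh Control Register bits are taken to be D1-D0 (10 = 16-bit bus, 11 = 8-bit bus). `refresh_setup_writes` is a hypothetical helper; only ports 43H, 75H and 1CH come from the text.

```python
CONTROL_WORD_I = 0x43      # from the text
COUNTER_1 = 0x41           # ASSUMPTION: address not legible in the source
REFRESH_WAIT_STATE = 0x75  # from the text
REFRESH_CONTROL = 0x1C     # from the text


def refresh_setup_writes(interval_count, refresh_wait_count, bus_size_bits):
    """Return (port, value) pairs in the order the text describes."""
    if not 1 <= interval_count <= 0xFFFF:
        raise ValueError("interval_count must fit in 16 bits")
    if not 0 <= refresh_wait_count <= 0x0F:
        raise ValueError("the refresh wait state count is 4 bits")
    control = {16: 0b10, 8: 0b11}[bus_size_bits]   # assumed to live in D1-D0
    return [
        (CONTROL_WORD_I, 0x74),                    # Timer 1, Mode 2 (assumed encoding)
        (COUNTER_1, interval_count & 0xFF),        # count LSB
        (COUNTER_1, interval_count >> 8),          # count MSB
        (REFRESH_WAIT_STATE, refresh_wait_count),  # upper bits stay zero
        (REFRESH_CONTROL, control),                # enable refresh and set bus size
    ]
```

Writing the Refresh Control Register last keeps the refresh function disabled until the timer and wait state count are in place, which matches the order given in the text.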


In addition to the above programming steps, it
should be noted that after reset, although the
TOUT1/REF# pin becomes the Timer 1 output, the
state of this pin is undefined. This is because the
Timer module has not been initialized yet. Therefore,
if this output is used as a DRAM Refresh signal, this
pin should be disqualified by external logic until the
Refresh function is enabled. One simple solution is
to logically AND this output with HLDA, since HLDA
should not be active after reset.


7.7 Register Bit Definition 
REFRESH CONTROL REGISTER 


Port Address: 1CH (Read/Write) 


8.0 RELOCATION REGISTER, 
ADDRESS DECODE, AND 
CHIP-SELECT (CHPSEL #) 


8.1 Relocation Register 


All the integrated peripheral devices in the 82370 
are controlled by a set of internal registers. These 
registers span a total of 256 consecutive address 
locations (although not all the 256 locations are 
used). The 82370 provides a Relocation Register 
which allows the user to map this set of internal reg- 
isters into either the memory or I/O address space. 
The function of the Relocation Register is to define 
the base address of the internal register set of the 
82370 as well as if the registers are to be memory- 
or |1/O-mapped. The format of the Relocation Regis- 
ter is depicted in Figure 8-1. 


[Refresh Control Register (port 1CH) encoding:
00 REF. DISABLED
01 INTEL RESERVED
10 BUS SIZE = 16
11 BUS SIZE = 8
290164-93]
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Port Address: 7FH (Read/Write) 




Note that the Relocation Register is part of the internal
register set of the 82370. It has a port address of
7FH. Therefore, any time the content of the Relocation
Register is changed, the physical location of this
register will also be moved. Upon reset of the 82370,
the content of the Relocation Register will be
cleared. This implies that the 82370 will respond to
its I/O addresses in the range of 00000H to 000FFH.


8.1.1 I/O-MAPPED 82370


As shown in the figure, Bit 0 of the Relocation Register
determines whether the 82370 registers are to be
memory-mapped or I/O-mapped. When Bit 0 is set
to ‘0’, the 82370 will respond to I/O addresses. Address
signals BHE#, BLE#, A1-A7 will be used to
select one of the internal registers to be accessed.
Bit 1 to Bit 7 of the Relocation Register will correspond
to A9 to A15 of the Address bus, respectively.
Together with A8 implied to be ‘0’, A15 to A8 will be
fully decoded by the 82370. The following shows
how the 82370 is mapped into the I/O address
space.


Example

Relocation Register = 11001110 (0CEH)

82370 will respond to I/O address range from
0CE00H to 0CEFFH.


Therefore, this I/O mapping mechanism allows the
82370 internal registers to be located on any even,
contiguous, 256-byte boundary of the system I/O
space.


8.1.2 MEMORY-MAPPED 82370 


When Bit 0 of the Relocation Register is set to ‘1’,
the 82370 will respond to memory addresses. Again,
address signals BHE#, BLE#, A1-A7 will be used
to select one of the internal registers to be accessed.
Bit 1 to Bit 7 of the Relocation Register will
correspond to A17-A23, respectively. A16 is assumed
to be ‘0’, and A8-A15 are ignored. Consider
the following example.

Example

Relocation Register = 10100111 (0A7H)
The 82370 will respond to memory addresses in
the range of A6XX00H to A6XXFFH (where ‘X’ is
don’t care).


D7       D6       D5       D4       D3       D2       D1      D0
A23/A15  A22/A14  A21/A13  A20/A12  A19/A11  A18/A10  A17/A9  M/IO#

FOR I/O MAPPED: A15-A9
FOR MEMORY MAPPED: A23-A17
D0: 0 = I/O MAPPED, 1 = MEMORY MAPPED

290164-94

Figure 8-1. Relocation Register


This scheme implies that the internal registers can 
be located in any even, contiguous, 2**16 byte page 
of the memory space. 
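
Both mapping rules can be checked with a short Python sketch that turns a Relocation Register value into the address window of the internal register set. It follows the text (Bit 0 selects the space, Bits 7-1 give A15-A9 or A23-A17, with A8 or A16 taken as 0); for the memory-mapped case the ignored lines A8-A15 are shown as 0, which is a presentation choice, and `relocation_window` is a hypothetical name.

```python
def relocation_window(reg):
    """Return (space, first_address, last_address) for a Relocation Register value."""
    if not 0 <= reg <= 0xFF:
        raise ValueError("the Relocation Register is 8 bits wide")
    upper = reg & 0xFE                  # Bits 7-1; the implied A8/A16 bit is 0
    if reg & 1 == 0:
        base = upper << 8               # I/O mapped: Bits 7-1 drive A15-A9
        return ("io", base, base + 0xFF)
    base = upper << 16                  # memory mapped: Bits 7-1 drive A23-A17
    return ("memory", base, base + 0xFF)    # A15-A8 are don't care, shown as 0
```

The two worked examples in the text correspond to `relocation_window(0xCE)` and `relocation_window(0xA7)`.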


8.2 Address Decoding 


As mentioned previously, the 82370 internal regis- 
ters do not occupy the entire contiguous 256 ad- 
dress locations. Some of the locations are ‘unoccu- 
pied’. The 82370 always decodes the lower 8 ad- 
dress signals (BHE #, BLE #, A1-—A7) to determine if 
any one of its registers is being accessed. If the ad- 
dress does not correspond to any of its registers, the 
82370 will not respond. This allows external devices 
to be located within the ‘holes’ in the 82370 address 
space. Note that there are several unused address- 
es reserved for future Intel peripheral devices. 


8.3 Chip-Select (CHPSEL #) 


The Chip-Select signal (CHPSEL#) will go active
when the 82370 is addressed in a Slave bus
cycle (either read or write), or in an interrupt acknowledge
cycle in which the 82370 will drive the
Data Bus. For a given bus cycle, CHPSEL# becomes
active and valid in the first T2 (in a non-pipelined
cycle) or in T1P (in a pipelined cycle). It will
stay valid until the cycle is terminated by READY#
driven active. As CHPSEL# becomes valid well before
the 82370 drives the Data Bus, it can be used to
control the transceivers that connect the local CPU
bus to the system bus. The timing diagram of
CHPSEL# is shown in Figure 8-2.

Figure 8-2. CHPSEL# Timing


9.0 CPU RESET AND SHUTDOWN 
DETECT 


The 82370 will activate the CPURST signal to reset
the host processor when one of the following conditions
occurs:


— 82370 RESET is active; 


— 82370 detects a 80376 Shutdown cycle (this fea- 
ture can be disabled); 


— CPURST software command is issued to 80376. 

Whenever the CPURST signal is activated, the
82370 will reset its own internal Slave-Bus state machine.


9.1 Hardware Reset


Following a hardware reset, the 82370 will assert its
CPURST output to reset the host processor. This
output will stay active for as long as the RESET input
is active. During a hardware reset, the 82370 internal
registers will be initialized as defined in the corresponding
functional descriptions.


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9.2 Software Reset 


CPURST can be generated by writing the following
bit pattern into 82370 register location 64H.


D7                D0
1  1  1  1  X  X  X  0


The Write operation into this port is considered as
an 82370 access and the internal Wait State Generator
will automatically determine the required number
of wait states. The CPURST will be active following
the completion of the Write cycle to this port.
This signal will last for 62 CLK2 periods. The 82370
should not be accessed until the CPURST is deactivated.


This internal port is Write-Only and the 82370 will 
not respond to a Read operation to this location. 
Also, during a software reset command, the 82370 
will reset its Slave-Bus state machine. However, its 
internal registers remain unchanged. This allows the 
operating system to distinguish a ‘warm’ reset by 
reading any 82370 internal register previously pro- 
grammed for a non-default value. The Diagnostic 
registers can be used for this purpose (see Internal 
Control and Diagnostic Ports). 
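
The 1111XXX0 command pattern and the 62-CLK2 pulse length can be illustrated with a short sketch. The function and constant names are hypothetical and are not part of any Intel software; the 32 MHz figure in the usage note is the CLK2 frequency quoted in the D.C. specifications.

```python
# Illustrative sketch only; names are assumptions, not an Intel API.
CPU_RESET_PORT = 0x64        # register location given in the text
PATTERN_MASK = 0b11110001    # D7-D4 and D0 are significant; D3-D1 are don't-care
PATTERN_VALUE = 0b11110000   # D7-D4 = 1111, D0 = 0


def is_cpurst_command(data: int) -> bool:
    """Return True if the low 8 bits of data match the 1111XXX0 pattern."""
    return (data & 0xFF & PATTERN_MASK) == PATTERN_VALUE


def cpurst_pulse_seconds(clk2_hz: float, periods: int = 62) -> float:
    """Length of the CPURST pulse: 62 CLK2 periods, per the text."""
    return periods / clk2_hz
```

For example, both 0xF0 and 0xFE satisfy the pattern, while 0xF1 does not. At CLK2 = 32 MHz the pulse lasts 62 / 32 MHz, a little under 2 microseconds.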


9.3 Shutdown Detect 


The 82370 constantly monitors the Bus Cycle
Definition signals (M/IO#, D/C#, W/R#) and is
able to detect when the 80376 is in a Shutdown bus
cycle. Upon detection of a processor shutdown, the
82370 will activate the CPURST output for 62 CLK2
periods to reset the host processor. This signal is
generated after the Shutdown cycle is terminated by
the READY# signal.




Although the 82370 Wait State Generator will not
automatically respond to a Shutdown (or Halt) cycle,
the Wait State Control inputs (WSC0, WSC1) can be
used to determine the number of wait states in the
same manner as other non-82370 bus cycles.


This Shutdown Detect feature can be enabled or dis- 
abled by writing a control bit in the Internal Control 
Port at address 61H (see Internal Control and Diag- 
nostic Ports). This feature is disabled upon a hard- 
ware reset of the 82370. As in the case of Software 
Reset, the 82370 will reset its Slave-Bus state ma- 
chine but will not change any of its internal register 
contents. 


10.0 INTERNAL CONTROL AND 
DIAGNOSTIC PORTS 


10.1 Internal Control Port 


The format of the Internal Control Port of the 82370
is shown in Figure 10-1. This Control Port is used to
enable/disable the Processor Shutdown Detect
mechanism and to control the Gate inputs of
Timers 2 and 3. Note that this is a Write-Only port;
the 82370 will not respond to a read operation to this port. Upon hardware reset, this port
will be cleared; i.e., the Shutdown Detect feature
and the Gate inputs of Timers 2 and 3 are disabled.


10.2 Diagnostic Ports

Two 8-bit read/write Diagnostic Ports are provided
in the 82370. These are two storage registers and
have no effect on the operation of the 82370. They
can be used to store checkpoint data or error codes
in the power-on sequence and in the diagnostic
service routines. As mentioned in the CPU RESET
AND SHUTDOWN DETECT section, these Diagnostic Ports can be used to distinguish between 'cold'
and 'warm' reset. Upon hardware reset, both Diagnostic Ports are cleared. The address map of these
Diagnostic Ports is shown in Figure 10-2.


Diagnostic Port 1    (Read/Write)
Diagnostic Port 2    (Read/Write)


Figure 10-2. Address Map of Diagnostic Ports
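
The cold/warm distinction can be sketched with a toy software model of the two storage registers. The class, the marker value 0xA5, and the function names are all hypothetical; a real system would read and write the ports through I/O instructions rather than a Python object.

```python
# Toy model, for illustration only; not a driver for real hardware.
WARM_MARKER = 0xA5  # hypothetical non-zero marker chosen by the operating system


class DiagnosticPortsModel:
    """Models the two 8-bit diagnostic storage registers."""

    def __init__(self):
        self.ports = [0, 0]

    def hardware_reset(self):
        # Per the text, both Diagnostic Ports are cleared on hardware reset.
        self.ports = [0, 0]

    def software_reset(self):
        # Per the text, internal registers remain unchanged on software reset.
        pass

    def write(self, index: int, value: int):
        self.ports[index] = value & 0xFF

    def read(self, index: int) -> int:
        return self.ports[index]


def classify_reset(model: DiagnosticPortsModel) -> str:
    """Return 'warm' if the marker survived the reset, else 'cold'."""
    return "warm" if model.read(0) == WARM_MARKER else "cold"
```

In this model the boot code would write the marker once after a cold start; finding it still present after a later reset indicates that the reset was a software (warm) reset.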


11.0 INTEL RESERVED I/O PORTS 


There are nineteen I/O ports in the 82370 address
space which are reserved for future Intel peripheral
device use only. Their address locations are: 10H,
12H, 14H, 16H, 2AH, 3DH, 3EH, 45H, 46H, 76H,
77H, 7DH, 7EH, CCH, CDH, D0H, D2H, D4H, and
D6H. These addresses should not be used in the
system since the 82370 will respond to read/write
operations to these locations and bus contention
may occur if any peripheral is assigned to the same
address location.
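
A system address map can be checked against this list mechanically. The sketch below is illustrative; the function name is hypothetical, and the addresses are treated as offsets within the 82370 address space exactly as listed above.

```python
# Reserved port offsets exactly as listed in the text (nineteen entries).
INTEL_RESERVED_PORTS = frozenset({
    0x10, 0x12, 0x14, 0x16, 0x2A, 0x3D, 0x3E, 0x45, 0x46, 0x76,
    0x77, 0x7D, 0x7E, 0xCC, 0xCD, 0xD0, 0xD2, 0xD4, 0xD6,
})


def reserved_conflicts(planned_ports):
    """Return the planned port offsets that collide with reserved ports."""
    return sorted(set(planned_ports) & INTEL_RESERVED_PORTS)
```

For instance, a peripheral planned at offsets 0x44 through 0x46 would be flagged for 0x45 and 0x46.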


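Internal Control Port (Write only), Port Address: 61H

SHUTDOWN DETECT     0 = DISABLE, 1 = ENABLE
COUNTER 3 GATE      0 = DISABLE, 1 = ENABLE
COUNTER 2 GATE      0 = DISABLE, 1 = ENABLE
Remaining bits      NOT USED


Figure 10-1. Internal Control Port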
12.0 PACKAGE THERMAL
SPECIFICATIONS


The Intel 82370 Integrated System Peripheral is
specified for operation when case temperature is
within the range of 0°C to 78°C for the ceramic
132-pin PGA package, and 68°C for the 100-pin
plastic package. The case temperature may be measured in any environment, to determine whether the
82370 is within specified operating range. The case
temperature should be measured at the center of
the top surface opposite the pins.


The ambient temperature is guaranteed as long as
TC is not violated. The ambient temperature can be
calculated from θJC and θJA using the following
equations:


TJ = TC + P * θJC
TA = TJ - P * θJA
TC = TA + P * [θJA - θJC]


Values for θJA and θJC are given in Table 12.1 for the
100-lead fine pitch. θJA is given at various airflows.
Table 12.2 shows the maximum TA allowable (without exceeding TC) at various airflows. Note that TA
can be improved further by attaching "fins" or a
"heat sink" to the package. P is calculated using the
maximum hot ICC.
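
The three relations above are simple enough to evaluate directly. The following sketch uses hypothetical function names; the numbers in the usage note (TA = 63°C, P = 1.21 W, θJA = 33, θJC = 7 °C/Watt) are the ones used in the 100-lead worked example in this section.

```python
def junction_temperature(t_case, power_w, theta_jc):
    """TJ = TC + P * thetaJC"""
    return t_case + power_w * theta_jc


def case_temperature(t_ambient, power_w, theta_ja, theta_jc):
    """TC = TA + P * (thetaJA - thetaJC)"""
    return t_ambient + power_w * (theta_ja - theta_jc)


def max_ambient(t_case_max, power_w, theta_ja, theta_jc):
    """Largest TA that keeps the case at or below t_case_max."""
    return t_case_max - power_w * (theta_ja - theta_jc)
```

With the worked-example values, `case_temperature(63, 1.21, 33, 7)` evaluates 63 + 1.21 * 26, which the data sheet rounds to 94°C.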


Table 12.1. 82370 Package Thermal Characteristics
Thermal Resistances (°C/Watt) θJC and θJA


                           θJA Versus Airflow, ft/min (m/sec)
Package           θJC    0 (0)   200 (1.01)   400 (2.03)   600 (3.04)   800 (4.06)   1000 (5.07)
100L Fine Pitch    7      33        27
132L PGA           2      21        17

Table 12.2. 82370 Maximum Allowable Ambient
Temperature at Various Airflows


                  TA (°C) Versus Airflow, ft/min (m/sec)
Package           0 (0)   200 (1.01)   400 (2.03)   600 (3.04)   800 (4.06)   1000 (5.07)
100L Fine Pitch    63
132L PGA           74


100L PQFP Pkg:

TC = TA + P * (θJA - θJC)
TC = 63 + 1.21 (33 - 7)
TC = 63 + 1.21 (26)
TC = 63 + 31.46
TC = 94°C


132L PGA Pkg:

TC = TA + P * (θJA - θJC)
TC = 74 + 1.21 (21 - 2)
TC = 74 + 1.21 (19)
TC = 74 + 22.99
TC = 96°C
13.0 ELECTRICAL SPECIFICATIONS


82370 D.C. Specifications Functional Operating Range:
VCC = 5.0V ±10%; TCASE = 0°C to 96°C for 132-pin PGA, 0°C to 94°C for 100-pin plastic


IL . 
OL 


(Note 1) 


(Note 1) 


V Output Low Voltage 
1 lo, = 4mA: 0 V 
0 . v 
pA 


lol. =SmA: |. | 
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[ow = =0.2mA [Arg-A1,Dig-Do, BHE*.BLE® | Voo-08[ 
Ton= =08mA [Alors Sie | 
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45 
A;-23, Do-15, BHE#, BLE# 

45 

15 


Input Leakage Current 5 
All Inputs Except: | : 
IRQ11#-IRQ23# 
EOP #, TOUT2/IRQ3 # . 
DREQ4/IRQ9# 
lid Input Leakage Current 10 
Inputs: 
IRQ11#-IRQ23# 
EOP #, TOUT2/IRQ3 
. DREQ4/IRQ9 


H 
| 
Output Leakage Current 
Supply Current (CLK2 = 32 MHz) 
Input Capacitance . 
Cok CLK2 Input Capacitance. | 


NOTES:

1. Minimum value is not 100% tested.

2. fc = 1 MHz; sampled only.

3. These pins have weak internal pullups. They should not be left floating.

4. ICC is specified with inputs driven to CMOS levels, and outputs driving CMOS loads. ICC may be higher if inputs are driven
to TTL levels, or if outputs are driving TTL loads.

5. Tested at the minimum operating frequency of the part.
LEGEND:

A - Maximum output delay specification
B - Minimum output delay specification
C - Minimum input setup specification
D - Minimum input hold specification


Figure 13-1. Drive Levels and Measurement Points for A.C. Specification


82370 A.C. Specifications These A.C. timings are tested at 1.5V thresholds, except as noted.
Functional Operating Range: VCC = 5.0V ±10%; TCASE = 0°C to 96°C for 132-pin PGA, 0°C to 94°C for
100-pin plastic


Symbol  Parameter Description                             Min   Max   Notes

        CLK2 Period
        CLK2 High Time
        CLK2 High Time                                                At VCC - 0.8V
        CLK2 Low Time                                                 At 2.0V
        CLK2 Low Time                                                 At 0.8V
        CLK2 Fall Time                                                VCC - 0.8V to 0.8V
        CLK2 Rise Time                                                0.8V to VCC - 0.8V
        A1-A23, BHE#, BLE#, EDACK0-EDACK2 Valid Delay                 CL = 120 pF
        A1-A23, BHE#, BLE#, EDACK0-EDACK2 Float Delay                 (Note 1)
t8      A1-A23, BHE#, BLE# Setup Time                      6
t9      A1-A23, BHE#, BLE# Hold Time                       4
        W/R#, M/IO#, D/C# Valid Delay                            33   CL = 75 pF
        W/R#, M/IO#, D/C# Float Delay                      4     35   (Note 1)


Symbol  Parameter Description                             Min   Max

t12     W/R#, M/IO#, D/C# Setup Time
t13     W/R#, M/IO#, D/C# Hold Time
t14     ADS# Valid Delay
t15     ADS# Float Delay
t16     ADS# Setup Time
t17     ADS# Hold Time
t18     Slave Mode D0-D15 Read Valid
t19     Slave Mode D0-D15 Read Float
t20     Slave Mode D0-D15 Write Setup
t21     Slave Mode D0-D15 Write Hold
t22     Master Mode D0-D15 Write Valid
t23     Master Mode D0-D15 Write Float
t24     Master Mode D0-D15 Read Setup
t25     Master Mode D0-D15 Read Hold
t26     READY# Setup Time
t27     READY# Hold Time
t28     WSC0-WSC1 Setup Time
t29     WSC0-WSC1 Hold Time
t30     RESET Setup Time
t31     RESET Hold Time
t32     READYO# Valid Delay
t33     CPURST Valid Delay (Falling Edge Only)
t34     HOLD Valid Delay
t35     HLDA Setup Time
t36     HLDA Hold Time
t37a    EOP# Setup (Synchronous)
t38a    EOP# Hold (Synchronous)
t37b    EOP# Setup (Asynchronous)
t38b    EOP# Hold (Asynchronous)
t39     EOP# Valid Delay (Falling Edge Only)
t40     EOP# Float Delay
t41a    DREQ Setup (Synchronous)
t42a    DREQ Hold (Synchronous)
t41b    DREQ Setup (Asynchronous)
t42b    DREQ Hold (Asynchronous)
t43     INT Valid Delay from IRQn
t44     NA# Setup Time                                     5
t45     NA# Hold Time                                      15


Symbol  Parameter Description                             Notes

        CLKIN Frequency
        CLKIN High Time                                   At 2.0V
        CLKIN Low Time                                    At 0.8V
        CLKIN Rise Time                                   0.8V to 3.7V
        CLKIN Fall Time                                   3.7V to 0.8V
        TOUT1#/REF# Valid Delay
          from CLK2 (Refresh)                             CL = 120 pF
          from CLKIN (Timer)                              CL = 120 pF
        TOUT2# Valid Delay
          (from CLKIN, Falling Edge Only)
        TOUT2# Float Delay
t55     TOUT3# Valid Delay (from CLKIN)
t56     CHPSEL# Valid Delay


NOTE:
1. Float condition occurs when the maximum output current becomes less than ILO in magnitude. Float delay is not tested.
For testing purposes, the float condition occurs when the dynamic output driven voltage changes with current loads.


CL indicates all parasitic capacitances.


Figure 13-2. A.C. Test Load


Figure 13-3
[Timing diagram: CLK2, WSC0-WSC1, A1-A23, BHE#, BLE#, W/R#, M/IO#, D/C#, READY# (t26, t27), ADS#, HLDA, D0-D15 (DMA Read), D0-D15 (CPU Write), EOP#]


Figure 13-4. Input Setup and Hold Timing


[Timing diagram: CLK2, CPURST (T33 Min, T33 Max)]


Figure 13-5. Reset Timing


[Timing diagram: CLK2, A1-A23, BHE#, BLE#, EDACK0-EDACK2 (T56 Max)]


Figure 13-6. Address Output Delays
[Timing diagram: D0-D15 (CPU Read) with T18 Max and T19 Max; D0-D15 (DMA Write) with T22 Max and T23 Max]


Figure 13-7. Data Bus Output Delays


[Timing diagram: CLK2, W/R#, M/IO#, D/C# (T10 Max), READYO#, EOP# (T39 Max, T40 Min, T40 Max), REF#]


Figure 13-8. Control Output Delays


[Timing diagram: TOUT2#, TOUT3# (T55 Max)]


Figure 13-9. Timer Output Delays
APPENDIX A
PORTS LISTED BY ADDRESS


Read/Write DMA Channel 0 Target Address, A0-A15
Read/Write DMA Channel 0 Byte Count, B0-B15
Read/Write DMA Channel 1 Target Address, A0-A15
Read/Write DMA Channel 1 Byte Count, B0-B15
Read/Write DMA Channel 2 Target Address, A0-A15
Read/Write DMA Channel 2 Byte Count, B0-B15
Read/Write DMA Channel 3 Target Address, A0-A15
Read/Write DMA Channel 3 Byte Count, B0-B15
Read/Write DMA Channel 0-3 Status/Command I Register
Read/Write DMA Channel 0-3 Software Request Register
Write DMA Channel 0-3 Set-Reset Mask Register
Write DMA Channel 0-3 Mode Register I
Write Clear Byte-Pointer FF
Write DMA Master-Clear
Write DMA Channel 0-3 Clear Mask Register
Read/Write DMA Channel 0-3 Mask Register
Intel Reserved
Read/Write DMA Channel 0 Byte Count, B16-B23
Intel Reserved
Read/Write DMA Channel 1 Byte Count, B16-B23
Intel Reserved
Read/Write DMA Channel 2 Byte Count, B16-B23
Intel Reserved
Read/Write DMA Channel 3 Byte Count, B16-B23
Write DMA Channel 0-3 Bus Size Register
Read/Write DMA Channel 0-3 Chaining Register
Write DMA Channel 0-3 Command Register II
Write DMA Channel 0-3 Mode Register II
Read/Write Refresh Control Register
Reset Software Request Interrupt
Write Bank B ICW1, OCW2 or OCW3
Read Bank B Poll, Interrupt Request or In-Service Status Register
Write Bank B ICW2, ICW3, ICW4 or OCW1
Read Bank B Interrupt Mask Register
Read Bank B ICW2
Read/Write IRQ8 Vector Register
Read/Write IRQ9 Vector Register
Reserved
Description


Read/Write IRQ11 Vector Register
Read/Write IRQ12 Vector Register
Read/Write IRQ13 Vector Register
Read/Write IRQ14 Vector Register
Read/Write IRQ15 Vector Register
Write Bank A ICW1, OCW2 or OCW3
Read Bank A Poll, Interrupt Request or In-Service Status Register
Write Bank A ICW2, ICW3, ICW4 or OCW1
Read Bank A Interrupt Mask Register
Read Bank A ICW2
Read/Write IRQ0 Vector Register
Read/Write IRQ1 Vector Register
Read/Write IRQ1.5 Vector Register
Read/Write IRQ3 Vector Register
Read/Write IRQ4 Vector Register
Reserved
Reserved
Read/Write IRQ7 Vector Register
Read/Write Counter 0 Register
Read/Write Counter 1 Register
Read/Write Counter 2 Register
Write Control Word Register I - Counter 0, 1, 2
Read/Write Counter 3 Register
Reserved
Reserved
Write Word Register II - Counter 3
Write Internal Control Port
Write CPU Reset Register (Data: 1111XXX0H)
Read/Write Wait State Register 0
Read/Write Wait State Register 1
Read/Write Wait State Register 2
Read/Write Refresh Wait State Register
Reserved
Reserved
Reserved
Reserved
Read/Write Relocation Register
Read/Write Internal Diagnostic Port 0
Read/Write DMA Channel 2 Target Address, A16-A23
Read/Write DMA Channel 3 Target Address, A16-A23
Read/Write DMA Channel 1 Target Address, A16-A23
Read/Write DMA Channel 0 Target Address, A16-A23
Read/Write Internal Diagnostic Port 1
Read/Write DMA Channel 6 Target Address, A16-A23
Read/Write DMA Channel 7 Target Address, A16-A23
Read/Write DMA Channel 5 Target Address, A16-A23
Read/Write DMA Channel 4 Target Address, A16-A23
Description


Read/Write DMA Channel 0 Requester Address, A0-A15
Read/Write DMA Channel 0 Requester Address, A16-A23
Read/Write DMA Channel 1 Requester Address, A0-A15
Read/Write DMA Channel 1 Requester Address, A16-A23
Read/Write DMA Channel 2 Requester Address, A0-A15
Read/Write DMA Channel 2 Requester Address, A16-A23
Read/Write DMA Channel 3 Requester Address, A0-A15
Read/Write DMA Channel 3 Requester Address, A16-A23
Read/Write DMA Channel 4 Requester Address, A0-A15
Read/Write DMA Channel 4 Requester Address, A16-A23
Read/Write DMA Channel 5 Requester Address, A0-A15
Read/Write DMA Channel 5 Requester Address, A16-A23
Read/Write DMA Channel 6 Requester Address, A0-A15
Read/Write DMA Channel 6 Requester Address, A16-A23
Read/Write DMA Channel 7 Requester Address, A0-A15
Read/Write DMA Channel 7 Requester Address, A16-A23
Write Bank C ICW1, OCW2 or OCW3
Read Bank C Poll, Interrupt Request or In-Service Status Register
Write Bank C ICW2, ICW3, ICW4 or OCW1
Read Bank C Interrupt Mask Register
Read Bank C ICW2
Read/Write IRQ16 Vector Register
Read/Write IRQ17 Vector Register
Read/Write IRQ18 Vector Register
Read/Write IRQ19 Vector Register
Read/Write IRQ20 Vector Register
Read/Write IRQ21 Vector Register
Read/Write IRQ22 Vector Register
Read/Write IRQ23 Vector Register
Read/Write DMA Channel 4 Target Address, A0-A15
Read/Write DMA Channel 4 Byte Count, B0-B15
Read/Write DMA Channel 5 Target Address, A0-A15
Read/Write DMA Channel 5 Byte Count, B0-B15
Read/Write DMA Channel 6 Target Address, A0-A15
Read/Write DMA Channel 6 Byte Count, B0-B15
Read/Write DMA Channel 7 Target Address, A0-A15
Read/Write DMA Channel 7 Byte Count, B0-B15
Read DMA Channel 4-7 Status/Command I Register
Read/Write DMA Channel 4-7 Software Request Register
Write DMA Channel 4-7 Set-Reset Mask Register
Write DMA Channel 4-7 Mode Register I
Reserved
Reserved
Write DMA Channel 4-7 Clear Mask Register
Read/Write DMA Channel 4-7 Mask Register
Intel Reserved
Read/Write DMA Channel 4 Byte Count, B16-B23
Intel Reserved
Read/Write DMA Channel 5 Byte Count, B16-B23
Port Address
(HEX)         Description

D4            Intel Reserved
D5            Read/Write DMA Channel 6 Byte Count, B16-B23
D6            Intel Reserved
D7            Read/Write DMA Channel 7 Byte Count, B16-B23
D8            Write DMA Channel 4-7 Bus Size Register
D9            Read/Write DMA Channel 4-7 Chaining Register
DA            Write DMA Channel 4-7 Command Register II
DB            Write DMA Channel 4-7 Mode Register II


APPENDIX B
PORTS LISTED BY FUNCTION


Port Address
(HEX)         Description

DMA CONTROLLER

0D            Write DMA Master-Clear

0C            Write DMA Clear Byte-Pointer FF

08            Read/Write DMA Channel 0-3 Status/Command I Register
C8            Read/Write DMA Channel 4-7 Status/Command I Register
1A            Write DMA Channel 0-3 Command Register II
DA            Write DMA Channel 4-7 Command Register II

0B            Write DMA Channel 0-3 Mode Register I
CB            Write DMA Channel 4-7 Mode Register I
1B            Write DMA Channel 0-3 Mode Register II
DB            Write DMA Channel 4-7 Mode Register II

              Read/Write DMA Channel 0-3 Software Request Register
              Read/Write DMA Channel 4-7 Software Request Register
              Reset Software Request Interrupt

              Write DMA Channel 0-3 Clear Mask Register
              Write DMA Channel 4-7 Clear Mask Register
              Read/Write DMA Channel 0-3 Mask Register
              Read/Write DMA Channel 4-7 Mask Register
              Write DMA Channel 0-3 Set-Reset Mask Register
              Write DMA Channel 4-7 Set-Reset Mask Register

              Write DMA Channel 0-3 Bus Size Register
              Write DMA Channel 4-7 Bus Size Register

              Read/Write DMA Channel 0-3 Chaining Register
              Read/Write DMA Channel 4-7 Chaining Register

              Read/Write DMA Channel 0 Target Address, A0-A15
              Read/Write DMA Channel 0 Target Address, A16-A23
              Read/Write DMA Channel 0 Byte Count, B0-B15
              Read/Write DMA Channel 0 Byte Count, B16-B23
              Read/Write DMA Channel 0 Requester Address, A0-A15
              Read/Write DMA Channel 0 Requester Address, A16-A23
Port Address
(HEX)         Description

DMA CONTROLLER (Continued)

              Read/Write DMA Channel 1 Target Address, A0-A15
              Read/Write DMA Channel 1 Target Address, A16-A23
              Read/Write DMA Channel 1 Byte Count, B0-B15
              Read/Write DMA Channel 1 Byte Count, B16-B23
              Read/Write DMA Channel 1 Requester Address, A0-A15
              Read/Write DMA Channel 1 Requester Address, A16-A23

              Read/Write DMA Channel 2 Target Address, A0-A15
              Read/Write DMA Channel 2 Target Address, A16-A23
              Read/Write DMA Channel 2 Byte Count, B0-B15
              Read/Write DMA Channel 2 Byte Count, B16-B23
              Read/Write DMA Channel 2 Requester Address, A0-A15
              Read/Write DMA Channel 2 Requester Address, A16-A23

              Read/Write DMA Channel 3 Target Address, A0-A15
              Read/Write DMA Channel 3 Target Address, A16-A23
              Read/Write DMA Channel 3 Byte Count, B0-B15
              Read/Write DMA Channel 3 Byte Count, B16-B23
              Read/Write DMA Channel 3 Requester Address, A0-A15
              Read/Write DMA Channel 3 Requester Address, A16-A23

              Read/Write DMA Channel 4 Target Address, A0-A15
              Read/Write DMA Channel 4 Target Address, A16-A23
              Read/Write DMA Channel 4 Byte Count, B0-B15
              Read/Write DMA Channel 4 Byte Count, B16-B23
              Read/Write DMA Channel 4 Requester Address, A0-A15
              Read/Write DMA Channel 4 Requester Address, A16-A23

              Read/Write DMA Channel 5 Target Address, A0-A15
              Read/Write DMA Channel 5 Target Address, A16-A23
              Read/Write DMA Channel 5 Byte Count, B0-B15
              Read/Write DMA Channel 5 Byte Count, B16-B23
              Read/Write DMA Channel 5 Requester Address, A0-A15
              Read/Write DMA Channel 5 Requester Address, A16-A23

              Read/Write DMA Channel 6 Target Address, A0-A15
              Read/Write DMA Channel 6 Target Address, A16-A23
              Read/Write DMA Channel 6 Byte Count, B0-B15
              Read/Write DMA Channel 6 Byte Count, B16-B23
              Read/Write DMA Channel 6 Requester Address, A0-A15
              Read/Write DMA Channel 6 Requester Address, A16-A23

              Read/Write DMA Channel 7 Target Address, A0-A15
              Read/Write DMA Channel 7 Target Address, A16-A23
              Read/Write DMA Channel 7 Byte Count, B0-B15
              Read/Write DMA Channel 7 Byte Count, B16-B23
              Read/Write DMA Channel 7 Requester Address, A0-A15
Port Address
(HEX)         Description

              Write Bank B ICW1, OCW2 or OCW3
              Read Bank B Poll, Interrupt Request or In-Service Status Register
              Write Bank B ICW2, ICW3, ICW4 or OCW1
              Read Bank B Interrupt Mask Register
              Read Bank B ICW2
              Read/Write IRQ8 Vector Register
              Read/Write IRQ9 Vector Register
              Reserved
              Read/Write IRQ11 Vector Register
              Read/Write IRQ12 Vector Register
              Read/Write IRQ13 Vector Register
              Read/Write IRQ14 Vector Register
              Read/Write IRQ15 Vector Register

              Write Bank C ICW1, OCW2 or OCW3
              Read Bank C Poll, Interrupt Request or In-Service Status Register
              Write Bank C ICW2, ICW3, ICW4 or OCW1
              Read Bank C Interrupt Mask Register
              Read Bank C ICW2
              Read/Write IRQ16 Vector Register
              Read/Write IRQ17 Vector Register
              Read/Write IRQ18 Vector Register
              Read/Write IRQ19 Vector Register
              Read/Write IRQ20 Vector Register
              Read/Write IRQ21 Vector Register
              Read/Write IRQ22 Vector Register
              Read/Write IRQ23 Vector Register

              Write Bank A ICW1, OCW2 or OCW3
              Read Bank A Poll, Interrupt Request or In-Service Status Register
              Write Bank A ICW2, ICW3, ICW4 or OCW1
              Read Bank A Interrupt Mask Register
              Read Bank A ICW2
              Read/Write IRQ0 Vector Register
              Read/Write IRQ1 Vector Register
              Read/Write IRQ1.5 Vector Register
              Read/Write IRQ3 Vector Register
              Read/Write IRQ4 Vector Register
              Reserved
              Reserved
              Read/Write IRQ7 Vector Register
Port Address
(HEX)         Description

PROGRAMMABLE INTERVAL TIMER
40            Read/Write Counter 0 Register
41            Read/Write Counter 1 Register
42            Read/Write Counter 2 Register
43            Write Control Word Register I - Counter 0, 1, 2
44            Read/Write Counter 3 Register
47            Write Word Register II - Counter 3

CPU RESET
64            Write CPU Reset Register (Data: 1111XXX0H)

WAIT STATE GENERATOR
72            Read/Write Wait State Register 0
73            Read/Write Wait State Register 1
74            Read/Write Wait State Register 2
75            Read/Write Refresh Wait State Register

DRAM REFRESH CONTROLLER
1C            Read/Write Refresh Control Register

INTERNAL CONTROL AND DIAGNOSTIC PORTS
61            Write Internal Control Port
80            Read/Write Internal Diagnostic Port 0
88            Read/Write Internal Diagnostic Port 1

RELOCATION REGISTER
7F            Read/Write Relocation Register

INTEL RESERVED PORTS
10            Reserved
12            Reserved
14            Reserved
16            Reserved
2A            Reserved
3D            Reserved
3E            Reserved
45            Reserved
46            Reserved
76            Reserved
77            Reserved
7D            Reserved
7E            Reserved
CC            Reserved
CD            Reserved
D0            Reserved
D2            Reserved
D4            Reserved
D6            Reserved
APPENDIX C
SYSTEM NOTES


1. BHE# IN MASTER MODE


In Master Mode, BHE# will be activated during DMA to/from 8-bit devices residing at even locations when
the remaining byte count is greater than 1.


For example, if an 8-bit device is located at 00000000 Hex and the number of bytes to be transferred is > 1,
the first address/BHE# combination will be 00000000/0. In some systems this will cause the bus controller
to perform two 8-bit accesses, the first to 00000000 Hex and the second to 00000001 Hex. However, the
82370's DMA will only read/write one byte. This may or may not cause a problem in the system depending
on what is located at 00000001 Hex.


Solution:


There are two solutions if BHE# active is unacceptable. Of the two, number 2 is the cleaner and the
recommended one.


1. If there is an 8-bit device that uses DMA located at an even address, do not use that address + 1. The
limitation of this solution is that the user must have complete control over what addresses will be used in
the end system.


2. Do not allow the Bus Controller to split cycles for the DMA.


82370 TIMER UNIT NOTES


The 82370 DMA Controller with Integrated System Peripherals is functionally inconsistent with the data sheet.
This document explains the behavior of the 82370 Timer Unit and outlines subsequent limitations of the timer
unit. This document also provides recommended workarounds.


1.0 WRITE CYCLES TO THE 82370 TIMER UNIT


This errata applies only to SLAVE WRITE cycles to the 82370 timer unit. During these cycles, the data being
written into the 82370 timer unit may be corrupted if asynchronous CLKIN is not inhibited during a certain
"window" of the write cycle.
1.1 Description


Please refer to Figure C-2.


During write cycles to the 82370 timer unit, the 82370 translates the 80376 interface signals such as ADS#,
W/R#, M/IO#, and D/C# into several internal signals that control the operation of the internal sub-blocks
(e.g. Timer Unit).


The 82370 timer unit is controlled by such internal signals. These internal signals are generated and sampled
with respect to two separate clock signals: CLK2 (the system clock) and CLKIN (the 82370 timer unit clock).


Since the CLKIN and CLK2 clock signals are used internally to generate control signals for the interface to the
timer unit, some timing parameters must be met in order for the interface logic to function properly.


Those timing parameters are met by inhibiting the CLKIN signal for a specific window during Write Cycles to
the 82370 Timer Unit.


The CLKIN signal must be inhibited using external logic, as the GATE function of the 82370 timer unit is not
guaranteed to totally inhibit CLKIN.


1.2 Consequences


This CLKIN inhibit circuitry guarantees proper write cycles to the 82370 timer unit.


Without this solution, write cycles to the 82370 timer unit could place corrupted data into the timer unit
registers. This, in turn, could yield inaccurate results and improper timer operation.


The proposed solution would involve a hardware modification for existing systems.


1.3 Solution


A timing waveform (Figure C-3) shows the specific window during which CLKIN must be inhibited. Please note
that CLKIN must only be inhibited during the window shown in Figure C-3. This window is defined by two AC
timing parameters:


ta = 9 ns


th = 28 ns


The proposed solution provides a certain amount of system "guardband" to make sure that this window is
avoided.


PAL equations for a suggested workaround are also included. Please refer to the comments in the PAL codes
for stated assumptions of this particular workaround. A state diagram (Figure C-4) is provided to help clarify
how this PAL is designed.


Figure C-5 shows how this PAL would fit into a system workaround. In order to show the effect of this workaround on the CLKIN signal, Figure C-6 shows how CLKIN is inhibited. Note that you must still meet the CLKIN
AC timing parameters (e.g. t47 (min), t48 (min)) in order for the timer unit to function properly.


Please note that this workaround has not been tested. It is provided as a suggested solution. Actual solutions
will vary from system to system.
1.4 Long Term Plans


Intel has no plans to fix this behavior in the 82370 timer unit. 


module Timer_82370_Fix
flag '-r2', '-q2', '-f1', '-t4', '-w1,3,6,5,4,16,7,12,17,18,15,14'
title '82370 Timer Unit CLKIN
INHIBIT signal PAL Solution '
Timer_Unit_Fix device 'P16R6';

"This PAL inhibits the CLKIN signal (that comes from an oscillator)
"during Slave Writes to the 82370 Timer unit.
"
"ASSUMPTION: This PAL assumes that an external system address
"            decoder provides a signal to indicate that an 82370
"            Timer Unit access is taking place. This input
"            signal is called TMR in this PAL. This PAL also
"            assumes that this TMR signal occurs during a
"            specific T-State. Please see Figure 2 of this
"            document to see when this signal is expected to
"            be active by this PAL.
"
"NOTE:       This PAL does not support pipelined 82370 SLAVE
"            cycles.
"
"(c) Intel Corporation 1989. This PAL is provided as a proposed
"method of solving a certain 82370 Timer Unit problem. This PAL
"has not been tested or validated. Please validate this solution
"for your system and application.
"

"Input Pins"

CLK2       pin 1;   "System Clock
RESET      pin 2;   "Microprocessor RESET signal
TMR        pin 3;   "Input from Address Decoder, indicating
                    "an access to the timer unit of the
                    "82370.
~RDY       pin 4;   "End of Cycle indicator
~ADS       pin 5;   "Address and control strobe
CLK        pin 6;   "PHI2 clock
W_R        pin 7;   "Write/Read Signal"
nc1        pin 8;   "No Connect 0"
nc2        pin 9;   "No Connect 1"
GNDa       pin 10;  "Tied to ground, documentation only
GNDb       pin 11;  "Output enable, documentation only
CLKIN_IN   pin 12;  "Input-CLKIN directly from oscillator

"Output Pins"

Q_0        pin 18;  "Internal signal only, fed back to
                    "PAL logic"
CLKIN_OUT  pin 17;  "CLKIN signal fed to 82370 Timer Unit
INHIBIT    pin 16;  "CLKIN Inhibit signal
S0         pin 15;  "Unused State Indicator Pin
S1         pin 14;  "Unused State Indicator Pin

"Declarations"
Valid_ADS = !~ADS & CLK;   "#ADS sampled in PHI1 of 80376 T-State
Valid_RDY = !~RDY & CLK;   "#RDY sampled in PHI1 of 80376 T-State
Timer_Acc = TMR & CLK;     "Timer Unit Access, as provided by
                           "external Address Decoder"


State_Diagram [INHIBIT, S1, S0]


State 000: if RESET then 000
           else if Valid_ADS & W_R then 001
           else 000;

State 001: if RESET then 000
           else if Timer_Acc then 010
           else if !Timer_Acc then 000
           else 001;

State 010: if RESET then 000
           else if CLK then 110
           else 010;

State 110: if RESET then 000
           else if CLK then 111
           else 110;

State 111: if RESET then 000
           else if CLK then 011
           else 111;

State 011: if RESET then 000
           else if Valid_RDY then 000
           else 011;

State 100: if RESET then 000
           else 000;

State 101: if RESET then 000
           else 000;


EQUATIONS


Q_0 := CLKIN_IN;   "Latched incoming clock. This signal is used
                   "internally to feed into the MUX-ing logic"

CLKIN_OUT := (INHIBIT & CLKIN_OUT & !RESET)
           # (!INHIBIT & Q_0 & !RESET);
                   "Equation for CLKIN_OUT. This
                   "feeds directly to the 82370 Timer Unit."

END
82370 Timer Unit CLKIN
INHIBIT signal PAL Solution
Equations for Module Timer_82370_Fix


Device Timer_Unit_Fix

- Reduced Equations:

!INHIBIT := (!CLK & !INHIBIT # CLK & S0 # RESET # !S1);


!S1 := (RESET
      # INHIBIT & !S1
      # CLK & !INHIBIT & !~RDY & S0 & S1
      # !CLK & !S1
      # !S1 & !TMR
      # !S0 & !S1);


!S0 := (RESET
      # INHIBIT & !S1
      # CLK & !INHIBIT & !~RDY & S1
      # !INHIBIT & !S0 & S1
      # !CLK & !S0
      # !INHIBIT & !S0 & S1
      # S0 & !S1
      # !S1 & !W_R
      # ~ADS & !S1);


!Q_0 := (!CLKIN_IN);

!CLKIN_OUT := (RESET # !CLKIN_OUT & INHIBIT # !INHIBIT & !Q_0);
82370 Timer Unit CLKIN INHIBIT signal PAL Solution
Chip diagram for Module Timer__82370__Fix

end of module Timer__82370__Fix

Figure C-2. Translation of 80376 Signals to Internal 82370 Timer Unit Signals

Figure C-3. 82370 Timer Unit Write Cycle

Figure C-4. State Diagram for Inhibit Signal

NOTE:
This solution does not support pipelined 82370 SLAVE Cycles.


Figure C-5. System with 82370 Timer Unit "INHIBIT" Circuitry

FIGURE D=5 (a): Inhibited CLKIN in an 82370 Timer Unit & CLKIN Minimum HIGH time.

Figure C-6. Inhibited CLKIN in an 82370 Timer Unit and CLKIN Minimum LOW Time
- One Integer or Control Instruction per Clock
- Binary Floating-Point Arithmetic
- 386™/i486™ Microprocessor Data
The Intel i860™ Microprocessor (order codes A80860-33 and A80860-40) delivers supercomputing
performance in a single VLSI component. The 64-bit design of the i860 microprocessor balances integer,
floating-point, and graphics performance for applications such as engineering workstations, scientific
computing, 3-D graphics workstations, and multiuser systems. Its parallel architecture achieves high
throughput with RISC design techniques, pipelined processing units, wide data paths, large on-chip
caches, million-transistor design, and fast one-micron CHMOS IV silicon technology.


[Block diagram. External pins: A31-A3 (address), D63-D0 (data), and control. Blocks shown: bus and
cache control unit, instruction cache, data cache, paging unit, RISC core, floating-point control
unit and FP register file, adder unit, and graphics unit. Internal paths: 32-bit physical address,
32-bit data address, 32-bit core instruction bus, 32-bit FP instruction bus, 64-bit FP src1, src2,
and result buses, and 64-bit low and high cache data. Drawing number 240296-1.]

Figure 0.1. Block Diagram


Intel, intel, 386, i486, i860, Multibus II and Parallel System Bus are trademarks of Intel Corporation.
*UNIX is a registered trademark of AT&T. OS/2 is a trademark of International Business Machines Corporation.
**ADVANCE INFORMATION


November 1990
Order Number: 240296-004


Intel i860™ MICROPROCESSOR PRELIMINARY
TABLE OF CONTENTS

CONTENTS
1.0 FUNCTIONAL DESCRIPTION
2.0 PROGRAMMING INTERFACE
2.1 Data Types
2.1.1 Integer
2.1.2 Ordinal
2.1.3 Single- and Double-Precision Real
2.1.4 Pixel
2.2 Register Set
2.2.1 Integer Register File
2.2.2 Floating-Point Register File
2.2.3 Processor Status Register
2.2.4 Extended Processor Status Register
2.2.5 Data Breakpoint Register
2.2.6 Directory Base Register
2.2.7 Fault Instruction Register
2.2.8 Floating-Point Status Register
2.2.9 KR, KI, T, and MERGE Registers
2.3 Addressing
2.4 Virtual Addressing
2.4.1 Page Frame
2.4.2 Virtual Address
2.4.3 Page Tables
2.4.4 Page-Table Entries
2.4.4.1 Page Frame Address
2.4.4.2 Present Bit
2.4.4.3 Writable and User Bits
2.4.4.4 Write-Through Bit
2.4.4.5 Cache Disable Bit
2.4.4.6 Accessed and Dirty Bits
2.4.4.7 Combining Protection of Both Levels of Page Tables
2.4.5 Address Translation Algorithm
2.4.6 Address Translation Faults
2.4.7 Page Translation Cache
2.5 Caching and Cache Flushing
2.6 Instruction Set
2.6.1 Pipelined and Scalar Operations
2.6.1.1 Scalar Mode
2.6.1.2 Pipelining Status Information
2.6.1.3 Precision in the Pipelines
2.6.1.4 Transition between Scalar and Pipelined Operations


2.6.2 Dual-Instruction Mode
2.6.3 Dual-Operation Instruction
2.7 Addressing Modes
2.8 Traps and Interrupts
2.8.1 Trap Handler Invocation
2.8.2 Instruction Fault
2.8.3 Floating-Point Faults
2.8.3.1 Source Exception Faults
2.8.3.2 Result Exception Faults
2.8.4 Instruction Access Fault
2.8.5 Data Access Fault
2.8.6 Interrupt Trap
2.8.7 Reset Trap
2.9 Debugging
3.0 HARDWARE INTERFACE
3.1 Signal Description
3.1.1 Clock (CLK)
3.1.2 System Reset (RESET)
3.1.3 Bus Hold (HOLD) and Bus Hold Acknowledge (HLDA)
3.1.4 Bus Request (BREQ)
3.1.5 Interrupt/Code-Size (INT/CS8)
3.1.6 Address Pins (A31-A3) and Byte Enables (BE7#-BE0#)
3.1.7 Data Pins (D63-D0)
3.1.8 Bus Lock (LOCK#)
3.1.9 Write/Read Bus Cycle (W/R#)
3.1.10 Next Near (NENE#)
3.1.11 Next Address Request (NA#)
3.1.12 Transfer Acknowledge (READY#)
3.1.13 Address Status (ADS#)
3.1.14 Cache Enable (KEN#)
3.1.15 Page Table Bit (PTB)
3.1.16 Boundary Scan Shift Input (SHI)
3.1.17 Boundary Scan Enable (BSCN)
3.1.18 Shift Scan Path (SCAN)
3.1.19 Configuration (CC1-CC0)
3.1.20 System Power (VCC) and Ground (VSS)
3.2 Initialization
3.3 Testability
3.3.1 Normal Mode
3.3.2 Shift Mode


4.0 BUS OPERATION
4.1 Pipelining
4.2 Bus State Machine
4.3 Bus Cycles
4.3.1 Nonpipelined Read Cycles
4.3.2 Nonpipelined Write Cycles
4.3.3 Pipelined Read and Write Cycles
4.3.4 Locked Cycles
4.3.5 HOLD and BREQ Arbitration Cycles
4.4 Bus States during RESET
5.0 MECHANICAL DATA
6.0 PACKAGE THERMAL SPECIFICATIONS
7.0 ELECTRICAL DATA
7.1 Absolute Maximum Ratings
7.2 D.C. Characteristics
7.3 A.C. Characteristics
8.0 INSTRUCTION SET
8.1 Instruction Definitions in Alphabetical Order
8.2 Instruction Format and Encoding
8.2.1 REG-Format Instructions
8.2.2 CTRL-Format Instructions
8.2.3 Floating-Point Instructions
8.3 Instruction Timings
8.4 Instruction Characteristics


FIGURES
Figure 0.1 Block Diagram
Figure 2.1 Real Number Formats
Figure 2.2 Pixel Format Example
Figure 2.3 Registers and Data Paths
Figure 2.4 Processor Status Register
Figure 2.5 Extended Processor Status Register
Figure 2.6 Directory Base Register
Figure 2.7 Floating-Point Status Register
Figure 2.8 Little and Big Endian Data Access
Figure 2.9 Format of a Virtual Address
Figure 2.10 Address Translation
Figure 2.11 Format of a Page Table Entry
Figure 2.12 Pipelined Instruction Execution
Figure 2.13 Dual-Instruction Mode Transitions
Figure 2.14 Dual-Operation Data Paths
Figure 3.1 Order of Boundary Scan Chain
Figure 4.1 Bus State Machine
Figure 4.2 Fastest Read Cycles
Figure 4.3 Fastest Write Cycles
Figure 4.4 Fastest Read/Write Cycles
Figure 4.5 Pipelined Read Followed by Pipelined Write
Figure 4.6 Pipelined Write Followed by Pipelined Read
Figure 4.7 Pipelining Driven by NA#
Figure 4.8 NA# Active with No Internal Bus Request
Figure 4.9 Locked Cycles
Figure 4.10 HOLD, HLDA, and BREQ
Figure 4.11 Reset Activities
Figure 5.1 Pin Configuration: View from Top Side
Figure 5.2 Pin Configuration: View from Pin Side
Figure 5.3 168-Lead Ceramic PGA Package Dimensions
Figure 6.1 ICC vs Case Temperature
Figure 7.1 CLK, Input, and Output Timings
Figure 7.2 Typical Output Delay vs Load Capacitance under Worst-Case Conditions
Figure 7.3 Typical Slew Time vs Load Capacitance under Worst-Case Conditions
Figure 7.4 Typical ICC vs Frequency
Figure 8.1 REG-Format Variations
Figure 8.2 Core Escape Instruction Format
Figure 8.3 CTRL Instruction Format
Figure 8.4 Floating-Point Instruction Encoding


TABLES
Table 2.1 Pixel Formats
Table 2.2 Values of PS
Table 2.3 Values of RB
Table 2.4 Values of RC
Table 2.5 Values of RM
Table 2.6 Combining Directory and Page Protection
Table 2.7 Instruction Set
Table 2.8 Types of Traps
Table 2.9 Register and Cache Values after Reset
Table 3.1 Pin Summary
Table 3.2 Identifying Instruction Fetches
Table 3.3 Cacheability based on KEN# and CD OR'ed WT
Table 3.4 Output Pin Status during RESET
Table 3.5 Test Mode Selection
Table 3.6 Test Mode Latches
Table 5.1 Pin Cross Reference by Location
Table 5.2 Pin Cross Reference by Pin Name
Table 5.3 Ceramic PGA Package Dimension Symbols
Table 6.1 θCA at Various Airflows and θJC
Table 6.2 Maximum TA at Various Airflows
Table 7.1 D.C. Characteristics
Table 7.2 A.C. Characteristics
Table 8.1 Precision Specification
Table 8.2 FADDP MERGE Update
Table 8.3 Register Encoding
Table 8.4 REG-Format Opcodes
Table 8.5 Core Escape Opcodes
Table 8.6 CTRL-Format Opcodes
Table 8.7 Floating-Point Opcodes
Table 8.8 DPC Encoding
Table 8.9 Instruction Characteristics


1.0 FUNCTIONAL DESCRIPTION


As shown by the block diagram on the front page,
the i860 microprocessor consists of 9 units:


1. Core Execution Unit
2. Floating-Point Control Unit
3. Floating-Point Adder Unit
4. Floating-Point Multiplier Unit
5. Graphics Unit
6. Paging Unit
7. Instruction Cache
8. Data Cache
9. Bus and Cache Control Unit


The core execution unit controls overall operation of
the i860 microprocessor. The core unit executes
load, store, integer, bit, and control-transfer
operations, and fetches instructions for the
floating-point unit as well. A set of 32 x 32-bit
general-purpose registers is provided for the
manipulation of integer data. Load and store
instructions move 8-, 16-, and 32-bit data to and from
these registers. Its full set of integer, logical, and
control-transfer instructions gives the core unit the
ability to execute complete systems software and
applications programs. A trap mechanism provides
rapid response to exceptions and external interrupts.
Debugging is supported by the ability to trap on data
or instruction reference.


The floating-point hardware is connected to a sepa- 
rate set of floating-point registers, which can be 
accessed as 16 x 64-bit registers, or 32 x 32-bit reg- 
isters. Special load and store instructions can also 
access these same registers as 8 x 128-bit registers. 
All floating-point instructions use these registers as 
their source and destination operands. 


The floating-point control unit controls both the float- 
ing-point adder and the floating-point multiplier, issu- 
ing instructions, handling all source and result 
exceptions, and updating status bits in the floating- 
point status register. The adder and multiplier can 
operate in parallel, producing up to two results per 
clock. The floating-point data types, floating-point in- 
structions, and exception handling all support the 
IEEE Standard for Binary Floating-Point Arithmetic 
(ANSI/IEEE Std 754-1985). 


The floating-point adder performs addition, subtrac- 
tion, comparison, and conversions on 64- and 32-bit 
floating-point values. An adder instruction executes 
in three clocks; however, in pipelined mode, a new 
result is generated every clock. 


The floating-point multiplier performs floating-point
and integer multiply and floating-point reciprocal
operations on 64- and 32-bit floating-point values. A
multiplier instruction executes in three to four clocks;
however, in pipelined mode, a new result can be
generated every clock for single precision and every
other clock for double precision.
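
The difference between scalar and pipelined timing can be illustrated with a small calculation. The model below is an idealized sketch, not a statement of actual instruction timings: it assumes no stalls, a fixed latency per operation, and a fixed issue interval, and the function names are illustrative. The four-clock latency used for double-precision multiplies is an assumption taken from the upper end of the "three to four clocks" range.

```python
# Idealized clock counts for a run of independent floating-point operations.
# Assumptions: no stalls, no dependencies, constant latency and issue interval.

def scalar_clocks(n_ops, latency):
    """Each operation completes before the next one starts."""
    return n_ops * latency


def pipelined_clocks(n_ops, latency, issue_interval=1):
    """First result after `latency` clocks, then one per `issue_interval`."""
    if n_ops == 0:
        return 0
    return latency + (n_ops - 1) * issue_interval


# Adder: three-clock latency, one result per clock when pipelined.
print(scalar_clocks(100, 3), pipelined_clocks(100, 3))

# Double-precision multiply: assumed four-clock latency, one result
# every other clock when pipelined.
print(scalar_clocks(100, 4), pipelined_clocks(100, 4, issue_interval=2))
```

For long runs the pipelined count approaches one clock per adder result, which is where the "new result every clock" figure comes from.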


The graphics unit has special integer logic that sup- 
ports three-dimensional drawing in a graphics frame 
buffer, with color intensity shading and hidden sur- 
face elimination via the Z-buffer algorithm. The 
graphics unit recognizes the pixel as an 8-, 16-, or 
32-bit data type. It can compute individual red, blue, 
and green color intensity values within a pixel; but it 
does so with parallel operations that take advantage 
of the 64-bit internal word size and 64-bit external 
bus. The graphics features of the i860 microproces- 
sor assume that the surface of a solid object is 
drawn with polygon patches whose shapes approxi- 
mate the original object. The color intensities of the 
vertices of the polygon and their distances from the 
viewer are known, but the distances and intensities 
of the other points must be calculated by interpola- 
tion. The graphics instructions of the i860 microproc- 
essor directly aid such interpolation. 


The paging unit implements protected, paged, virtual 
memory via a 64-entry, four-way set-associative 
memory called the TLB (Translation Lookaside Buff- 
er). The paging unit uses the TLB to perform the 
translation of logical address to physical address, 
and to check for access violations. The access
protection scheme employs two levels of privilege: user
and supervisor.


The instruction cache is a two-way set-associative 
memory of four Kbytes, with 32-byte blocks. It trans- 
fers up to 64 bits per clock (400 Mbyte/sec at 
50 MHz). 


The data cache is a two-way set-associative memo- 
ry of eight Kbytes, with 32-byte blocks. It transfers 
up to 128 bits per clock (800 Mbyte/sec at 50 MHz). 
The i860 microprocessor normally uses writeback 
caching, i.e. memory writes update the cache (if ap- 
plicable) without necessarily updating memory im- 
mediately; however, caching can be inhibited by 
software where necessary. 
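
The peak transfer rates quoted for the two caches follow directly from the path width and the clock rate. The short sketch below reproduces that arithmetic; the function name is illustrative, and the result is a peak figure that assumes a transfer on every clock.

```python
# Peak cache transfer rate = bytes moved per clock x clock rate.
# With the clock in MHz the result is in Mbyte/sec.

def peak_mbytes_per_sec(bits_per_clock, clock_mhz):
    return (bits_per_clock // 8) * clock_mhz


# Instruction cache: up to 64 bits per clock.
print(peak_mbytes_per_sec(64, 50))    # 400

# Data cache: up to 128 bits per clock.
print(peak_mbytes_per_sec(128, 50))   # 800
```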


The bus and cache control unit performs data and 
instruction accesses for the core unit. It receives cy- 
cle requests and specifications from the core unit, 
performs the data-cache or instruction-cache miss
processing, controls TLB translation, and provides 
the interface to the external bus. Its pipelined struc- 
ture supports up to three outstanding bus cycles. 


2.0 PROGRAMMING INTERFACE 


The programmer-visible aspects of the architecture 
of the i860 microprocessor include data types, regis- 
ters, instructions, and traps. 


2.1 Data Types


The i860 microprocessor provides operations for
integer and floating-point data. Integer operations are
performed on 32-bit operands with some support
also for 64-bit operands. Load and store instructions
can reference 8-bit, 16-bit, 32-bit, 64-bit, and 128-bit
operands. Floating-point operations are performed
on IEEE-standard 32- and 64-bit formats.
Graphics-oriented instructions operate on arrays of
8-, 16-, or 32-bit pixels.


2.1.1 INTEGER 


An integer is a 32-bit signed value in standard two's
complement form. A 32-bit integer can represent a
value in the range -2,147,483,648 (-2^31) to
2,147,483,647 (+2^31 - 1). Arithmetic operations on
8- and 16-bit integers can be performed by
sign-extending the 8- or 16-bit values to 32 bits, then
using the 32-bit operations.


There are also add and subtract instructions that op- 
erate on 64-bit long integers. 


Load and store instructions may also reference (in
addition to the 32- and 64-bit formats previously
mentioned) 8- and 16-bit items in memory. When an
8- or 16-bit item is loaded into a register, it is
converted to an integer by sign-extending the value to
32 bits. When an 8- or 16-bit item is stored from a
register, the corresponding number of low-order bits
of the register are used.


2.1.2 ORDINAL


Arithmetic operations are available for 32-bit
ordinals. An ordinal is an unsigned integer. An ordinal
can represent values in the range 0 to
4,294,967,295 (+2^32 - 1).

Also, there are add and subtract instructions that
operate on 64-bit ordinals.
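
The load and store behavior described for 8- and 16-bit items, together with the integer and ordinal ranges, can be sketched in a few lines. This is an illustration only; the helper names are hypothetical and the code models the arithmetic, not the processor's instructions.

```python
# Sketch of sign extension on load and low-order truncation on store.

def sign_extend(value, bits):
    """Interpret the low `bits` bits of `value` as a two's complement integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def store_low_bits(register, bits):
    """Keep only the low-order `bits` bits of a register value."""
    return register & ((1 << bits) - 1)


INT32_MIN = -(1 << 31)        # -2,147,483,648
INT32_MAX = (1 << 31) - 1     #  2,147,483,647
ORDINAL32_MAX = (1 << 32) - 1 #  4,294,967,295

print(sign_extend(0xFF, 8))       # 8-bit item 0xFF loads as -1
print(sign_extend(0x8000, 16))    # 16-bit item 0x8000 loads as -32768
print(hex(store_low_bits(0x12345678, 16)))  # a 16-bit store keeps 0x5678
```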


2.1.3 SINGLE- AND DOUBLE-PRECISION REAL 


Figure 2.1 shows the real number formats. A
single-precision real (also called "single real") data
type is a 32-bit binary floating-point number. Bit 31 is
the sign bit; bits 30..23 are the exponent; and bits
22..0 are the fraction. In accordance with ANSI/IEEE
standard 754, the value of a single-precision real is
defined as follows:


1. If e = 0 and f ≠ 0, or e = 255, then generate a
   floating-point source-exception trap when
   encountered in a floating-point operation.


2. If 0 < e < 255, then the value is
   (-1)^s x 1.f x 2^(e-127).


3. If e = 0 and f = 0, then the value is signed zero.


A double-precision real (also called "double real")
data type is a 64-bit binary floating-point number. Bit
63 is the sign bit; bits 62..52 are the exponent; and
bits 51..0 are the fraction. In accordance with
ANSI/IEEE standard 754, the value of a double-precision
real is defined as follows:


1. If e = 0 and f ≠ 0, or e = 2047, then generate a
   floating-point source-exception trap when
   encountered in a floating-point operation.


2. If 0 < e < 2047, then the value is
   (-1)^s x 1.f x 2^(e-1023).


3. If e = 0 and f = 0, then the value is signed zero.


[Figure 2.1. Real Number Formats: single-precision real and double-precision real, each
divided into SIGN, EXPONENT, and FRACTION fields.]


The special values infinity, NaN (‘Not a Number’), 
indefinite, and denormal generate a trap when en- 
countered. The trap handler implements IEEE-stan- 
dard results. 


A double real value occupies an even/odd pair of 
floating-point registers. Bits 31..0 are stored in the 
even-numbered floating-point register; bits 63..32 
are stored in the next higher odd-numbered floating- 
point register. 
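
The three cases that define a single-precision real, and the way a double real is split across an even/odd register pair, can be sketched as follows. This is a model of the definitions above rather than of the hardware: the trap is represented by a Python exception, and the names `SourceExceptionTrap`, `decode_single`, and `split_double` are hypothetical.

```python
# Sketch of the single-precision value rules and double-real register split.

class SourceExceptionTrap(Exception):
    """Stands in for the floating-point source-exception trap."""


def decode_single(word):
    s = (word >> 31) & 1        # bit 31: sign
    e = (word >> 23) & 0xFF     # bits 30..23: exponent
    f = word & 0x7FFFFF         # bits 22..0: fraction
    if (e == 0 and f != 0) or e == 255:
        # Case 1: denormal, infinity, or NaN is left to the trap handler.
        raise SourceExceptionTrap(hex(word))
    if e == 0:
        # Case 3: e = 0 and f = 0 is signed zero.
        return -0.0 if s else 0.0
    # Case 2: (-1)^s x 1.f x 2^(e-127)
    return (-1.0) ** s * (1 + f / 2**23) * 2.0 ** (e - 127)


def split_double(bits64):
    """Return (even register, odd register) contents for a 64-bit value."""
    return bits64 & 0xFFFFFFFF, bits64 >> 32


print(decode_single(0x3F800000))   # 1.0
print(decode_single(0xC0000000))   # -2.0
print([hex(h) for h in split_double(0x3FF0000000000000)])
```

The even-numbered register receives bits 31..0 and the next higher odd-numbered register receives bits 63..32, as stated in the text.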




2.1.4 PIXEL 


A pixel may be 8, 16, or 32 bits long depending on 
color and intensity resolution requirements. Regard- 
less of the pixel size, the i860 microprocessor al- 
ways operates on 64 bits worth of pixels at a time. 
The pixel data type is used by two kinds of instruc- 
tions: 


• The selective pixel-store instruction that helps
  implement hidden surface elimination.


• The pixel add instruction that helps implement
  3-D color intensity shading.


To perform color intensity shading efficiently in a va- 
riety of applications, the i860 microprocessor de- 
fines three pixel formats according to Table 2.1. 


Figure 2.2 illustrates one way of assigning meaning
to the fields of pixels. These assignments are for
illustration purposes only. The i860 microprocessor
defines only the field sizes, not the specific use of
each field. Other ways of using the fields of pixels
are possible.
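
Because the processor always handles 64 bits of pixels at a time, the number of pixels per operation depends only on the pixel size. The sketch below shows that relationship; the function name is illustrative, and placing the first pixel in the low-order bits is an assumption made for the example, not something the text specifies.

```python
# Sketch: how many pixels fit in one 64-bit operand, and how to separate them.
# Assumption: pixel 0 occupies the low-order bits of the 64-bit word.

def unpack_pixels(word64, pixel_bits):
    if pixel_bits not in (8, 16, 32):
        raise ValueError("pixel size must be 8, 16, or 32 bits")
    mask = (1 << pixel_bits) - 1
    count = 64 // pixel_bits
    return [(word64 >> (i * pixel_bits)) & mask for i in range(count)]


for size in (8, 16, 32):
    print(size, len(unpack_pixels(0, size)))   # 8 -> 8, 16 -> 4, 32 -> 2

print(unpack_pixels(0x0000000200000001, 32))    # [1, 2]
```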
Table 2.1. Pixel Formats


Pixel Size (in bits) | Bits of Color 1 Intensity | Bits of Color 2 Intensity | Bits of Color 3 Intensity | Bits of Other Attribute (Texture)


The intensity attribute fields may be assigned to colors in
any order convenient to the application.


*With 8-bit pixels, up to 8 bits can be used for intensity; the
remaining bits can be used for any other attribute, such as
color. The intensity bits must be the low-order bits of the
pixel.


2.2 Register Set


As Figure 2.3 shows, the i860 microprocessor has
the following registers:


• An integer register file

• A floating-point register file

• Six control registers (psr, epsr, db, dirbase, fir,
  and fsr)

• Four special-purpose registers (KR, KI, T, and
  MERGE)


The control registers are accessible only by load
and store control-register instructions; the integer
and floating-point registers are accessed by
arithmetic operations and load and store instructions.
The special-purpose registers KR, KI, T, and MERGE
are used by a few specific instructions.


[Pixel layout diagram; the only surviving panel label is 32-BIT PIXEL.]

I = Intensity, R = Red intensity, G = Green intensity, B = Blue intensity, C = Color, T = Texture
These assignments of specific meanings to the fields of pixels are for illustration purposes only.
Only the field sizes are defined, not the specific use of each field.


Figure 2.2. Pixel Format Example
2.2.1 INTEGER REGISTER FILE


There are 32 integer registers, each 32 bits wide,
referred to as r0 through r31, which are used for
address computation and scalar integer computations.
Register r0 always returns zero when read,
independently of what is stored in it.


2.2.2 FLOATING-POINT REGISTER FILE 


There are 32 floating-point registers, each 32 bits
wide, referred to as f0 through f31, which are used
for floating-point computations. Registers f0 and f1
always return zero when read, independently of
what is stored in them. The floating-point registers
are also used by a set of graphics operations,
primarily for 3-D graphics computations.


When accessing 64-bit floating-point or integer
values, the i860 microprocessor uses an even/odd pair
of registers. When accessing 128-bit values, it uses
an aligned set of four registers (f0, f4, f8, ..., f28).
The instruction must designate the lowest register
number of the set of registers containing 64- or
128-bit values. Misaligned register numbers produce
undefined results. The register with the lowest number
contains the least significant part of the value. For
128-bit values, the register pair with the lower
numbers contains the least significant 64 bits while the
register pair with the higher numbers contains the
most significant 64 bits.

The 128-bit load and store instructions, along with
the 128-bit data path between the floating-point
registers and the data cache, help to sustain the
extraordinarily high rate of computation.
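
The register alignment rule for 64- and 128-bit values can be expressed as a simple check. The sketch below is illustrative: the function name is hypothetical, and it raises an error for a misaligned register number, whereas the processor itself produces undefined results in that case.

```python
# Sketch of the floating-point register alignment rule.
# A 64-bit value needs an even register number; a 128-bit value needs a
# multiple of four. The returned list is ordered least significant first.

def fp_register_set(first_reg, bits):
    if bits not in (32, 64, 128):
        raise ValueError("operand must be 32, 64, or 128 bits")
    count = bits // 32
    if not 0 <= first_reg <= 32 - count:
        raise ValueError("register number out of range")
    if first_reg % count != 0:
        # The hardware result is undefined here; the sketch just refuses.
        raise ValueError("misaligned register number")
    return [f"f{first_reg + i}" for i in range(count)]


print(fp_register_set(2, 64))    # ['f2', 'f3']
print(fp_register_set(4, 128))   # ['f4', 'f5', 'f6', 'f7']
```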


2.2.3 PROCESSOR STATUS REGISTER


The processor status register (psr) contains miscel- 
laneous state information for the current process. 
Figure 2.4 shows the format of the psr. 


• BR (Break Read) and BW (Break Write) enable a
data access trap when the operand address
matches the address in the db register and a
read or write (respectively) occurs.


• Various instructions set CC (Condition Code)
according to tests they perform. The
branch-on-condition-code instructions test its value.
The bla instruction sets and tests LCC (Loop
Condition Code).


• IM (Interrupt Mode) enables external interrupts if
set; disables interrupts if clear.


• U (User Mode) is set when the i860 microprocessor
is executing in user mode; it is clear when the
i860 microprocessor is executing in supervisor
mode. In user mode, writes to some control registers
are inhibited. This bit also controls the memory
protection mechanism. See section 2.4.4.3
for a description of memory protection in user
and supervisor modes.
Figure 2.4. Processor Status Register
[Field labels recovered from the figure: Break Write, Condition Code,
Interrupt Mode, Previous Interrupt Mode, User Mode, Instruction Trap,
Interrupt, Instruction Access Trap, Dual Instruction Mode, Kill Next
Floating-Point Instruction, (Reserved), Shift Count, Pixel Size, Pixel Mask]
*Can be changed only from supervisor level.


Figure 2.5. Extended Processor Status Register


• PIM (Previous Interrupt Mode) and PU (Previous
User Mode) save the corresponding status bits
(IM and U) on a trap, because those status bits
are changed when a trap occurs. They are restored
into their corresponding status bits when
returning from a trap handler with a branch indirect
instruction when a trap flag is set in the psr.


• FT (Floating-Point Trap), DAT (Data Access
Trap), IAT (Instruction Access Trap), IN (Interrupt),
and IT (Instruction Trap) are trap flags.
They are set when the corresponding trap condition
occurs. The trap handler examines these bits
to determine which condition or conditions have
caused the trap.


• DS (Delayed Switch) is set if a trap occurs during
the instruction before dual-instruction mode is
entered or exited. If DS is set and DIM (Dual
Instruction Mode) is clear, the i860 microprocessor
switches to dual-instruction mode one instruction
after returning from the trap handler. If DS and
DIM are both set, the i860 microprocessor
switches to single-instruction mode one instruction
after returning from the trap handler.


• When a trap occurs, the i860 microprocessor
sets DIM if it is executing in dual-instruction
mode; it clears DIM if it is executing in
single-instruction mode. If DIM is set after returning
from a trap handler, the i860 microprocessor resumes
execution in dual-instruction mode.


• When KNF (Kill Next Floating-Point Instruction) is
set, the next floating-point instruction is suppressed
(except that its dual-instruction mode bit
is interpreted). A trap handler sets KNF if the
trapped floating-point instruction should not be
reexecuted.


• SC (Shift Count) stores the shift count used by
the last right-shift instruction. It controls the number
of shifts executed by the double-shift instruction.


• PS (Pixel Size) and PM (Pixel Mask) are used by
the pixel-store instruction and by the graphics
instructions. The values of PS control pixel size as
defined by Table 2.2. The bits in PM correspond
to pixels to be updated by the pixel-store instruction
pst.d. The low-order bit of PM corresponds
to the low-order pixel of the 64-bit source operand
of pst.d. The number of low-order bits of PM
that are actually used is the number of pixels that
fit into 64 bits, which depends upon PS. If a bit of
PM is set, then pst.d stores the corresponding
pixel. Refer also to the pst.d instruction in section
8.
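
The relationship between PM and the pixels of a 64-bit operand can be sketched in a few lines of Python. This is an illustration only: the function name pst_d_model is hypothetical, the destination is modeled as a 64-bit integer rather than a memory location, and the pixel size is passed directly in bits instead of being decoded from the PS field.

```python
def pst_d_model(dest64, src64, pm, pixel_bits):
    """Model which pixels of src64 replace pixels of dest64.

    pm          -- pixel mask; bit i selects pixel i (pixel 0 is low-order)
    pixel_bits  -- pixel size in bits (8, 16, or 32 in this sketch)
    """
    if pixel_bits not in (8, 16, 32):
        raise ValueError("unsupported pixel size in this sketch")
    n_pixels = 64 // pixel_bits          # PM bits actually used
    field = (1 << pixel_bits) - 1
    result = dest64
    for i in range(n_pixels):
        if (pm >> i) & 1:                # mask bit set: store this pixel
            shift = i * pixel_bits
            result = (result & ~(field << shift)) | (src64 & (field << shift))
    return result & 0xFFFFFFFFFFFFFFFF
```

With 32-bit pixels only the two low-order bits of PM matter, so a mask of binary 100 stores nothing.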


Table 2.2. Values of PS


16
32


[Figure 2.6 field labels: Address Translation Enable, DRAM Page Size,
Bus Lock, I-Cache/TLB Invalidate, (Reserved)]


2.2.4 EXTENDED PROCESSOR STATUS REGISTER


The extended processor status register (epsr) con- 
tains additional state information for the current pro- 
cess beyond that stored in the psr. Figure 2.5 shows 
the format of the epsr. 


• The processor type is one for the i860 microprocessor.


• The stepping number has a unique value that
distinguishes among different revisions of the
processor.


• IL (Interlock) is set if a trap occurs after a lock
instruction but before the load or store following
the subsequent unlock instruction. IL indicates to
the trap handler that a locked sequence has
been interrupted. When the trap handler finds IL
set, it should scan backwards for the lock
instruction and restart at that point. The absence of
a lock instruction within 30-33 instructions of the
trap indicates a programming error.


• WP (Write Protect) controls the semantics of the
W bit of page table entries. A clear W bit in either
the directory or the page table entry causes
writes to be trapped. When WP is clear, writes
are trapped in user mode, but not in supervisor
mode. When WP is set, writes are trapped in both
user and supervisor modes. After the value of the
WP bit is changed, the TLB must be invalidated
by setting the ITI bit of the dirbase register
before any stores are performed.


• INT (Interrupt) is the value of the INT input pin.


• DCS (Data Cache Size) is a read-only field that
tells the size of the on-chip data cache. The number
of bytes actually available is 2^(12 + DCS);
therefore, a value of zero indicates 4 Kbytes, one
indicates 8 Kbytes, etc.


Figure 2.6. Directory Base Register
[Field labels recovered from the figure: Replacement Block, Replacement Control]


• PBM (Page-Table Bit Mode) determines which bit
of page-table entries is output on the PTB pin.
When PBM is clear, the PTB signal reflects bit CD
of the page-table entry used for the current cycle.
When PBM is set, the PTB signal reflects bit WT
of the page-table entry used for the current cycle.


• BE (Big Endian) controls the ordering of bytes
within a data item in memory. Normally (i.e. when
BE is clear) the i860 microprocessor operates in
little endian mode, in which the addressed byte is
the low-order byte. When BE is set (big endian
mode), the low-order three bits of all load and
store addresses are complemented, then
masked to the appropriate boundary for alignment.
This causes the addressed byte to be the
most significant byte. Section 2.3 discusses little
and big endian addressing.


• OF (Overflow Flag) is set by adds, addu, subs,
and subu when integer overflow occurs. For
adds and subs, OF is set if the carry from bit 31
is different than the carry from bit 30. For addu,
OF is set if there is a carry from bit 31. For subu,
OF is set if there is no carry from bit 31. Under all
other conditions, it is cleared by these instructions.
OF controls the function of the intovr
instruction.
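
The OF rules can be made concrete with a short Python sketch. It is a model under stated assumptions, not a description of the actual adder: the helper names are hypothetical, and subtraction is assumed to be performed as a + ~b + 1, which is the convention under which "no carry from bit 31" means an unsigned borrow.

```python
MASK32 = 0xFFFFFFFF

def _carries(a, b, carry_in):
    """Add two 32-bit values; return (carry out of bit 30, carry out of bit 31)."""
    a &= MASK32
    b &= MASK32
    low = (a & 0x7FFFFFFF) + (b & 0x7FFFFFFF) + carry_in
    c30 = low >> 31                   # carry from bit 30 into bit 31
    c31 = (a + b + carry_in) >> 32    # carry out of bit 31
    return c30, c31

def of_adds(a, b):                    # signed add
    c30, c31 = _carries(a, b, 0)
    return c31 != c30

def of_subs(a, b):                    # signed subtract, modeled as a + ~b + 1
    c30, c31 = _carries(a, ~b & MASK32, 1)
    return c31 != c30

def of_addu(a, b):                    # unsigned add: carry from bit 31
    return _carries(a, b, 0)[1] == 1

def of_subu(a, b):                    # unsigned subtract: no carry from bit 31
    return _carries(a, ~b & MASK32, 1)[1] == 0
```

Under this model, of_adds(0x7FFFFFFF, 1) and of_subu(1, 2) both report overflow, while of_subs(5, 3) does not.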


2.2.5 DATA BREAKPOINT REGISTER


The data breakpoint register (db) is used to generate
a trap when the i860 microprocessor makes a
data-operand access to the address stored in this
register. The trap is enabled by BR and BW in psr.
The db register can only be changed from supervisor
level. When comparing, a number of low-order
bits of the address are ignored, depending on the
size of the operand. For example, a 16-bit access
ignores the low-order bit of the address when
comparing to db; a 32-bit access ignores the low-order
two bits. This ensures that any access that overlaps
the address contained in the register will generate a
trap. The DAT occurs before the data is accessed
and prevents the load or store from completing.
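
The comparison described above can be sketched as follows. The text gives only the 16-bit and 32-bit cases; extending the same pattern to 8-, 64-, and 128-bit accesses (0, 3, and 4 ignored bits) is an assumption of this sketch, and the function name db_matches is hypothetical.

```python
def db_matches(access_addr, size_bytes, db):
    """Return True if an access of size_bytes at access_addr matches db.

    The number of ignored low-order bits is log2(size_bytes):
    1 for a 16-bit access, 2 for a 32-bit access (per the text).
    """
    if size_bytes not in (1, 2, 4, 8, 16):
        raise ValueError("unsupported operand size")
    ignored = size_bytes.bit_length() - 1
    mask = ~((1 << ignored) - 1) & 0xFFFFFFFF
    return (access_addr & mask) == (db & mask)
```

A 32-bit access at 0x1000 therefore matches a breakpoint at 0x1003, because the access covers bytes 0x1000 through 0x1003.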


2.2.6 DIRECTORY BASE REGISTER 


The directory base register dirbase (shown in Figure
2.6) controls address translation, caching, and bus
options. The dirbase register can only be changed
from supervisor level. The BL bit is changed from
user level with the lock and unlock instructions.


• ATE (Address Translation Enable), when set,
enables the virtual-address translation algorithm.
The data cache must be flushed before changing
the ATE bit.


• DPS (DRAM Page Size) controls how many bits
to ignore when comparing the current bus-cycle
address with the previous bus-cycle address to
generate the NENE# signal. This feature allows
for higher speeds when using static column or
page-mode DRAMs and consecutive reads and
writes access the same row. The comparison ignores
the low-order 12 + DPS bits. A value of zero is
appropriate for one bank of 256K × n RAMs, 1
for 1M × n RAMs, etc. For interleaved memory,
increase DPS by one for each power of interleaving:
add one for 2-way, two for 4-way, etc.


• When BL (Bus Lock) is set, external bus accesses
are locked. The LOCK# signal is asserted on the
next bus cycle whose internal bus request is
generated after BL is set. It remains set on every
subsequent bus cycle as long as BL remains set.
The LOCK# signal is deasserted on the next
load or store instruction after BL is cleared. Traps
immediately clear BL. The lock and unlock
instructions control the BL bit. The result of
modifying BL with the st.c instruction is not defined.


• ITI (I-Cache, TLB Invalidate), when set in the value
that is loaded into dirbase, causes all entries
in the instruction cache and address-translation
cache (TLB) to be invalidated. The ITI bit does
not remain set in dirbase. ITI always appears as
zero when reading dirbase. Section 2.5 discusses
flushing the data cache before invalidating the
TLB.


• When CS8 (Code Size 8-Bit) is set, instruction
cache misses are processed as 8-bit bus cycles.
When this bit is clear, instruction cache misses
are processed as 64-bit bus cycles. This bit
cannot be set by software; hardware sets this bit at
initialization time. It can be cleared by software
(one time only) to allow the system to execute out
of 64-bit memory after bootstrapping from 8-bit
EPROM. A nondelayed branch to code in 64-bit
memory should directly follow the st.c (store control
register) instruction that clears CS8, in order
to make the transition from 8-bit to 64-bit memory
occur at the correct time. The branch must be
aligned on a 64-bit boundary.


• RB (Replacement Block) identifies the cache
block to be replaced by cache replacement
algorithms. The high-order bit of RB is ignored by the
instruction and data caches. RB conditions the
cache flush instruction flush, which is discussed
in Section 8. Table 2.3 explains the values of RB.


• RC (Replacement Control) controls cache
replacement algorithms. Table 2.4 explains the
significance of the values of RC.


• DTB (Directory Table Base) contains the high-order
20 bits of the physical address of the page
directory when address translation is enabled (i.e.
ATE = 1). The low-order 12 bits of the address
are zeros.
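
The address comparison controlled by DPS can be illustrated with a small Python sketch. It models only the comparison that ignores the low-order 12 + DPS bits; the function name same_dram_page is hypothetical, and the timing and electrical behavior of the NENE# pin are outside its scope.

```python
def same_dram_page(prev_addr, cur_addr, dps):
    """Compare two bus-cycle addresses, ignoring the low-order 12 + DPS bits.

    Returns True when the addresses agree in all compared bits, i.e. when
    the current cycle falls in the same DRAM page as the previous one.
    """
    if dps < 0:
        raise ValueError("DPS must be non-negative")
    shift = 12 + dps
    return (prev_addr >> shift) == (cur_addr >> shift)
```

With DPS = 0 the addresses 0x1000 and 0x1FFF compare equal, while 0x1000 and 0x2000 do not; raising DPS to 1 makes 0x2000 and 0x3000 compare equal.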
Table 2.3. Values of RB

Value   Replace TLB Block   Replace Instruction and Data Cache Block
  0             0                            0
  1             1                            1
  2             2                            0
  3             3                            1


Table 2.4. Values of RC

Value   Meaning
  0     Selects the normal replacement algorithm where any block
        in the set may be replaced on cache misses in all caches.
  1     Instruction, data, and TLB cache misses replace the block
        selected by RB. The instruction and data caches ignore the
        high-order bit of RB. This mode is used for instruction
        cache and TLB testing.
  2     Data cache misses replace the block selected by the
        low-order bit of RB. Instruction and TLB caches use
        random replacement.
  3     Disables data cache replacement. Instruction and TLB
        caches use random replacement.


2.2.7 FAULT INSTRUCTION REGISTER


When a trap occurs, this register contains the address
of the trapping instruction (not necessarily the
instruction that created the conditions that required
the trap). The fir is a read-only register. In
single-instruction mode, using a ld.c instruction to read the
fir anytime except the first time after a trap saves in
idest the address of the ld.c instruction; in
dual-instruction mode, the address of its floating-point
companion (address of the ld.c - 4) is saved.


2.2.8 FLOATING-POINT STATUS REGISTER


The floating-point status register (fsr) contains the
floating-point trap and rounding-mode status for the
current process. Figure 2.7 shows its format. The fsr
is writable in user level.


Figure 2.7. Floating-Point Status Register


• If FZ (Flush Zero) is clear and underflow occurs,
a result-exception trap is generated. When FZ is
set and underflow occurs, the result is set to zero,
and no trap due to underflow occurs.


• If TI (Trap Inexact) is clear, inexact results do not
cause a trap. If TI is set, inexact results cause a
trap. The sticky inexact flag (SI) is set whenever
an inexact result is produced, regardless of the
setting of TI.


• RM (Rounding Mode) specifies one of the four
rounding modes defined by the IEEE standard.
Given a true result b that cannot be represented
by the target data type, the i860 microprocessor
determines the two representable numbers a
and c that most closely bracket b in value (a < b
< c). The i860 microprocessor then rounds
(changes) b to a or c according to the mode
selected by RM as defined in Table 2.5. Rounding
introduces an error in the result that is less than
one least-significant bit.


Table 2.5. Values of RM

Value   Rounding Mode              Rounding Action
 00     Round to nearest or even   Closer to b of a or c; if equally
                                   close, select even number (the one
                                   whose least significant bit is zero).
 01     Round down (toward −∞)     a
 10     Round up (toward +∞)       c
 11     Chop (toward zero)         Smaller in magnitude of a or c.


• The U-bit (Update Bit), if set in the value that is
loaded into fsr by a st.c instruction, enables
updating of the result-status bits (AE, AA, AI, AO,
AU, MA, MI, MO, and MU) in the first stage of the
floating-point adder and multiplier pipelines. If this
bit is clear, the result-status bits are unaffected
by a st.c instruction; st.c ignores the corresponding
bits in the value that is being loaded. A st.c
always updates fsr bits 21..17 and 8..0 directly.
The U-bit does not remain set; it always appears
as zero when read.


• The FTE (Floating-Point Trap Enable) bit, if clear,
disables all floating-point traps (invalid input operand,
overflow, underflow, and inexact result).


• SI (Sticky Inexact) is set when the last stage result
of either the multiplier or adder is inexact (i.e.
when either AI or MI is set). SI is "sticky" in the
sense that it remains set until reset by software.
AI and MI, on the other hand, can be changed by
the subsequent floating-point instruction.


• SE (Source Exception) is set when one of the
source operands of a floating-point operation is
invalid; it is cleared when all the input operands
are valid. Invalid input operands include denormals,
infinities, and all NaNs (both quiet and signaling).


• When read from the fsr, the result-status bits MA,
MI, MO, and MU (Multiplier Add-One, Inexact,
Overflow, and Underflow, respectively) describe
the last stage result of the multiplier.


• When read from the fsr, the result-status bits AA,
AI, AO, AU, and AE (Adder Add-One, Inexact,
Overflow, Underflow, and Exponent, respectively)
describe the last stage result of the adder. The
high-order three bits of the 11-bit exponent of the
adder result are stored in the AE field.


• The Adder Add One and Multiplier Add One bits
indicate that the absolute value of the result fraction
grew by one least-significant bit due to
rounding. AA and MA are not influenced by the
sign of the result.


• After a floating-point operation in a given unit (adder
or multiplier), the result-status bits of that unit
are undefined until the point at which result
exceptions are reported.


• When written to the fsr with the U-bit set, the
result-status bits are placed into the first stage of
the adder and multiplier pipelines. When the
processor executes pipelined operations, it propagates
the result-status bits of a particular unit
(multiplier or adder) one stage for each pipelined
floating-point operation for that unit. When they
reach the last stage, they replace the normal
result-status bits in the fsr. When the U-bit is not
set, result-status bits in the word being written to
the fsr are ignored.


• In a floating-point dual-operation instruction (e.g.
add-and-multiply or subtract-and-multiply), both
the multiplier and the adder may set exception
bits. The result-status bits for a particular unit
remain set until the next operation that uses that
unit.


• RR (Result Register) specifies which floating-point
register (f0-f31) was the destination register
when a result-exception trap occurs due to a
scalar operation.


• LRP (Load Pipe Result Precision), IRP (Integer
(Graphics) Pipe Result Precision), MRP (Multiplier
Pipe Result Precision), and ARP (Adder Pipe Result
Precision) aid in restoring pipeline state after
a trap or process switch. Each defines the precision
of the last stage result in the corresponding
pipeline. One of these bits is set when the result
in the last stage of the corresponding pipeline is
double precision; it is cleared if the result is single
precision. These bits cannot be changed by software.
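
The rounding rule for RM (choose a or c for an unrepresentable result b) can be illustrated with exact arithmetic. The sketch below is a simplified model, not the floating-point hardware: the representable numbers are taken to be the integers, the modes are named by strings rather than by RM encodings, and the function name round_result is hypothetical.

```python
import math
from fractions import Fraction

def round_result(b, mode):
    """Round the exact value b to a bracketing integer a <= b <= c."""
    a, c = math.floor(b), math.ceil(b)
    if a == c:                       # b is representable; nothing to round
        return a
    if mode == "down":               # toward minus infinity
        return a
    if mode == "up":                 # toward plus infinity
        return c
    if mode == "chop":               # toward zero: smaller magnitude
        return a if abs(a) < abs(c) else c
    if mode == "nearest":            # nearest, ties to the even neighbor
        if b - a < c - b:
            return a
        if c - b < b - a:
            return c
        return a if a % 2 == 0 else c
    raise ValueError("unknown rounding mode")
```

For b = 5/2 the nearest mode yields 2 and for b = 7/2 it yields 4, since ties go to the neighbor whose least significant bit is zero. In every mode the error is less than one unit of the grid, matching the statement that rounding error is less than one least-significant bit.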


2.2.9 KR, KI, T, AND MERGE REGISTERS


The KR, KI, and T registers are special-purpose registers
used by the dual-operation floating-point
instructions pfam, pfmam, pfsm, and pfmsm,
which initiate both an adder (A-unit) operation and a
multiplier (M-unit) operation. The KR, KI, and T registers
can store values from one dual-operation instruction
and supply them as inputs to subsequent
dual-operation instructions. (Refer to Figure 2.14.)


The MERGE register is used only by the graphics 
instructions. The purpose of the MERGE register is 
to accumulate (or merge) the results of multiple-ad- 
dition operations that use as operands the color-in- 
tensity values from pixels or distance values from a 
Z-buffer. The accumulated results can then be 
stored in one 64-bit operation. 


Two multiple-addition instructions and an OR instruction
use the MERGE register. The addition instructions
are designed to add interpolation values
to each color-intensity field in an array of pixels or to
each distance value in a Z-buffer.


Refer to the instruction descriptions in section 8 for 
more information about these registers. 


2.3 Addressing 


Memory is addressed in byte units with a paged
virtual-address space of 2^32 bytes. Data and instructions
can be located anywhere in this address
space. Address arithmetic is performed using 32-bit
input values and produces 32-bit results. The low-order
32 bits of the result are used in case of overflow.


Normally, multibyte data values are stored in memory
in little endian format, i.e., with the least significant
byte at the lowest memory address. As an option,
the ordering can be dynamically selected by software
in supervisor mode. The i860 microprocessor
also offers big endian mode, in which the most
significant byte of a data item is at the lowest address.
Figure 2.8 shows the difference between the two
storage modes. Big endian and little endian data areas
should not be mixed within a 64-bit data word.
Illustrations of data structures in this data sheet
show data stored in little endian mode, i.e., the
low-order byte is at the lowest memory address.


Code accesses are always done with little endian
addressing. This implies that code will appear differently
than documented here when accessed as big
endian data. Intel recommends that disassemblers
running in a big endian system convert instructions
which have been read as data back to little endian
form and present them in the format documented
here.


Page directories and page tables are also accessed 
in little endian mode, regardless of the value of the 
BE bit. 


Alignment requirements are as follows (any violation
results in a data-access trap):


• 128-bit values are aligned on 16-byte boundaries
when referenced in memory (i.e. the four least
significant address bits must be zero).


• 64-bit values are aligned on 8-byte boundaries
when referenced in memory (i.e. the three least
significant address bits must be zero).


• 32-bit values are aligned on 4-byte boundaries
when referenced in memory (i.e. the two least
significant address bits must be zero).


• 16-bit values are aligned on 2-byte boundaries
when referenced in memory (i.e. the least significant
address bit must be zero).
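
The alignment rules above reduce to one mask test. The following Python sketch shows the check; the function name is_aligned is hypothetical, and 8-bit values are included as trivially aligned.

```python
def is_aligned(address, width_bits):
    """Return True if address satisfies the alignment rule for width_bits.

    A value of N bytes must have its log2(N) least significant
    address bits equal to zero.
    """
    if width_bits not in (8, 16, 32, 64, 128):
        raise ValueError("unsupported operand width")
    size_bytes = width_bits // 8
    return (address & (size_bytes - 1)) == 0
```

For example, address 0x1008 is valid for a 64-bit value but not for a 128-bit value, which would result in a data-access trap.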


2.4 Virtual Addressing 


When address translation is enabled, the i860 micro- 
processor maps instruction and data virtual address- 
es into physical addresses before referencing mem- 
ory. This address transformation is compatible with 
that of the 386 microprocessor and implements the 
basic features needed for page-oriented virtual- 
memory systems and page-level protection. 


Address translation is optional. Address translation
is in effect only when the ATE bit of dirbase is
set. This bit is typically set by the operating system
during software initialization. The ATE bit must be
set if the operating system is to implement page-oriented
protection or page-oriented virtual memory.


Figure 2.8 (byte enables and loaded data for little and big endian
accesses; main memory bytes 0 through 7 hold A through H):

                     LITTLE ENDIAN           BIG ENDIAN
Instruction          Byte Enables   r16      Byte Enables   r16
                     (BE#)                   (BE#)
ld.b 0(r0), r16      0              A        7              H
ld.b 1(r0), r16      1              B        6              G
ld.b 2(r0), r16      2              C        5              F
ld.b 3(r0), r16      3              D        4              E
ld.b 4(r0), r16      4              E        3              D
ld.b 5(r0), r16      5              F        2              C
ld.b 6(r0), r16      6              G        1              B
ld.b 7(r0), r16      7              H        0              A
ld.s 0(r0), r16      1:0                     7:6
ld.s 2(r0), r16      3:2                     5:4
ld.s 4(r0), r16      5:4                     3:2
ld.s 6(r0), r16      7:6                     1:0
ld.l 0(r0), r16      3:0                     7:4
ld.l 4(r0), r16      7:4                     3:0

NOTE:
64- and 128-bit big endian accesses are treated the same as little endian accesses.
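
The big endian byte-enable pattern follows from the rule given for the BE bit in section 2.2.4: complement the low-order three address bits, then mask to the operand's alignment boundary. A minimal Python sketch of that transformation follows; the function name big_endian_address is hypothetical, and 64- and 128-bit accesses are passed through unchanged as the note above states.

```python
def big_endian_address(address, size_bytes):
    """Return the address of the lowest byte lane used when BE is set."""
    if size_bytes in (8, 16):
        return address                    # treated as little endian
    if size_bytes not in (1, 2, 4):
        raise ValueError("unsupported operand size")
    complemented = address ^ 0b111        # complement low-order three bits
    return complemented & ~(size_bytes - 1) & 0xFFFFFFFF
```

A 16-bit load from address 2 yields 4, meaning byte lanes 5:4, and a 32-bit load from address 0 yields 4, meaning lanes 7:4.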
Figure 2.9. Format of a Virtual Address


Address translation is disabled when the processor
is reset. It is enabled when a store to dirbase sets
the ATE bit. It is disabled again when a store clears
the ATE bit.


2.4.1 PAGE FRAME 


A page frame is a 4-Kbyte unit of contiguous ad- 
dresses of physical main memory. Page frames be- 
gin on 4-Kbyte boundaries and are fixed in size. A 
page is the collection of data that occupies a page 
frame when that data is present in main memory. 
The data may also occupy some location in second- 
ary storage when there is not sufficient space in 
main memory. 


2.4.2 VIRTUAL ADDRESS 


A virtual address refers indirectly to a physical address
by specifying a page table, a page within that
table, and an offset within that page. Figure 2.9
shows the format of a virtual address.


[Figure 2.9 fields: DIR, PAGE, OFFSET]


Figure 2.10. Address Translation
[Figure labels: Page Directory, DIR Entry, Page Table, PG TBL Entry, Offset]


Figure 2.10 shows how the i860 microprocessor
converts the DIR, PAGE, and OFFSET fields of a
virtual address into the physical address by consulting
two levels of page tables. The addressing mechanism
uses the DIR field as an index into a page
directory, uses the PAGE field as an index into the
page table determined by the page directory, and
uses the OFFSET field to address a byte within the
page determined by the page table.


2.4.3 PAGE TABLES 
A page table is simply an array of 32-bit page specifiers.
A page table is itself a page, and therefore contains
4 Kbytes of memory or at most 1K 32-bit entries.


[Figure 2.10 labels, continued: Page Frame, Physical Address]


Two levels of tables are used to address a page of
memory. At the higher level is a page directory. The
page directory addresses up to 1K page tables of
the second level. A page table of the second level
addresses up to 1K pages. All the tables addressed
by one page directory, therefore, can address 1M
pages (2^20). Because each page contains 4 Kbytes
(2^12 bytes), the tables of one page directory can
span the entire physical address space of the i860
microprocessor (2^20 × 2^12 = 2^32).


The physical address of the current page directory is
stored in the DTB field of the dirbase register. Memory
management software has the option of using one
page directory for all processes, one page directory
for each process, or some combination of the two.
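
The two-level lookup can be sketched in Python. The sketch assumes the field widths implied above (1K entries per table, so 10 bits each for DIR and PAGE, and 12 bits for OFFSET), forms table addresses by concatenating a 20-bit base with the index and two zero bits, and assumes every entry is present. Protection checks and the accessed and dirty bits are ignored. The names split_virtual_address, translate, and read_word are hypothetical.

```python
def split_virtual_address(va):
    """Split a 32-bit virtual address into (DIR, PAGE, OFFSET)."""
    return (va >> 22) & 0x3FF, (va >> 12) & 0x3FF, va & 0xFFF

def translate(va, dtb, read_word):
    """Walk both table levels and return the physical address.

    dtb        -- 20-bit directory table base (DTB field of dirbase)
    read_word  -- callable returning the 32-bit word at a physical address
    """
    dir_index, page_index, offset = split_virtual_address(va)
    dir_entry = read_word((dtb << 12) | (dir_index << 2))     # DTB:DIR:00
    pfa1 = dir_entry >> 12                # address of the page table
    table_entry = read_word((pfa1 << 12) | (page_index << 2))
    pfa2 = table_entry >> 12              # address of the page frame
    return (pfa2 << 12) | offset
```

With a page directory at 0x5000, a page table at 0x6000, and a page frame at 0x9000, the virtual address 0x00402034 (DIR = 1, PAGE = 2, OFFSET = 0x34) maps to 0x9034.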


2.4.4 PAGE-TABLE ENTRIES 


Page-table entries (PTEs) in either level of page tables
have the same format. Figure 2.11 illustrates
this format.


2.4.4.1 Page Frame Address


The page frame address specifies the physical starting
address of a page. Because pages are located
on 4K boundaries, the low-order 12 bits are always
zero. In a page directory, the page frame address is
the address of a page table. In a second-level page
table, the page frame address is the address of the
page frame that contains the desired memory operand.


2.4.4.2 Present Bit 


The P (present) bit indicates whether a page table
entry can be used in address translation. P = 1 indicates
that the entry can be used. When P = 0 in
either level of page tables, the entry is not valid for
address translation, and the rest of the entry is available
for software use; none of the other bits in the
entry is tested by the hardware. If P = 0 in either
level of page tables when an attempt is made to use
a page-table entry for address translation, the processor
signals either a data-access fault or an
instruction-access fault. In software systems that support
paged virtual memory, the trap handler can
bring the required page into physical memory.


Note that there is no P bit for the page directory
itself. The page directory may be not-present while
the associated process is suspended, but the operating
system must ensure that the page directory
indicated by the dirbase image associated with the
process is present in physical memory before the
process is dispatched.


[Figure 2.11 field labels: Present, Writable, User, Write-Through,
Cache Disable, Accessed, Dirty, (Reserved)]


2.4.4.3 Writable and User Bits 


The W (writable) and U (user) bits are used for
page-level protection, which the i860 microprocessor
performs at the same time as address translation. The
concept of privilege for pages is implemented by
assigning each page to one of two levels:


1. Supervisor level (U = 0): for the operating system
and other systems software and related data.


2. User level (U = 1): for applications procedures
and data.


The U bit of the psr indicates whether the i860
microprocessor is executing at user or supervisor level.
The i860 microprocessor maintains the U bit of psr
as follows:


• The i860 microprocessor clears the psr U bit to
indicate supervisor level when a trap occurs
(including when the trap instruction causes the
trap). The prior value of U is copied into PU.


• The i860 microprocessor copies the psr PU bit
into the U bit when an indirect branch is executed
and one of the trap bits is set. If PU was one, the
i860 microprocessor enters user level.


[Figure 2.11 layout, high-order to low-order: PAGE FRAME ADDRESS 31..12,
AVAIL (available for systems programmer), X, X, D, A, CD, WT, U, W, P]
NOTE:
X indicates Intel reserved. Do not use.


With the U bit of psr and the W and U bits of the
page table entries, the i860 microprocessor implements
the following protection rules:


• When at user level, a read or write of a
supervisor-level page causes a trap.


• When at user level, a write to a page whose W bit
is clear causes a trap.


• When at user level, st.c to certain control registers
is ignored.


When the i860 microprocessor is executing at supervisor
level, all pages are addressable, but, when it is
executing at user level, only pages that belong to the
user level are addressable.


When the i860 microprocessor is executing at supervisor
level, all pages are readable. Whether a page
is writable depends upon the write-protection mode
controlled by WP of epsr:


WP = 0   All pages are writable.

WP = 1   A write to a page whose W bit is
         clear causes a trap.


When the i860 microprocessor is executing at user 
level, only pages that belong to user level and are 
marked writable are actually writable; pages that be- 
long to supervisor level are neither readable nor wri- 
table from user level. 
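
The protection rules of this section can be collected into one predicate. The following Python sketch is an illustration, not the hardware algorithm: the function name access_allowed is hypothetical, and page_u and page_w stand for the effective U and W attributes of the page after both table levels have been combined.

```python
def access_allowed(user_mode, page_u, page_w, is_write, wp):
    """Return True if the access completes, False if it causes a trap.

    user_mode -- psr U bit (True at user level)
    page_u    -- effective U attribute of the page (True = user level)
    page_w    -- effective W attribute of the page (True = writable)
    is_write  -- True for a write, False for a read
    wp        -- epsr WP bit
    """
    if user_mode:
        if not page_u:                   # supervisor page: no access
            return False
        if is_write and not page_w:      # read-only page
            return False
        return True
    # Supervisor level: all pages readable; writes depend on WP.
    if is_write and wp and not page_w:
        return False
    return True
```

For instance, a supervisor-level write to a page with W clear succeeds when WP = 0 and traps when WP = 1.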


2.4.4.4 Write-Through Bit 


The i860 microprocessor does not implement a 
write-through caching policy for the on-chip data 
cache; however, the WT (write-through) bit in the 
second-level page-table entry does determine inter- 
nal caching policy. If WT is set in a PTE, on-chip 
caching of data from the corresponding page is in- 
hibited. The i860 CPU may place pages having 
WT = 1 into the instruction cache. Future imple- 
mentations of the i860 architecture may adhere to a 
write-through data caching policy. Therefore, they 
may cache pages having the WT bit of the PTE set. 
If WT is clear, the normal write-back policy is applied 
to data from the page in the on-chip caches. The WT 
bit of page directory entries is not referenced by the 
processor, but is reserved. 


The WT bit is independent of the CD bit; therefore,
data may be placed in a second-level coherent
cache, but kept out of the on-chip caches.


2.4.4.5 Cache Disable Bit


If the CD (cache disable) bit in the second-level 
page-table entry is set, data from the associated 
page is not placed in instruction or data caches. 
Clearing CD permits the cache hardware to place 
data from the associated page into caches. The CD 
bit of page directory entries is not referenced by the 
processor, but is reserved. 


To control external caches, the i860 microprocessor 
outputs on its PTB pin either the CD or WT bit. The 
PBM bit of epsr determines which bit is output. 


2.4.4.6 Accessed and Dirty Bits 


The A (accessed) and D (dirty) bits provide data 
about page usage in both levels of the page tables. 


The i860 microprocessor sets the corresponding ac- 
cessed bits in both levels of page tables before a 
read or write operation to a page. The processor 


tests the dirty bit in the second-level page table be- 


fore a write to an address covered by that page table 
entry, and, under certain conditions, causes traps. 
The trap handier then has the opportunity to main- 
tain appropriate values in the dirty bits. The dirty bit 
in directory entries is not tested by the i860 micro- 
processor. The precise algorithm for using these bits 
is specified in Section 2.4.5. 


An operating system that supports paged virtual memory can use
these bits to determine what pages to eliminate from physical
memory when the demand for memory exceeds the physical memory
available. The D and A bits in the PTE (page-table entry) are
normally initialized to zero by the operating system. The processor
sets the A bit when a page is accessed either by a read or write
operation. When a data- or instruction-access fault occurs, the
trap handler sets the D bit if an allowable write is being
performed, then re-executes the instruction. 


The operating system is responsible for coordinating its updates to
the accessed and dirty bits with updates by the CPU and by other
processors that may share the page tables. The i860 microprocessor
automatically asserts the LOCK# signal while setting the A bit. If
the A bit of a PTE is found not set during a locked sequence
(created by the lock instruction), a trap occurs and the processor
does not update the A bit. 


2.4.4.7 Combining Protection of Both Levels of 
Page Tables 


For any one page, the protection attributes of its page directory
entry may differ from those of its page table entry. The i860
microprocessor computes the effective protection attributes for a
page by examining the protection attributes in both the directory
and the page table. Table 2.6 shows the effective protection
provided by the possible combinations of protection attributes. 


2.4.5 ADDRESS TRANSLATION ALGORITHM 


The algorithm below defines the translation of each 
virtual address to a physical address. Let DIR, 
PAGE, and OFFSET be the fields of the virtual ad- 
dress; let PFA1 and PFA2 be the page frame ad- 
dress fields of the first and second level page tables 
respectively; DTB is the page directory table base 
address stored in the dirbase register. 


1. Read the PTE (page table entry) at the physical
   address formed by DTB:DIR:00. 

2. If P in the PTE is zero, generate a data- or
   instruction-access fault. 

3. If W in the PTE is zero, the operation is a write,
   and either the U-bit of the psr is set or WP = 1,
   generate a data- or instruction-access fault. 

4. If the U-bit in the PTE is zero and the U-bit in the
   psr is set, generate a data- or instruction-access
   fault. 

5. If A in the PTE is zero, and if the TLB miss
   occurred while the bus was locked, generate a
   data- or instruction-access fault. (The trap allows
   software to set A to one and restart the sequence.
   This avoids ambiguity in determining what address
   corresponds to a locked semaphore for external bus
   hardware use.) 

6. If A in the PTE is zero, and if the TLB miss
   occurred while the bus was not locked, assert
   LOCK#. Re-fetch and check the PTE, set A, and
   store the PTE. Deassert LOCK# during the store. 

7. Locate the PTE at the physical address formed by
   PFA1:PAGE:00. 

8. Perform the P, W, U, and A checks as in steps 2
   through 6 with the second-level PTE. 

9. If D in the PTE is clear and the operation is a
   write, generate a data- or instruction-access fault. 

10. Form the physical address as PFA2:OFFSET. 
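

The ten steps above can be modeled in software. The following is a
minimal sketch, not a description of the hardware. It assumes a
10-bit DIR field, a 10-bit PAGE field, and a 12-bit OFFSET field,
and it assumes PTE bit positions (P, W, U, A, D, and a page frame
address in bits 31..12) that are not stated in this section. Memory
is a Python dictionary indexed by physical address, and the
re-fetch under LOCK# in step 6 is reduced to a plain update.

```python
# Assumed PTE bit positions (not given in this section).
P, W, U, A, D = 1 << 0, 1 << 1, 1 << 2, 1 << 5, 1 << 6
PFA_MASK = 0xFFFFF000  # assumed: page frame address in bits 31..12


class AccessFault(Exception):
    """Stands in for a data- or instruction-access fault."""


def check_entry(mem, addr, write, user, wp, locked):
    """Steps 2 through 6: P, W, U, and A checks on the entry at addr."""
    pte = mem[addr]
    if not pte & P:
        raise AccessFault("P is zero")
    if write and not pte & W and (user or wp):
        raise AccessFault("W is zero on a checked write")
    if user and not pte & U:
        raise AccessFault("U is zero for a user-level access")
    if not pte & A:
        if locked:
            raise AccessFault("A is zero during a locked sequence")
        pte |= A  # the processor performs this update under LOCK#
        mem[addr] = pte
    return pte


def translate(mem, dir_base, vaddr, write=False, user=False,
              wp=False, locked=False):
    """Return the physical address for vaddr or raise AccessFault.

    dir_base is the page-aligned physical address of the page
    directory (the DTB field followed by twelve zero bits).
    """
    dir_index = (vaddr >> 22) & 0x3FF
    page_index = (vaddr >> 12) & 0x3FF
    offset = vaddr & 0xFFF
    # Steps 1 through 6: first-level entry at DTB:DIR:00.
    pde = check_entry(mem, dir_base | (dir_index << 2),
                      write, user, wp, locked)
    # Steps 7 and 8: second-level entry at PFA1:PAGE:00.
    pte = check_entry(mem, (pde & PFA_MASK) | (page_index << 2),
                      write, user, wp, locked)
    # Step 9: a write to a page whose D bit is clear faults.
    if write and not pte & D:
        raise AccessFault("D is zero on a write")
    # Step 10: PFA2:OFFSET.
    return (pte & PFA_MASK) | offset
```

A trap handler modeled on this sketch would set D in the
second-level entry and retry the access after the fault in step 9.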


The i860 microprocessor looks only in external 
memory for Page Directories and Page Tables, in 
the translation process. The data cache is not 
searched. Therefore, any code which modifies Page 
Directories or Page Tables must keep them out of 
the cache. The tables should be kept in non-cache- 
able memory, or flushed from the cache. 


Table 2.6. Combining Directory and Page Protections 


  Page Directory    Page Table         Combined Protection
      Entry            Entry        User       Supervisor Access
                                   Access
  U-bit   W-bit    U-bit   W-bit   WP = X     WP = 0     WP = 1

    0       0        0       0       N         R/W         R
    0       0        0       1       N         R/W         R
    0       0        1       0       N         R/W         R
    0       0        1       1       N         R/W         R
    0       1        0       0       N         R/W         R
    0       1        0       1       N         R/W        R/W
    0       1        1       0       N         R/W         R
    0       1        1       1       N         R/W        R/W
    1       0        0       0       N         R/W         R
    1       0        0       1       N         R/W         R
    1       0        1       0       R         R/W         R
    1       0        1       1       R         R/W         R
    1       1        0       0       N         R/W         R
    1       1        0       1       N         R/W        R/W
    1       1        1       0       R         R/W         R
    1       1        1       1      R/W        R/W        R/W


NOTES: 
N = No access allowed 
R = Read access only 
R/W = Both reads and writes allowed 
X = Don't care 
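

The combined protection can also be derived from the W and U checks
of the translation algorithm in Section 2.4.5, applied to both
levels. The sketch below is illustrative only. The function name
and the string results are hypothetical, and the rule it encodes is
an inference from steps 3 and 4 of that algorithm.

```python
def combined_access(pde_u, pde_w, pte_u, pte_w, user, wp):
    """Return "N", "R", or "R/W" for one page.

    pde_* are the U and W bits of the page directory entry, pte_*
    those of the page table entry. user is the U-bit of the psr and
    wp is the WP bit of the epsr.
    """
    # Step 4 at both levels: a user access needs U = 1 in each entry.
    if user and not (pde_u and pte_u):
        return "N"
    # Step 3 at both levels: W is checked for user writes and for
    # supervisor writes when WP = 1.
    write_is_checked = user or wp
    if write_is_checked and not (pde_w and pte_w):
        return "R"
    return "R/W"
```

For example, a supervisor access with WP = 0 yields "R/W" for every
combination of bits, because neither check applies.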
The i860 microprocessor expects Page Directories and Page Tables to
be in little endian format. The operating system must maintain
these tables in little endian format by either setting BE = 0 when
manipulating the tables or by complementing bit 2 of the address
when loading or storing entries. 


2.4.6 ADDRESS TRANSLATION FAULTS 


The address translation fault is one instance of the data-access
fault. The instruction causing the fault can be re-executed upon
returning from the trap handler. 


2.4.7 PAGE TRANSLATION CACHE 


For greatest efficiency in address translation, the 
i860 microprocessor stores the most recently used 
page-table data in an on-chip cache called the TLB 
(translation lookaside buffer). Only if the necessary 
paging information is not in the cache must both lev- 
els of page tables be referenced. 


2.5 Caching and Cache Flushing 


The i860 microprocessor has the ability to cache in- 
struction, data, and address-translation information 
in on-chip caches. Caching uses virtual-address 
tags. The effects of mapping two different virtual ad- 
dresses in the same address space to the same 
physical address are undefined. 


Instruction, data, and address-translation caching on the i860
microprocessor are not transparent. Because the data cache uses a
write-back protocol, writes do not immediately update memory, and
writes to memory by other bus devices do not update the cache.
Changes to page tables do not automatically update the TLB, and
changes to instructions do not automatically update the instruction
cache. Under certain circumstances, such as I/O references,
self-modifying code, page-table updates, or shared data in a
multiprocessing system, it is necessary to bypass or to flush the
caches. The i860 microprocessor provides the following methods for
doing this: 


• Bypassing Instruction and Data Caches. If deasserted during
  cache-miss processing, the KEN# pin disables instruction and data
  caching of the referenced data. If the CD bit of the associated
  second-level PTE is set, caching of data and instructions is
  disabled. The i860 CPU may place pages having WT = 1 into the
  instruction cache. Future implementations of the i860
  architecture may adhere to a write-through data cache policy.
  Thus, they may cache pages having the WT bit of the PTE set. The
  value of the CD bit or the WT bit is output on the PTB pin for
  use by external caches. 


• Invalidating Instruction and Address-Translation Caches. Storing
  to the dirbase register with the ITI bit set invalidates the
  contents of the instruction and address-translation caches. This
  bit should be set when modifying a page table, when modifying a
  page containing instructions, or when changing the DTB field of
  dirbase or the WP bit of the epsr. Note that in order to make the
  instruction or address-translation caches consistent with the
  data cache, the data cache must be flushed before invalidating
  the other caches. 


NOTE: 
The mapping of the page containing the 
currently executing instruction and the 
next six instructions should not be differ- 
ent in the new page tables when st.c dir- 
base changes DTB or activates ITI. The 
six instructions following the st.c should 
be nops and should lie in the same page 
as the st.c. 


• Flushing the Data Cache. The data cache is flushed by a software
  routine using the flush instruction. The data cache must be
  flushed prior to invalidating the instruction or
  address-translation caches (as controlled by the ITI bit of
  dirbase) or enabling or disabling address translation (via the
  ATE bit). The data cache does not need flushing if the program is
  modifying only the P, U, W, A, or D bits of a PTE (as long as the
  Page Frame Address is not changed and the PTE itself was not in
  the data cache). The i860 CPU does not check these protection
  bits on cache line writeback. Thus, a trap handler can service a
  DAT for D-bit-zero by setting D = 1 and then ITI = 1. In the case
  of setting the P or A bits active, there is no need to invalidate
  or flush any caches because the processor does not load entries
  into the TLB that have P = 0 or A = 0. The i860 microprocessor
  searches only external memory for Page Directories and Page
  Tables in the translation process. The data cache is not
  searched. Therefore, Page Tables and Directories should be kept
  in non-cacheable memory, or flushed from the cache by any code
  which accesses them. 
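

The rules above for a change to a single PTE can be summarized as a
small decision procedure. The sketch below is illustrative only.
The Pte class, the function name, and the action strings are
hypothetical, and the sketch makes one conservative assumption of
its own: whenever the modified PTE may reside in the data cache, it
asks for a flush so that the change reaches external memory.

```python
from dataclasses import dataclass, fields, replace


@dataclass(frozen=True)
class Pte:
    """Named PTE fields; bit positions are deliberately not modeled."""
    pfa: int
    p: bool = False
    w: bool = False
    u: bool = False
    wt: bool = False
    cd: bool = False
    a: bool = False
    d: bool = False


def maintenance_for_pte_update(old, new, pte_in_data_cache=False):
    """Return the ordered cache actions for changing old into new."""
    changed = {f.name for f in fields(Pte)
               if getattr(old, f.name) != getattr(new, f.name)}
    if not changed:
        return []
    # Setting P or A active: the old entry cannot be in the TLB.
    only_sets_p_or_a = (changed <= {"p", "a"}
                        and all(getattr(new, name) for name in changed))
    # Only P, U, W, A, or D changed and the page frame is the same.
    only_status_bits = changed <= {"p", "u", "w", "a", "d"}
    actions = []
    if pte_in_data_cache or not only_status_bits:
        actions.append("flush data cache")
    if not only_sets_p_or_a:
        actions.append("st.c dirbase with ITI = 1")
    return actions


# A D-bit-zero fault handler sets D = 1 and then ITI = 1.
entry = Pte(pfa=0x5000, p=True, w=True, a=True)
print(maintenance_for_pte_update(entry, replace(entry, d=True)))
```

The flush always precedes the invalidation in the returned list,
matching the ordering requirement stated above.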
2.6 Instruction Set 


Table 2.7 shows the complete set of instructions grouped by
function within processing unit. Refer to Section 8 for an
algorithmic definition of each instruction. 


The architecture of the i860 microprocessor uses 
parallelism to increase the rate at which operations 
may be introduced into the unit. Parallelism in the 
i860 microprocessor is not transparent; rather, pro- 
grammers have complete control over parallelism 
and therefore can achieve maximum performance 
for a variety of computational problems. 


2.6.1 PIPELINED AND SCALAR OPERATIONS 


One type of parallelism used within the floating-point unit is
"pipelining". The pipelined architecture treats each operation as a
series of more primitive operations (called "stages") that can be
executed in parallel. Consider just the floating-point adder unit
as an example. Let A represent the operation of the adder. Let the
stages be represented by A1, A2, and A3. The stages are designed
such that Ai+1 for one adder instruction can execute in parallel
with Ai for the next adder instruction. Furthermore, each Ai can be
executed in just one clock. The pipelining within the multiplier
and graphics units can be described similarly, except that the
number of stages may be different. 


Figure 2.12 illustrates three-stage pipelining as found in the
floating-point adder (also in the floating-point multiplier when
single-precision input operands are employed). The columns of the
figure represent the three stages of the pipeline. Each stage holds
intermediate results and also (when introduced into the first stage
by software) holds status information pertaining to those results.
The figure assumes that the instruction stream consists of a series
of consecutive floating-point instructions, all of one type (i.e.
all adder instructions or all single-precision multiplier
instructions). The instructions are represented as i, i+1, etc. The
rows of the figure represent the states of the unit at successive
clock cycles. Each time a pipelined operation is performed, the
result of the last stage of the pipeline is stored in the
destination register fdest, the pipeline is advanced one stage, and
the input operands fsrc1 and fsrc2 are transferred to the first
stage of the pipeline. 


In the i860 microprocessor, the number of pipeline stages ranges
from one to three. A pipelined operation with a three-stage
pipeline stores the result of the third prior operation. A
pipelined operation with a two-stage pipeline stores the result of
the second prior operation. A pipelined operation with a one-stage
pipeline stores the result of the prior operation. 


There are four floating-point pipelines: one for the multiplier,
one for the adder, one for the graphics unit, and one for
floating-point loads. The adder pipeline has three stages. The
number of stages in the multiplier pipeline depends on the
precision of the source operands in the pipeline. Single precision
has three stages and double precision has two stages. The graphics
unit has one stage for all precisions. The load pipeline has three
stages for all precisions. 


Changing the FZ (flush zero), RM (rounding mode), or RR (result
register) bits of fsr while there are results in either the
multiplier or adder pipeline produces effects that are not defined. 


2.6.1.1 Scalar Mode 


In addition to the pipelined execution mode, the i860
microprocessor also can execute floating-point instructions in
"scalar" mode. Most floating-point instructions have both pipelined
and scalar variants, distinguished by a bit in the instruction
encoding. In scalar mode, the floating-point unit does not start a
new operation until the previous floating-point operation is
completed. The scalar operation passes through all stages of its
pipeline before a new operation is introduced, and the result is
stored automatically. Scalar mode is used when the next operation
depends on results from the previous few floating-point operations
(or when the compiler or programmer does not want to deal with
pipelining). 
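

The difference between pipelined and scalar execution can be
illustrated with a small software model. This is a sketch under
simplifying assumptions: the Pipeline class and its method names
are hypothetical, each result is computed when the operation enters
the first stage, and clocks, precision, and status bits are not
modeled.

```python
import operator


class Pipeline:
    """Software model of one floating-point pipeline."""

    def __init__(self, stages):
        # stages[0] is the first stage, stages[-1] the last stage.
        self.stages = [None] * stages

    def pipelined(self, op, src1, src2):
        """Start op(src1, src2); return the value stored in fdest.

        The value returned is the result leaving the last stage,
        that is, the result of an operation started earlier.
        """
        stored = self.stages[-1]
        self.stages = [op(src1, src2)] + self.stages[:-1]
        return stored

    def scalar(self, op, src1, src2):
        """Run one operation through all stages and return its result.

        Unstored results already in the pipeline are lost; only the
        last stage holds a defined value afterward.
        """
        result = op(src1, src2)
        self.stages = [None] * (len(self.stages) - 1) + [result]
        return result


adder = Pipeline(stages=3)
stored = [adder.pipelined(operator.add, i, i) for i in range(5)]
# stored[3] is the result of the operation started at i = 0.
```

With a two-stage pipeline (the multiplier with double-precision
operands), the same loop stores the first result one operation
earlier.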


2.6.1.2 Pipelining Status Information 


Result status information in the fsr consists of the AA, AI, AO,
AU, and AE bits, in the case of the adder, and the MA, MI, MO, and
MU bits, in the case of the multiplier. This information arrives at
the fsr via the pipeline in one of two ways: 
Table 2.7. Instruction Set 


Core Unit 

Mnemonic     Description 

Load and Store Instructions 
ld.x         Load integer 
st.x         Store integer 
fld.y        F-P load 
pfld.z       Pipelined F-P load 
fst.y        F-P store 
pst.d        Pixel store 

Register to Register Moves 
ixfr         Transfer integer to F-P register 

Integer Arithmetic Instructions 
addu         Add unsigned 
adds         Add signed 
subu         Subtract unsigned 
subs         Subtract signed 

Shift Instructions 
shl          Shift left 
shr          Shift right 
shra         Shift right arithmetic 
shrd         Shift right double 

Logical Instructions 
and          Logical AND 
andh         Logical AND high 
andnot       Logical AND NOT 
andnoth      Logical AND NOT high 
or           Logical OR 
orh          Logical OR high 
xor          Logical exclusive OR 
xorh         Logical exclusive OR high 

Control-Transfer Instructions 
trap         Software trap 
intovr       Software trap on integer overflow 
br           Branch direct 
bri          Branch indirect 
bc           Branch on CC 
bc.t         Branch on CC taken 
bnc          Branch on not CC 
bnc.t        Branch on not CC taken 
bte          Branch if equal 
btne         Branch if not equal 
bla          Branch on LCC and add 
call         Subroutine call 
calli        Indirect subroutine call 

System Control Instructions 
flush        Cache flush 
ld.c         Load from control register 
st.c         Store to control register 
lock         Begin interlocked sequence 
unlock       End interlocked sequence 


Floating-Point Unit 

Mnemonic     Description 

F-P Multiplier Instructions 
fmul.p       F-P multiply 
pfmul.p      Pipelined F-P multiply 
pfmul3.dd    3-Stage pipelined F-P multiply 
fmlow.p      F-P multiply low 
frcp.p       F-P reciprocal 
frsqr.p      F-P reciprocal square root 

F-P Adder Instructions 
fadd.p       F-P add 
pfadd.p      Pipelined F-P add 
famov.r      F-P adder move 
pfamov.r     Pipelined F-P adder move 
fsub.p       F-P subtract 
pfsub.p      Pipelined F-P subtract 
pfgt.p       Pipelined F-P greater-than compare 
pfeq.p       Pipelined F-P equal compare 
fix.p        F-P to integer conversion 
pfix.p       Pipelined F-P to integer conversion 
ftrunc.p     F-P to integer truncation 
pftrunc.p    Pipelined F-P to integer truncation 

Dual-Operation Instructions 
pfam.p       Pipelined F-P add and multiply 
pfsm.p       Pipelined F-P subtract and multiply 
pfmam.p      Pipelined F-P multiply with add 
pfmsm.p      Pipelined F-P multiply with subtract 

Long Integer Instructions 
fisub.z      Long-integer subtract 
pfisub.z     Pipelined long-integer subtract 
fiadd.z      Long-integer add 
pfiadd.z     Pipelined long-integer add 

Graphics Instructions 
fzchks       16-bit Z-buffer check 
pfzchks      Pipelined 16-bit Z-buffer check 
fzchkl       32-bit Z-buffer check 
pfzchkl      Pipelined 32-bit Z-buffer check 
faddp        Add with pixel merge 
pfaddp       Pipelined add with pixel merge 
faddz        Add with Z merge 
pfaddz       Pipelined add with Z merge 
form         OR with MERGE register 
pform        Pipelined OR with MERGE register 


Assembler Pseudo-Operations 

Mnemonic     Description 

mov          Integer register-register move 
fmov.r       F-P reg-reg move 
pfmov.r      Pipelined F-P reg-reg move 
nop          Core no-operation 
fnop         F-P no-operation 
pfle.p       Pipelined F-P less-than or equal 
Figure 2.12. Pipelined Instruction Execution 


1. It is calculated by the last stage of the pipeline. This is the
   normal case. 


2. It is propagated from the first stage of the pipeline. This
   method is used when restoring the state of the pipeline after a
   preemption. When a store instruction updates the fsr and the
   value of the U bit in the word being written into the fsr is
   set, the store updates the result status bits in the first stage
   of both the adder and multiplier pipelines. When software
   changes the result-status bits of the first stage of a
   particular unit (multiplier or adder), the updated result-status
   bits are propagated one stage for each pipelined floating-point
   operation for that unit. In this case, each stage of the adder
   and multiplier pipelines holds its own copy of the relevant bits
   of the fsr. When they reach the last stage, they override the
   normal result-status bits computed from the last stage result. 


At the next floating-point instruction (or at certain core
instructions), after the result reaches the last stage, the i860
microprocessor traps if any of the status bits of the fsr indicate
exceptions. Note that the instruction that creates the exceptional
condition is not the instruction at which the trap occurs. 


2.6.1.3 Precision in the Pipelines 


In pipelined mode, when a floating-point operation is initiated,
the result of an earlier pipelined floating-point operation is
returned. The result precision of the current instruction applies
to the operation being initiated. The precision of the value stored
in fdest is that which was specified by the instruction that
initiated that operation. 
Figure 2.13. Dual-Instruction Mode Transitions 


If fdest is the same as fsrc1 or fsrc2, the value being stored in
fdest is used as the input operand. In this case, the precision of
fdest must be the same as the source precision. 


The multiplier pipeline has two stages when the 
source operand is double-precision and three stages 
when the precision of the source operand is single. 
This means that a pipelined multiplier operation 
stores the result of the second previous multiplier 
operation for double-precision inputs and third previ- 
ous for single-precision inputs (except when chang- 
ing precisions). 


2.6.1.4 Transition between Scalar and Pipelined 
Operations 


When a scalar operation is executed, it passes through all stages
of the pipeline; therefore, any unstored results in the affected
pipeline are lost. To avoid losing information, the last pipelined
operations before a scalar operation should be dummy pipelined
operations that unload unstored results from the affected pipeline. 


After a scalar operation, the values of all pipeline stages of the
affected unit (except the last) are undefined. No spurious
result-exception traps result when the undefined values are
subsequently stored by pipelined operations; however, the values
should not be referenced as source operands. 


For best performance a scalar operation should not 
immediately precede a pipelined operation whose 
fdest is nonzero. 


2.6.2 DUAL-INSTRUCTION MODE 


Another form of parallelism results from the fact that 
the i860 microprocessor can execute both a floating- 
point and a core instruction simultaneously. Such 
parallel execution is called dual-instruction mode. 
When executing in dual-instruction mode, the instruction 
sequence consists of 64-bit aligned instructions 
with a floating-point instruction in the lower 32 
bits and a core instruction in the upper 32 bits. Table 
2.7 identifies which instructions are executed by the 
core unit and which by the floating-point unit. 
Programmers specify dual-instruction mode either by including in
the mnemonic of a floating-point instruction a d. prefix or by
using the Assembler directives .dual ... .enddual. Both of the
specifications cause the D-bit of floating-point instructions to be
set. If the i860 microprocessor is executing in single-instruction
mode and encounters a floating-point instruction with the D-bit
set, one more 32-bit instruction is executed before dual-mode
execution begins. If the i860 microprocessor is executing in
dual-instruction mode and a floating-point instruction is
encountered with a clear D-bit, then one more pair of instructions
is executed before resuming single-instruction mode. Figure 2.13
illustrates two variations of this sequence of events: one for
extended sequences of dual-instructions and one for a single
instruction pair. 


When a 64-bit dual-instruction pair sequentially follows a delayed
branch instruction in dual-instruction mode, both 32-bit
instructions are executed. 
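

The two transition rules can be illustrated with a small model of
instruction issue. This is a sketch, not a description of the
hardware. It assumes that the D-bit of the floating-point
instruction in one issue group selects the mode of the group after
the next one, which reproduces both stated rules but generalizes
them. The tuple layout and the mnemonics in the example program are
hypothetical, and branches and traps are not modeled.

```python
def issue_groups(program):
    """Group a linear instruction list into issue groups.

    program is a list of (mnemonic, unit, d_bit) tuples in address
    order, where unit is "fp" or "core". Returns a list of tuples of
    mnemonics; a one-element tuple is a single-instruction issue and
    a two-element tuple is a dual-instruction pair.
    """
    groups = []
    pc = 0
    dual_now = False   # mode of the group about to be issued
    dual_next = False  # mode of the group after that
    while pc < len(program):
        width = 2 if dual_now else 1
        group = program[pc:pc + width]
        pc += width
        groups.append(tuple(name for name, _, _ in group))
        # In a pair, the floating-point instruction occupies the
        # lower 32 bits, so it is the first element of the group.
        _, unit, d_bit = group[0]
        dual_after_next = d_bit if unit == "fp" else dual_next
        dual_now, dual_next = dual_next, dual_after_next
    return groups


program = [
    ("or", "core", False),
    ("d.fadd", "fp", True),     # D-bit set in single-instruction mode
    ("shl", "core", False),     # one more 32-bit instruction
    ("d.fmul", "fp", True), ("ld", "core", False),
    ("fsub", "fp", False), ("st", "core", False),   # D-bit clear
    ("fadd", "fp", False), ("and", "core", False),  # one more pair
    ("xor", "core", False),     # single-instruction mode resumes
]
groups = issue_groups(program)
```

In this example the first three instructions issue singly, the next
three groups issue as pairs, and the final instruction issues
singly again.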


2.6.3 DUAL-OPERATION INSTRUCTIONS 


Special dual-operation floating-point instructions 
(add-and-multiply, subtract-and-multiply) use both 
the multiplier and adder units within the floating- 
point unit in parallel to efficiently execute such com- 
mon tasks as evaluating systems of linear equa- 
tions, performing the Fast Fourier Transform (FFT), 
and performing graphics transformations. 


The instructions pfam fsrc1, fsrc2, fdest (add and multiply), pfsm
fsrc1, fsrc2, fdest (subtract and multiply), pfmam fsrc1, fsrc2,
fdest (multiply and add), and pfmsm fsrc1, fsrc2, fdest (multiply
and subtract) initiate both an adder operation and a multiplier
operation. Six operands are required, but the instruction format
specifies only three operands; therefore, there are special
provisions for specifying the operands. These special provisions
consist of: 


• Three special registers (KR, KI, and T), that can store values
  from one dual-operation instruction and supply them as inputs to
  subsequent dual-operation instructions. 

  1. The constant registers KR and KI can store the value of fsrc1
     and subsequently supply that value to the multiplier pipeline
     in place of fsrc1. 

  2. The transfer register T can store the last stage result of the
     multiplier pipeline and subsequently supply that value to the
     adder pipeline in place of fsrc1. 


• A four-bit data-path control field in the opcode (DPC) that
  specifies the operands and loading of the special registers. 

  1. Operand-1 of the multiplier can be KR, KI, or fsrc1. 

  2. Operand-2 of the multiplier can be fsrc2 or the last stage
     result of the adder pipeline. 

  3. Operand-1 of the adder can be fsrc1, the T-register, or the
     last stage result of the adder pipeline. 

  4. Operand-2 of the adder can be fsrc2, the last stage result of
     the multiplier pipeline, or the last stage result of the adder
     pipeline. 


Figure 2.14 shows all the possible data paths surrounding the adder
and multiplier. A DPC field in these instructions selects different
data paths. Section 8 shows the various encodings of the DPC field. 


[Figure: data paths connecting SRC1, SRC2, and RDEST to the
multiplier unit and the adder unit] 


Figure 2.14. Dual-Operation Data Paths 


Note that the mnemonics pfam.p, pfsm.p, 
pfmam.p, and pfmsm.p are never used as such in 
the assembly language; these mnemonics are used 
here to designate classes of related instructions. 
Each value of DPC has a unique mnemonic associ- 
ated with it. 


2.7 Addressing Modes 


Data access is limited to load and store instructions. Memory
addresses are computed from two fields of load and store
instructions: isrc1 and isrc2. 


1. isrc1 either contains the identifier of a 32-bit integer
   register or contains an immediate 16-bit address offset. 


2. isrc2 always specifies a register. 
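

Address formation from isrc1 and isrc2 can be sketched as follows,
including the autoincrement variant that this section describes for
floating-point loads and stores. The function names and the
register-file representation are hypothetical. Sign extension of
the 16-bit offset is an assumption, inferred from the statement
that an offset alone reaches the first or last 32K of the address
space, and register 0 is assumed to read as zero.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Interpret the low 16 bits of value as a signed offset."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def effective_address(regs, src1, src2, autoincrement=False):
    """Compute isrc1 + isrc2 as a 32-bit address.

    regs is a list of 32 integer register values with regs[0] == 0.
    src1 is ("reg", number) or ("imm", offset16); src2 is a register
    number. With autoincrement, the register named by src2 receives
    the computed sum.
    """
    kind, value = src1
    first = regs[value] if kind == "reg" else sign_extend16(value)
    address = (first + regs[src2]) & MASK32
    if autoincrement:
        regs[src2] = address
    return address


regs = [0] * 32
regs[3] = 0x1000                                  # frame base
field = effective_address(regs, ("imm", 8), 3)    # offset + register
top = effective_address(regs, ("imm", 0xFFFC), 0) # offset alone
```

Here field addresses the word 8 bytes past the frame base, and top
lies in the last 32K of the address space.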
Table 2.8. Types of Traps 


               Indication             Caused by
Type           PSR, EPSR   FSR        Condition                   Instruction

Instruction    IT, OF                 Software traps              trap, intovr
Fault          IT, IL                 Missing unlock              Any

Floating       FT          SE         Floating-point source       Any M- or A-unit except
Point                                 exception                   fmlow
Fault                      AO, MO     Floating-point result       Any M- or A-unit except
                           AU, MU     exception:                  fmlow, pfgt, and pfeq.
                           AI, MI       overflow                  Reported on any F-P
                                        underflow                 instruction plus pst, fst,
                                        inexact result            and sometimes fld, pfld,
                                                                  ixfr

Instruction    IAT                    Address translation
Access Fault                          exception during
                                      instruction fetch

Data Access    DAT*                   Load/store address          Any load/store
Fault                                 translation exception
                                      Misaligned operand          Any load/store
                                      address
                                      Operand address matches     Any load/store
                                      db register

Interrupt      IN                     External interrupt

Reset          No trap bits set       Hardware RESET signal

NOTES: 
*These cases can be distinguished by examining the operand addresses. 
The IL bit of the epsr must be checked by the trap handler to tell if the bus is currently in a locked sequence. 


Because either isrc1 or isrc2 may be null (zero), a variety of
useful addressing modes result: 

offset + register     Useful for accessing fields within a record,
                      where register points to the beginning of the
                      record. Useful for accessing items in a stack
                      frame, where register is r3, the register
                      used for pointing to the beginning of the
                      stack frame. 

register + register   Useful for two-dimensional arrays or for
                      array access within the stack frame. 

register              Useful as the end result of any arbitrary
                      address calculation. 

offset                Absolute address into the first or last 32K
                      of the logical address space. 


In addition, the floating-point load and store instructions may
select autoincrement addressing. In this mode isrc2 is replaced by
the sum of isrc1 and isrc2 after performing the load or store. This
mode makes stepping through arrays more efficient, because it
eliminates one address-calculation instruction. 


2.8 Traps and Interrupts 


Traps are caused by exceptional conditions detected in programs or
by external interrupts. Traps cause interruption of normal program
flow to execute a special program known as a trap handler. Traps
are divided into the types shown in Table 2.8. Interrupts and traps
start execution in single-instruction mode at virtual address
0xFFFFFF00 in supervisor level (U = 0). 


2.8.1 TRAP HANDLER INVOCATION 


This section applies to traps other than reset. When a trap occurs,
execution of the current instruction is aborted. The instruction is
restartable. The processor takes the following steps while
transferring control to the trap handler: 


1. Copies U (user mode) of the psr into PU (previous U). 

2. Copies IM (interrupt mode) into PIM (previous IM). 

3. Sets U to zero (supervisor mode). 

4. Sets IM to zero (interrupts disabled). 

5. If the processor is in dual-instruction mode, it sets DIM;
   otherwise it clears DIM. 

6. If the processor is in single-instruction mode and the next
   instruction will be executed in dual-instruction mode or if the
   processor is in dual-instruction mode and the next instruction
   will be executed in single-instruction mode, DS is set;
   otherwise, it is cleared. 

7. The appropriate trap type bits in psr are set (IT, IN, IAT, DAT,
   FT). Several bits may be set if the corresponding trap
   conditions occur simultaneously. 
8. An address is placed in the fault instruction register (fir) to
   help locate the trapped instruction. In single-instruction mode,
   the address in fir is the address of the trapped instruction
   itself. In dual-instruction mode, the address in fir is that of
   the floating-point half of the dual instruction. If an
   instruction or data access fault occurred, the associated core
   instruction is the high-order half of the dual instruction
   (fir + 4). In dual-instruction mode, when a data access fault
   occurs in the absence of other trap conditions, the
   floating-point half of the dual instruction will already have
   been executed. 


The processor begins executing the trap handler by transferring
execution to virtual address 0xFFFFFF00. The trap handler begins
execution in single-instruction mode. The trap handler must examine
the trap-type bits in psr (IT, IN, IAT, DAT, FT) to determine the
cause or causes of the trap. 
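

The state changes made on trap entry can be summarized in a short
model. This is a sketch for illustration only. The Psr class keeps
each field as a named attribute rather than a bit position, the
function name and arguments are hypothetical, and fir is returned
as a value instead of being written to a register.

```python
from dataclasses import dataclass, field

TRAP_VECTOR = 0xFFFFFF00  # virtual address of the trap handler


@dataclass
class Psr:
    U: bool = False
    PU: bool = False
    IM: bool = False
    PIM: bool = False
    DIM: bool = False
    DS: bool = False
    trap_bits: set = field(default_factory=set)  # IT, IN, IAT, DAT, FT


def enter_trap(psr, causes, in_dual_mode, next_in_dual_mode,
               trapped_address):
    """Apply the trap-entry steps and return (fir, next_pc).

    trapped_address is the address of the trapped instruction, or
    of the floating-point half of the pair in dual-instruction mode.
    """
    psr.PU = psr.U                       # save user mode
    psr.PIM = psr.IM                     # save interrupt mode
    psr.U = False                        # supervisor mode
    psr.IM = False                       # interrupts disabled
    psr.DIM = in_dual_mode               # dual-instruction mode flag
    psr.DS = in_dual_mode != next_in_dual_mode  # mode switch pending
    psr.trap_bits |= set(causes)         # several causes may be set
    fir = trapped_address
    return fir, TRAP_VECTOR


psr = Psr(U=True, IM=True)
fir, pc = enter_trap(psr, {"DAT"}, in_dual_mode=False,
                     next_in_dual_mode=True, trapped_address=0x4000)
```

A handler built on this model would inspect psr.trap_bits first,
since more than one cause can be reported by a single trap.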


2.8.2 INSTRUCTION FAULT 


This fault is caused by any of the following condi- 
tions. In all cases the processor sets the IT bit be- 
fore entering the trap handler. 


1. By the trap instruction. When trap is executed in
   dual-instruction mode, the floating-point companion of the trap
   instruction is not executed before the trap is taken. 


2. By the intovr instruction. The trap occurs only if OF in epsr is
   set when intovr is executed. The trap handler should clear OF
   before returning. When intovr causes a trap in dual-instruction
   mode, the floating-point companion of the intovr instruction is
   completely executed before the trap is taken. 


3. By violation of lock/unlock protocol, explained below. (Note
   that trap and intovr should not be used within a locked
   sequence; otherwise, it would be difficult to distinguish
   between this and the prior cases.) 


The lock protocol requires the following sequence of activities: 


1. lock 

2. Any load or store instruction that misses the cache 

3. unlock 

4. Any load or store instruction (regardless of whether it misses
   the cache) 


There may be other instructions between any of these steps. The bus
is locked after step 2, and remains locked until step 4. Step 4
must follow step 1 by 30 instructions or less, otherwise the
instruction trap occurs. In case of a trap, IL is also set. If the
load or store instruction in step 2 hits the cache, the sequence is
legal, but the bus is not locked. 
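

A code generator or a review tool could check instruction sequences
against the 30-instruction limit. The following sketch is one
possible check under stated assumptions: instructions are reduced
to the hypothetical categories "lock", "unlock", "mem" (any load or
store), and "other"; the limit is read as the load or store of
step 4 lying at most 30 instructions after the lock; and cache hits
and misses are not modeled.

```python
def violates_lock_protocol(kinds, limit=30):
    """Return True if a locked sequence runs past the limit.

    kinds is a list of instruction categories in execution order.
    """
    start = None       # index of the lock that opened the sequence
    unlocked = False   # True once unlock has been seen
    for index, kind in enumerate(kinds):
        if start is None:
            if kind == "lock":
                start, unlocked = index, False
        elif index - start > limit:
            return True            # step 4 did not arrive in time
        elif kind == "unlock":
            unlocked = True
        elif kind == "mem" and unlocked:
            start = None           # step 4 reached; sequence is over
    return False


legal = ["lock", "mem", "other", "unlock", "mem"]
too_long = ["lock", "mem"] + ["other"] * 30 + ["unlock", "mem"]
```

With these inputs the first sequence passes the check and the
second is reported as a violation.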
2.8.3 FLOATING-POINT FAULT 


The floating-point fault is reported on floating-point
instructions, pst, fst, and sometimes fld, pfld, ixfr. The
floating-point faults of the i860 microprocessor support the
floating-point exceptions defined by the IEEE standard as well as
some other useful classes of exceptions. The i860 microprocessor
divides these into two classes: source exceptions and result
exceptions. The numerics library supplied by Intel provides the
IEEE standard default handling for all these exceptions. 


2.8.3.1 Source Exception Faults 


When used as inputs to the multiplier or adder, all exceptional
operands, including infinities, denormalized numbers and NaNs,
cause a floating-point fault and set SE in the fsr. Source
exceptions are reported on the instruction that initiates the
operation. For pipelined operations, the pipeline is not advanced. 


The SE value is undefined for faults on fld, pfld, fst, pst, and
ixfr instructions when in single-instruction mode or when in
dual-instruction mode and the companion instruction is not a
multiplier or adder operation. 
2.8.3.2 Result Exception Faults 


The class of result exceptions includes any of the following
conditions: 


• Overflow. The absolute value of the rounded true result would
  exceed the largest positive finite number in the destination
  format. 


• Underflow (when FZ is clear). The absolute value of the rounded
  true result would be smaller than the smallest positive finite
  number in the destination format. 


• Inexact result (when TI is set). The result is not exactly
  representable in the destination format. For example, the
  fraction 1/3 cannot be precisely represented in binary form. This
  exception occurs frequently and indicates that some (generally
  acceptable) accuracy has been lost. 
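

Whether a quotient is exactly representable can be checked on a
host system. The sketch below is a host-side illustration, not i860
code. It uses the host's IEEE single-precision format through the
standard struct module, and the function names are hypothetical.

```python
import struct
from fractions import Fraction


def to_single(x):
    """Round a Python float to IEEE single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def is_inexact_single(numerator, denominator):
    """True if numerator/denominator is not exact in single precision."""
    exact = Fraction(numerator, denominator)
    rounded = Fraction(to_single(numerator / denominator))
    return rounded != exact


third_is_inexact = is_inexact_single(1, 3)   # 1/3 has no finite binary form
half_is_inexact = is_inexact_single(1, 2)    # 1/2 is a power of two
```

Any fraction whose reduced denominator is not a power of two gives
an inexact result in a binary format.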


The point at which a result exception is Seneca de- 
pends upon whether pipelined operations are being 
used: 


® Scalar (nonpipelined) operations. Result ex- 
ceptions are reported on the next floating-point, 
fst.x, or pst.x (and sometimes fid, pfld, ixfr) in- 
struction after the scalar operation. When a trap 
occurs, the last stage of the affected unit con- 
tains the result of the scalar operation. 


Pipelined operations. Result exceptions are re- 
ported when the result is in the last stage and the 
next floating-point, fst.x or pst.x (and sometimes 
fid, pfld, ixfr) instruction is executed. When a 
trap occurs, the pipeline is not advanced, and the 
last stage results (that caused the trap) remain 
unchanged. | 
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When no trap occurs (either because FTE is clear or 
because no exception occurred), the pipeline is ad- 
vanced normally by the new floating-point operation. 


The result-status bits of the affected unit are
undefined until the point that result exceptions are
reported. At this point, the last stage result-status bits
(bits 29..22 and 16..9 of the fsr) reflect the values in
the last stages of both the adder and multiplier. For
example, if the last stage result in the multiplier has
overflowed and a pipelined floating-point pfadd is
started, a trap occurs and MO is set.


For scalar operations, the RR bits of fsr specify the 
register in which the result was stored. RR is updat- 
ed when the scalar instruction is initiated. The trap, 
however, occurs on a subsequent instruction. Pro- 
grammers must prevent intervening stores to fsr 
from modifying the RR bits. Prevention may take one 
of the following forms: 


• Before any store to fsr when a result exception
  may be pending, execute a dummy floating-point
  operation to trigger the result-exception trap.


• Always read from fsr before storing to it, and
  mask updates so that the RR bits are not
  changed.
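

The second form of prevention, masking updates so that RR survives, can be sketched as below. This is only an illustration in Python of the bit manipulation a handler or library routine might perform. The position of RR (bits 21..17, the field lying between the two result-status ranges quoted above) is an assumption made for this sketch, and the function name is hypothetical.

```python
# Assumption for this sketch: RR occupies bits 21..17 of fsr, i.e. the
# field between the result-status ranges 29..22 and 16..9.
RR_SHIFT = 17
RR_MASK = 0x1F << RR_SHIFT


def merge_fsr_preserving_rr(current_fsr, desired_fsr):
    """Return the value to store to fsr, keeping the RR bits of current_fsr.

    current_fsr is the value just read from fsr; desired_fsr carries the
    new settings for every other field.
    """
    current_fsr &= 0xFFFFFFFF
    desired_fsr &= 0xFFFFFFFF
    # Take every bit except RR from the desired value, RR from the old one.
    return (desired_fsr & ~RR_MASK & 0xFFFFFFFF) | (current_fsr & RR_MASK)
```

The read-then-merge order matters: the value passed as current_fsr must be read after the last scalar operation was initiated, otherwise a stale RR would be written back.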


For pipelined operations, RR is cleared and the re- 
sult is in the last stage of the pipeline of the appro- 
priate unit. The trap handler must flush the pipeline, 
saving the results and the status bits. 


In either pipelined or scalar mode, the trap handler
must then compute the trapping result. In either
case, the result has the same fraction as the true
result and an exponent consisting of the low-order
bits of the true exponent. The trap handler can inspect
the result, compute the result appropriate for that
instruction (a NaN or an infinity, for example), and
store the correct result. If RR is nonzero, the result
is stored in the register specified by RR; if RR = 0,
the trap handler must reload the pipeline with the
saved results and status bits.


Result exceptions may be reported for both the ad- 
der and multiplier units at the same time. In this 
case, the trap handler should fix up the last stage of 
both pipelines. 


2.8.4 INSTRUCTION ACCESS FAULT 


This trap occurs during address translation for in- 
struction fetches in any of these cases: 


• The address fetched is in a page whose P (present)
  bit in the page table is clear (not present).


• The address fetched is in a supervisor mode
  page, but the processor is in user mode.


• The address fetched is in a page whose PTE has
  A = 0, and the access occurs during a locked
  sequence (i.e., between lock and unlock).


Note that several instructions are fetched at one
time, either due to instruction prefetching or to in- 
struction caching. Therefore, a trap handler can 
change from supervisor to user mode and continue 
to execute instructions fetched from a supervisor 
page. An instruction access trap occurs only when 
the next group of instructions is fetched from a su- 
pervisor page (up to eight instructions later). If, in the 
meantime, the handler branches to a user page, no 
instruction access trap occurs. No protection viola- 
tion results, because the processor does not permit 
data accesses to supervisor pages while running in 
user mode. 


2.8.5 DATA ACCESS FAULT 


This trap results from an abnormal condition detected
during data operand fetch or store. Such an
exception can be due only to one of the following
causes:


• An attempt is being made to write to a page
  whose D (Dirty) bit is clear.


• A memory operand is misaligned (is not located
  at an address that is a multiple of the length of
  the data).


• The address stored in the db register is equal to
  one of the addresses spanned by the operand.


• The operand is in a not-present page.


• An attempt is being made from user level to write
  to a read-only page or to access a
  supervisor-level page.


• The operand was in a page whose PTE had A = 0,
  and the access occurred during a locked
  sequence (i.e., between lock and unlock).


• Write protection (determined by epsr bit WP = 1)
  is violated in supervisor mode.
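

Two of the causes listed above, misalignment and a match against the db register, are plain address arithmetic and can be sketched as follows. This is an illustrative model only; the function names are hypothetical, and the BR and BW enable bits of the psr (which gate the breakpoint check) are deliberately left out.

```python
def is_misaligned(address, length):
    """True if the operand does not start at a multiple of its length."""
    return address % length != 0


def spans_breakpoint(db, address, length):
    """True if db equals one of the addresses spanned by the operand."""
    return address <= db < address + length
```

For example, an eight-byte operand at 0x1004 is misaligned, and a four-byte operand at 0x1000 spans a db value of 0x1002 but not 0x1004.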


2.8.6 INTERRUPT TRAP 


An interrupt is an event that is signaled from an
external source. If the processor is executing with
interrupts enabled (IM set in the psr), the processor
sets the interrupt bit IN in the psr, and generates an
interrupt trap. Vectored interrupts are implemented
by interrupt controllers and software.


2.8.7 RESET TRAP 


When the i860 microprocessor is reset, execution
begins in single-instruction mode at physical
address 0xFFFFFF00. This is the same address as for
other traps. The reset trap can be distinguished from
other traps by the fact that no trap bits are set. The
instruction cache is flushed. The bits DPS, BL, and
ATE in dirbase are cleared. CS8 is initialized by the
value at the INT pin at the end of reset. The
read-only fields of the epsr are set to identify the
processor, while the IL, WP, and PBM bits are cleared. The
bits U, IM, BR, and BW in psr are cleared, as are the
trap bits FT, DAT, IAT, IN, and IT. All other bits of
psr and all other register contents are undefined.


Refer to Table 2.9 for a summary of these initial set- 
tings. 


Table 2.9. Register and Cache Values after Reset

Registers                  Initial Value

Integer Registers          Undefined
Floating-Point Registers   Undefined
psr                        U, IM, BR, BW, FT, DAT, IAT, IN,
                           IT = 0; others are undefined
epsr                       IL, WP, PBM, BE = 0;
                           Processor Type, Stepping
                           Number, DCS are read
                           only; others are undefined
db                         Undefined
dirbase                    DPS, BL, ATE = 0; others
                           are undefined
fir                        Undefined
fsr                        Undefined
KR, KI, T, MERGE           Undefined

Caches                     Initial Value

Data Cache                 Undefined
Instruction Cache          Flushed


The software must ensure that the data cache is 
flushed and control registers are properly initialized 
before performing operations that depend on the 
values of the cache or registers. The data cache has 
no “validity” bits, so memory accesses before the 
flush may result in false data cache hits. 


Reset code must initialize the floating-point pipeline 
state to zero with floating-point traps disabled to en- 
sure that no spurious floating-point traps are gener- 
ated. 


After a RESET the i860 microprocessor starts exe- 
cution at supervisor level (U=0). Before branching 
to the first user-level instruction, the RESET trap 
handler or subsequent initialization code has to set 
PU and a trap bit so that an indirect branch instruc- 
tion will copy PU to U, thereby changing to user level. 


2.9 Debugging 


The i860 microprocessor supports debugging with
both data and instruction breakpoints. The features
of the i860 architecture that support debugging
include:


• db (data breakpoint register), which permits
  specification of a data address that the i860
  microprocessor will monitor.


• BR (break read) and BW (break write) bits of the
  psr, which enable trapping of either reads or
  writes (respectively) to the address in db.


• DAT (data access trap) bit of the psr, which
  allows the trap handler to determine when a data
  breakpoint was the cause of the trap.


• trap instruction that can be used to set
  breakpoints in code. Any number of code breakpoints
  can be set. The values of the isrc1 and isrc2
  fields help identify which breakpoint has
  occurred.


• IT (instruction trap) bit of the psr, which allows
  the trap handler to determine when a trap
  instruction was the cause of the trap.


3.0 HARDWARE INTERFACE 


In the following description of hardware interface,
the # symbol at the end of a signal name indicates
that the active or asserted state occurs when the
signal is at a low voltage. When no # is present after
the signal name, the signal is asserted when at the
high voltage level.


3.1 Signal Description 


Table 3.1 identifies functional groupings of the pins, 
lists every pin by its identifier, gives a brief descrip- 
tion of its function, and lists some of its characteris- 
tics. All output pins are tristate, except HLDA and 
BREQ. All inputs are synchronous, except HOLD 
and INT. 


3.1.1 CLOCK (CLK) 


The CLK input determines execution rate and timing
of the i860 microprocessor. Timing of other signals
is specified relative to the rising edge of this signal.
The i860 microprocessor can utilize a clock rate of
33.3 MHz or 40 MHz. The internal operating
frequency is the same as the external clock.


3.1.2 SYSTEM RESET (RESET) 


Asserting RESET for at least 16 CLK periods causes
initialization of the i860 microprocessor. Refer to
section 3.2 “Initialization” for more details related to
RESET.


3.1.3 BUS HOLD (HOLD) AND BUS HOLD
ACKNOWLEDGE (HLDA)


These pins are used for i860 microprocessor bus
arbitration. At some clock after the HOLD signal is
asserted, the i860 microprocessor releases control
of the local bus and puts all bus interface outputs
(except BREQ and HLDA) into a floating state, then
asserts HLDA, all during the same clock period. It
maintains this state until HOLD is deasserted.
Instruction execution stops only if required instructions
or data cannot be read from the on-chip instruction
and data caches.


The time required to acknowledge a hold request is
one clock plus the number of clocks needed to finish
any outstanding bus cycles. HOLD is recognized
even while RESET or LOCK# are asserted.


When leaving a bus hold, the i860 microprocessor
deactivates HLDA and, in the same clock period,
initiates a pending bus cycle, if any.


HOLD is an asynchronous input.


Table 3.1. Pin Summary

Pin Name    Function                    Active     Input/
                                        State      Output

Execution Control Pins

CLK         CLocK                                  I
RESET       System reset                High       I
HOLD        Bus hold                    High       I
HLDA        Bus hold acknowledge        High       O
BREQ        Bus request                 High       O
INT/CS8     Interrupt, code-size        High       I

Bus Interface Pins

A31-A3      Address bus                 High       O
BE7#-BE0#   Byte Enables                Low        O
D63-D0      Data bus                    High       I/O
LOCK#       Bus lock                    Low        O
W/R#        Write/Read bus cycle        High/Low   O
NENE#       NExt NEar                   Low        O
NA#         Next Address request        Low        I
READY#      Transfer Acknowledge        Low        I
ADS#        ADdress Status              Low        O

Cache Interface Pins

KEN#        Cache ENable                Low        I
PTB         Page Table Bit              High       O

Testability Pins

SHI         Boundary Scan Shift Input   High       I
BSCN        Boundary Scan Enable        High       I
SCAN        Shift Scan Path             High       I

Power and Ground Pins

Vcc         System power
Vss         System ground

A # after a pin name indicates that the signal is active when at the low voltage level.


3.1.4 BUS REQUEST (BREQ)


This signal is asserted when the i860 microprocessor
has a pending memory request, even when
HLDA is asserted. This allows an external bus
arbiter to implement an “on demand only” policy for
granting the bus to the i860 microprocessor. BREQ
is asserted the clock after the i860 microprocessor
realizes an internal request for the bus. In normal
operation, BREQ goes low the clock after ADS#
goes low for the final pending bus cycle. (Refer to
Figure 4.10 for timing information.) During data or
instruction cache fills, however, BREQ may be
deasserted for one or more clocks, due to cache and TLB
logic.


3.1.5 INTERRUPT/CODE-SIZE (INT/CS8) 


This input allows interruption of the current
instruction stream. If interrupts are enabled (IM set in psr)
when INT is asserted, the i860 microprocessor
fetches the next instruction from address
0xFFFFFF00. To assure that an interrupt is
recognized, INT should remain asserted until the software
acknowledges the interrupt (by writing, for example,
to a memory-mapped port of an interrupt controller).
When the bus is not locked, the maximum time
between the assertion of INT and the execution of the
first instruction of the trap handler is ten clocks, plus
the time for four sets of four pipelined read cycles
and two sets of four pipelined writes (instruction-
and data-cache misses and write-back cycles to
update memory), plus the time for twenty nonpipelined
read cycles (six TLB misses, with eight refetches
when the A-bit is zero), plus the time for eight
nonpipelined writes (updates to the A-bit).
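

The worst-case latency just described is a sum of fixed terms, so it can be evaluated once the memory system's cycle times are known. The sketch below simply adds up the terms named in the text. The four timing parameters are hypothetical inputs that depend on the memory design; they are not values given by this data sheet.

```python
def max_interrupt_latency_clocks(pipelined_read_set,
                                 pipelined_write_set,
                                 nonpipelined_read,
                                 nonpipelined_write):
    """Upper bound, in clocks, from INT assertion to the first handler
    instruction when the bus is not locked.

    pipelined_read_set  -- clocks for one set of four pipelined reads
    pipelined_write_set -- clocks for one set of four pipelined writes
    nonpipelined_read   -- clocks for one nonpipelined read cycle
    nonpipelined_write  -- clocks for one nonpipelined write cycle
    """
    return (10                            # fixed overhead
            + 4 * pipelined_read_set      # four sets of four pipelined reads
            + 2 * pipelined_write_set     # two sets of four pipelined writes
            + 20 * nonpipelined_read      # TLB misses and refetches
            + 8 * nonpipelined_write)     # updates to the A-bit
```

With the purely illustrative figures of 8 clocks per pipelined set and 2 clocks per nonpipelined cycle, the bound is 10 + 32 + 16 + 40 + 16 = 114 clocks.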


If the bus is locked from a lock instruction, the INT
pin is ignored and the INT bit of epsr is always zero.
The lock instruction can only assert LOCK# for
30-33 instructions before trapping.


If INT is asserted during the clock before the falling
edge of RESET, the eight-bit code-size mode is
selected. For more about this mode, refer to section
3.2 “Initialization”.


INT is an asynchronous input.


3.1.6 ADDRESS PINS (A31-A3) AND BYTE
ENABLES (BE7#-BE0#)


The 29-bit address bus (A31-A3) identifies addresses
to a 64-bit location. Separate byte-enable signals
(BE7#-BE0#) identify which bytes should be
accessed within the 64-bit location. In all noncacheable
read cycles (KEN# deasserted), the byte
enables match the length and address of the
requested data. Cacheable read cycles (KEN# asserted),
however, result in four 64-bit memory cycles to
fill an entire 32-byte cache line. The BEn# pins
activated are those that represent the operand of the
load instruction that caused the line fill, and these
same BEn# pins remain activated for all four cycles
of the line fill. All 64 bits must be returned for each
cycle without regard for the BEn# signals. In all
write cycles (noncacheable writes as well as cache
line write-backs) the BEn# signals indicate the
bytes that must be written.


Instruction fetches (W/R# is low) are distinguished
from data accesses by the unique combinations of
BE7#-BE0# defined in Table 3.2. For an eight-bit
code fetch in eight-bit code-size (CS8) mode,
BE2#-BE0# are redefined to be A2-A0 of the
address. In this case BE7#-BE3# form the code
shown in Table 3.2 that identifies an instruction
fetch. The A2 in the table does not represent a
physical pin, just a conceptual internal address line value.
The “x” under A2 for CS8 mode means “not
applicable”, or “don’t care”. All other combinations of byte
enables indicate data accesses.


The address and byte-enable pins are driven until
either NA# or READY# is asserted.


3.1.7 DATA PINS (D63-D0)


The bus interface has 64 bidirectional data pins
(D63-D0) to transfer data in eight- to 64-bit
quantities. Pins D7-D0 transfer the least significant byte;
pins D63-D56 transfer the most significant byte.


In read bus cycles, all 64 bits of the data bus are
latched, even in CS8-mode instruction fetches when
only the low-order eight bits are used.


In write bus cycles, the point at which data is driven
onto the bus depends on the type of the preceding
cycle. If there was no preceding cycle (i.e. the bus
was idle), data is driven with the address. If the
preceding cycle was a write, data is driven as soon as
READY# is returned from the previous cycle. If the
preceding cycle was a read, data is driven one clock
after READY# is returned from the previous cycle,
thereby allowing time for the bus to be turned
around. Data continues to be driven until READY#
for the current cycle is returned.
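

For noncacheable data accesses, section 3.1.6 says the byte enables match the length and address of the requested data. A minimal sketch of that mapping follows. It assumes that BEn# corresponds to byte n of the 64-bit location, with byte 0 on D7-D0; that pin-to-byte correspondence and the function name are assumptions of this sketch. It does not model the special instruction-fetch combinations of Table 3.2.

```python
def data_access_pins(address, length):
    """Return (a31_a3, be_n) for an aligned data access.

    a31_a3 -- the 29-bit value driven on A31-A3
    be_n   -- 8-bit value for BE7#-BE0#; a 0 bit means that byte
              enable is asserted (the signals are active low)
    """
    offset = address & 0x7                  # byte offset in the 64-bit location
    if length not in (1, 2, 4, 8) or offset % length != 0:
        raise ValueError("operand must be 1, 2, 4 or 8 bytes and aligned")
    selected = ((1 << length) - 1) << offset    # 1 bit = byte accessed
    be_n = ~selected & 0xFF                     # invert for active-low pins
    return (address & 0xFFFFFFFF) >> 3, be_n
```

For example, a four-byte access at 0x1004 selects bytes 4 to 7, so BE7#-BE4# are asserted (low) and BE3#-BE0# stay high, giving a pin value of 0x0F.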
3.1.8 BUS LOCK (LOCK#)


This signal is used to provide atomic (indivisible)
read-modify-write sequences in multiprocessor
systems. A multiprocessor bus arbiter must permit only
one processor a locked access to the address which
is on the bus when LOCK# first activates. The
system must maintain the lock of that location until
LOCK# deactivates.


The i860 microprocessor coordinates the external
LOCK# signal with the software-controlled BL bit of
the dirbase register. Programmers do not have to
be concerned about the fact that bus activity is not
always synchronous with instruction execution.
LOCK# is asserted with ADS# for the address
operand of the first load or store instruction executed
after the BL bit is set by the lock instruction.
Pending bus cycles are locked according to the value of
the BL bit when the instruction was executed. Even
if the BL bit is changed between the time that an
instruction generates an internal bus request and
the time that the cycle appears on the bus, the i860
microprocessor still asserts LOCK# for that bus
cycle.


If ADS# is active when LOCK# deactivates, then
that request should complete before the hardware
relinquishes the lock. If ADS# is not active, the
locking of the location can immediately end when
LOCK# deactivates. Of course the simplest
arbitration hardware can just lock the entire bus against all
other accesses during LOCK# assertion.
Table 3.2. Identifying Instruction Fetches

[Table body not recoverable from the source. Rows: Normal
(non-CS8) mode with A2 = 0 and with A2 = 1, and CS8 mode,
in which BE2#-BE0# carry the low-order address bits.]


When the BL bit is deasserted with the unlock
instruction, LOCK# is deasserted with the next load
or store but after any pending bus cycles. Between
locked sequences, at least one cycle of no LOCK#
is guaranteed by the behavior of the unlock
instruction. LOCK# deassertion may occur independently
of ADS# for the case of a trap or a cache hit after
unlock.


The i860 microprocessor also asserts LOCK# during
TLB miss processing for updates of the
accessed bit in page-table entries. The maximum time
that LOCK# can be asserted in this case is five
clocks plus the time required to perform a
read-modify-write sequence. Instruction fetches do not alter
the LOCK# pin.


Between lock and unlock instructions, the INT pin is
ignored and the INT bit of epsr is zero when read by
ld.c epsr. The time that interrupts are disabled is
limited by the lock protocol outlined in Section 2.8.2.


3.1.9 WRITE/READ BUS CYCLE (W/R#)


This pin specifies whether a bus cycle is a read
(LOW) or write (HIGH) cycle. It is driven until either
NA# or READY# is asserted.


3.1.10 NEXT NEAR (NENE#)


This signal allows higher-speed reads and writes in
the case of consecutive reads and writes that
access static column or page-mode DRAMs. The i860
microprocessor asserts NENE# when the current
address is in the same DRAM page as the previous
bus cycle. The i860 microprocessor determines the
DRAM page size from the DPS field of the
dirbase register. The page size can range from 2^9 to
2^16 64-bit words, supporting DRAM sizes from 256K
x 1, 256K x 4, and up. NENE# is never asserted
on the next bus cycle after HLDA is deasserted.


3.1.11 NEXT ADDRESS REQUEST (NA#)


NA# makes address pipelining possible. The
system asserts NA# for at least one clock to indicate
that it is ready to accept the next address from the
i860 microprocessor. NA# may be asserted before
the current cycle ends. (If the system does not
implement pipelining, NA# does not have to be
activated.) The i860 microprocessor samples NA#
every clock, starting one clock after the prior activation
of ADS#. When NA# is active, the i860
microprocessor is free to drive address and bus-cycle
definition for the next pending bus cycle. The i860
microprocessor remembers that NA# was asserted when
no internal request is pending; therefore, NA# can
be deactivated after the next rising edge of the CLK
signal. Up to three bus cycles can be outstanding
simultaneously.


3.1.12 TRANSFER ACKNOWLEDGE (READY#)


The system must assert the READY# signal during
read cycles when valid data is on the data pins and
during write cycles when the system has accepted
data from the data pins. READY# must be asserted
for at least one clock. Sampling of READY# begins
in the clock after an ADS# or in the second clock
after a prior READY#.


3.1.13 ADDRESS STATUS (ADS#)


The i860 microprocessor asserts ADS# during the
first clock of each bus cycle to identify the clock
period during which it begins to assert outputs on
the address bus. This signal is held active for one
clock.


3.1.14 CACHE ENABLE (KEN#)


The i860 microprocessor samples KEN# to
determine whether the data being read for the current
cache-miss cycle is to be cached. This pin is
internally NORed with the CD and WT bits to control
cacheability on a page by page basis (refer to Table
3.3).


If the address is one that is permitted to be in the
cache, KEN# must be continuously asserted during
the sampling period starting from the second rising
clock edge after ADS# is asserted, through the
clock NA# or READY# is asserted. The entire 64
bits of the data bus will be used for the read,
regardless of the state of the byte-enable pins. Three
additional 64-bit bus cycles will be generated to fill the
rest of the 32-byte cache block.


If KEN# is found deasserted at any clock from the
clock after ADS# through the clock of the first NA#
or READY#, the data being read will not be cached
and two scenarios can occur: 1) if the cycle is due to
data-cache miss, no subsequent cache-fill cycles
will be generated; 2) if the cycle is due to an
instruction-cache miss, additional cycle(s) will be
generated until the address reaches a 32-byte boundary. To
avoid caching a line, external hardware must
deassert KEN# during or before the first NA# or
READY#.


3.1.15 PAGE TABLE BIT (PTB) 


Depending on the setting of the PBM (page-table bit
mode) bit of the epsr, the PTB reflects the value of
either the CD (cache disable) bit or the WT (write
through) bit of the page-table entry used for the
current cycle. When paging is disabled, PTB remains
inactive.


Table 3.3. Cacheability based on
KEN# and CD OR WT

CD OR WT   KEN#   Meaning

0          0      Cacheable access
0          1      Noncacheable access
1          0      Noncacheable page
1          1      Noncacheable page
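

The cacheability decision described in section 3.1.14 (KEN# internally NORed with the CD and WT bits) reduces to a single Boolean expression. The sketch below models it with pin and bit levels as integers; the function name is hypothetical.

```python
def is_cacheable(ken_n, cd, wt):
    """True only when KEN# is low (asserted) and both CD and WT are 0.

    ken_n -- electrical level of the KEN# pin (0 = asserted)
    cd    -- cache-disable bit of the page-table entry
    wt    -- write-through bit of the page-table entry
    """
    return not (ken_n or cd or wt)      # NOR of the three inputs
```

Because the three inputs are NORed, any one of them at 1 is enough to make the access noncacheable.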


3.1.16 BOUNDARY SCAN SHIFT INPUT (SHI) 


This pin is used with the testability features. Refer to
section 3.3.


3.1.17 BOUNDARY SCAN ENABLE (BSCN)

This pin is used with the testability features. Refer to
section 3.3.

3.1.18 SHIFT SCAN PATH (SCAN) 

This pin is used with the testability features. Refer to 
section 3.3. 

3.1.19 CONFIGURATION (CC1-CC0) 

These two pins are reserved by Intel. Strap both pins 
LOW. 


3.1.20 SYSTEM POWER (Vcc) AND GROUND 
(Vss) 


The i860 microprocessor has 48 pins for power and
ground. All pins must be connected to the
appropriate low-inductance power and ground signals in the
system.


3.2 Initialization 


Initialization of the i860 microprocessor is caused by
assertion of the RESET signal for at least 16 clocks.
Table 3.4 shows the status of output pins during the
time that RESET is asserted. Note that HOLD
requests are honored during RESET and that the
status of output pins depends on whether a HOLD
request is being acknowledged.
Table 3.4. Output Pin Status during Reset

                                    Pin Value
Pin Name                   HOLD Not         HOLD
                           Acknowledged     Acknowledged

W/R#, PTB                  LOW              Tri-State OFF
BREQ                       LOW              LOW
HLDA                       LOW              HIGH
A31-A3, BE7#-BE0#,         Undefined        Tri-State OFF
NENE#


After a reset, the i860 microprocessor begins
executing at physical address 0xFFFFFF00. The
program-visible state of the i860 microprocessor after
reset is detailed in section 2.8.7.


Eight-bit code-size mode is selected when INT/CS8
is asserted during the clock before the falling edge
of RESET. While in eight-bit code-size mode,
instruction cache misses are byte reads (transferred
on D7-D0 of the data bus) instead of eight-byte
reads. This allows the i860 microprocessor to be
bootstrapped from an eight-bit EPROM. For these
code reads, byte enables BE2#-BE0# are
redefined to be the low order three bits of the address,
so that a complete byte address is available. These
reads update the instruction cache if KEN# is
asserted (refer to section 3.1.14) and are not pipelined
even if NA# is asserted. While in this mode,
instructions must reside in an eight-bit wide memory, while
data must reside in a separate 64-bit wide memory.
After the code has been loaded into 64-bit memory,
initialization code can initiate 64-bit code fetches by
clearing the CS8 bit of the dirbase register (refer to
section 2). Once eight-bit code-size mode is
disabled by software, it cannot be reenabled except by
resetting the i860 microprocessor.


3.3 Testability 


The i860 microprocessor has a boundary scan mode 
that may be used in component- or board-level test- 
ing to test the signal traces leading to and from the 
i860 microprocessor. Boundary scan mode provides 
a simple serial interface that makes it possible to 
test all signal traces with only a few probes. Probes 
need be connected only to CLK, BSCN, SCAN, SHI, 
BREQ, RESET, and HOLD. 


The pins BSCN and SCAN control the boundary
scan mode (refer to Table 3.5). When BSCN is
asserted, the i860 microprocessor enters boundary
scan mode on the next rising clock edge. Boundary
scan mode can be activated even while RESET is
active. When BSCN is deasserted while in boundary
scan mode, the i860 microprocessor leaves
boundary scan mode on the next rising clock edge. After
leaving boundary scan mode, the internal state is
undefined; therefore, RESET should be asserted.


Table 3.5. Test Mode Selection

BSCN   SCAN   Testability Mode

LO     LO     No testability mode selected
LO     HI     (Reserved for Intel)
HI     LO     Boundary scan mode, normal
HI     HI     Boundary scan mode, shift
              (SHI as input; BREQ as output)


For testing purposes, each signal pin has associated 
with it an internal latch. Table 3.6 identifies these
latches by name and classifies them as input, out- 
put, or control. The input and output latches carry 
the name of the corresponding pins. 


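Table 3.6. Test Mode Latches

Input         Output        Associated
Latches       Latches       Control Latch

SHI
BSCN
SCAN
RESET
D0-D63        D0-D63        DATAt
CC1-CC0
              A31-A3        ADDRt
              NENE#         NENEt
              PTB           PTBt
              W/R#          W/Rt
              ADS#          ADSt
              HLDA
              LOCK#         LOCKt
READY#
KEN#
NA#
INT/CS8
HOLD
              BE7#-BE0#     BEt
              BREQ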


Within boundary scan mode the i860 microprocessor
operates in one of two submodes: normal mode
or shift mode, depending on the value of the SCAN
input. A typical test sequence is:


1. Enter shift mode to assign values to the latches
   that correspond with the pins.


2. Enter normal mode. In normal mode the i860
   microprocessor transfers the latched values to the
   output pins and latches the values that are being
   driven onto the input pins.


3. Reenter shift mode to read the new values of the
   input pins.
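

The shift steps of this sequence can be pictured with a toy model of a scan chain as a shift register: one bit enters from SHI and one bit leaves on BREQ at each rising edge of CLK. The class below is only a behavioral sketch with a hypothetical name and a short, arbitrary chain length; it does not reproduce the actual chain order or the normal-mode transfer between latches and pins.

```python
class ScanChainModel:
    """Toy shift-register model of a boundary scan chain."""

    def __init__(self, length):
        # latches[0] is the latch nearest SHI; latches[-1] drives BREQ.
        self.latches = [0] * length

    def clock(self, shi_bit):
        """One rising CLK edge in shift mode; returns the bit seen on BREQ."""
        out = self.latches[-1]
        self.latches = [shi_bit & 1] + self.latches[:-1]
        return out

    def shift_in(self, values):
        """Load the chain so that latches[i] == values[i] afterwards."""
        for bit in reversed(values):
            self.clock(bit)

    def shift_out(self):
        """Read the whole chain; the bit nearest BREQ emerges first."""
        return [self.clock(0) for _ in range(len(self.latches))]
```

Loading a four-latch model with [1, 0, 1, 1] and then shifting it out yields [1, 1, 0, 1], because the latch nearest BREQ is read first.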


3.3.1 NORMAL MODE 


When SCAN is deasserted, the normal mode is se- 
lected. For each input pin (RESET, HOLD, 
INT/CS8, NA#, READY#, KEN#, SHI, BSCN, 
SCAN, CC1, and CCO), the corresponding latch is 
loaded with the value that is being driven onto the 
pin. 


The tristate output pins (A31-A3, BE7#-BE0#,
W/R#, NENE#, ADS#, LOCK#, and PTB) are
enabled by the control latches ADDRt (for A31-A3),
BEt, W/Rt, NENEt, ADSt, LOCKt, and PTBt. If a
control latch is set, the corresponding output latches
drive their output pins; otherwise the pins are not
driven.


The I/O pins (D63-—D0) are enabled by the control 
latch DATAt, which is similar to the other control 
latches. In addition, when DATAt is not set, the data 
pins are treated as input pins and their values are 
latched. 


3.3.2 SHIFT MODE 


When SCAN is asserted, the shift mode is selected.
In shift mode, the pins are organized into a boundary
scan chain. The scan chain is configured as a shift
register that is shifted on the rising edge of CLK. The
SHI pin is connected to the input of one end of the
boundary scan chain. The value of the most
significant bit of the scan chain is output on the BREQ pin.
To avoid glitches while the values are being shifted
along the chain, the tester should assert both the
RESET and HOLD pins. Then all tristate outputs are
disabled. The order of the pins within the chain is
shown in Figure 3.1.


SHI -> BSCN -> SCAN -> RESET -> DATAt -> D0 ... D63 ->
CC1 -> CC0 (71) -> A31 (72) ... A3 (100) ->
ADDRt (101) -> NENEt (102) -> NENE# (103) ->
PTBt (104) -> PTB -> W/Rt (106) -> W/R# (107) ->
ADSt -> ADS# (109) -> HLDA (110) -> LOCKt (111) ->
LOCK# (112) -> READY# (113) -> KEN# -> NA# (115) ->
INT/CS8 (116) -> HOLD -> BEt (118) ->
BE7# (119) ... BE0# (126) -> BREQ (127)

Figure 3.1. Order of Boundary Scan Chain


A tester causes entry into shift mode for one of two
purposes:


1. To assign values to output latches to be driven
   onto output pins upon subsequent entry into
   normal mode.


2. To read the values of input pins previously latched
   in normal mode.


4.0 BUS OPERATION


A bus cycle begins when ADS# is activated and
ends when READY# is sampled active. READY# is
sampled one clock after assertion of ADS# and
thereafter until it becomes active. New cycles can
start as often as every other clock until three cycles
are outstanding. A bus cycle is considered
outstanding as long as READY# has not been asserted to
terminate that cycle. After READY# becomes
active, it is not sampled again for the following
(outstanding) cycle until the second clock after the one
during which it became active. READY# is assumed
to be inactive when it is not sampled.

With regard to how a bus cycle is generated by the
i860 microprocessor, there are two types of cycles:
pipelined and nonpipelined. Both types of cycles can
be either read or write cycles. A pipelined cycle is
one that starts while one or two other bus cycles are
outstanding. A nonpipelined cycle is one that starts
when no other bus cycles are outstanding.


4.1 Pipelining


An m-n read or write cycle is a cycle with a total cycle
time of m clocks and a cycle-to-cycle time of n
clocks (m ≥ n). Total cycle time extends from the
clock in which ADS# is activated to the clock in
which READY# becomes active, whereas
cycle-to-cycle time extends from the time that READY# is
sampled active for the previous cycle to the time
that it is sampled active again for the current cycle.
When m = n, a nonpipelined cycle is implied; m > n
implies a pipelined cycle.


Pipelining may occur for the next bus cycle any time
the current bus cycle requires more than two clock
periods to finish (m > 2). If a bus request is pending,
the next cycle will be initiated when NA# is sampled
active, even if the current cycle has not terminated.
In this case, pipelining occurs. NA# is not
recognized until after ADS# has become inactive.


To allow high transfer rates in large memory sys- 
tems, two-level pipelining is supported (i.e., there 
may be up to three cycles in progress at one time). 
Pipelining enables a new word of data to be trans- 
ferred every two clocks, even though the total cycle 
time may be up to six clocks. 
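

The m-n notation makes the benefit of pipelining easy to quantify. The sketch below classifies a cycle from its m and n values and estimates the clocks needed for a run of back-to-back cycles, assuming a steady stream in which each READY# follows the previous one by n clocks. That steady-state assumption and the function names are illustrative and not part of the bus definition.

```python
def is_pipelined(m, n):
    """Classify an m-n cycle: pipelined when m > n, nonpipelined when m == n."""
    if n < 1 or m < n:
        raise ValueError("an m-n cycle requires m >= n >= 1")
    return m > n


def clocks_for_run(m, n, cycles):
    """Clocks from the first ADS# through the last READY# of a run.

    The first cycle takes its full m clocks; each later cycle completes
    n clocks after the one before it.
    """
    if cycles < 1:
        raise ValueError("at least one cycle is required")
    is_pipelined(m, n)          # validates m and n
    return m + (cycles - 1) * n
```

Under these assumptions, four 6-2 cycles complete in 6 + 3 x 2 = 12 clocks, whereas four nonpipelined 6-6 cycles would need 24.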


4.2 Bus State Machine 


The operation of the bus is described in terms of a
bus state machine using a state transition diagram.
Figure 4.1 illustrates the i860 microprocessor bus
state machine. A bus cycle is composed of two or
more states. Each bus state lasts for one CLK period.


The i860 microprocessor supports up to two levels
of address pipelining. Once it has started the first
bus cycle, it can generate up to two more cycles as
long as READY# remains inactive. To start a new
bus cycle while other cycles are still outstanding,
NA# must be active for at least one clock cycle
starting with the clock after the previous ADS#.
NA# is latched internally.


States Tj and Tjk, for j = {1,2,3} and k = {1,2}, are
used to describe the state of the i860 microprocessor
Bus State Machine. Index j indicates the number
of outstanding bus cycles while index k distinguishes
the intermediate states for the j-th outstanding cycle.
Therefore there can be up to three outstanding cycles,
and there are two possible intermediate states
for each level of pipelining. Tj1 is the next state after
Tj, as long as j cycles are outstanding. Tj2 is entered
when NA# is active but the i860 microprocessor is
not ready to start a new cycle.


Five conditions have to be met to start a new cycle
while one or more cycles are already pending:


1. READY# inactive

2. NA# having been active

3. An internal request pending (BREQ active)

4. HOLD not active

5. Fewer than three cycles outstanding


Note that BREQ is asserted on the clock after the
i860 microprocessor realizes an internal request for
the bus.
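
The five conditions listed above for starting a new cycle while others are pending can be written as a single predicate. This is only a sketch: the function and parameter names are hypothetical, and each argument is assumed to be the already-sampled logical state of the signal (True meaning asserted), not the electrical level of the active-low pin.

```python
def can_start_pipelined_cycle(ready_active, na_latched, breq_active,
                              hold_active, outstanding):
    """Return True when a new cycle may start with cycles already pending."""
    if outstanding < 1:
        # The list applies only while one or more cycles are pending.
        raise ValueError("no cycle is outstanding")
    return (not ready_active      # 1. READY# inactive
            and na_latched        # 2. NA# having been active
            and breq_active       # 3. internal request pending
            and not hold_active   # 4. HOLD not active
            and outstanding < 3)  # 5. fewer than three cycles outstanding


print(can_start_pipelined_cycle(False, True, True, False, outstanding=2))
print(can_start_pipelined_cycle(False, True, True, False, outstanding=3))
```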


Upon hardware RESET, the bus control logic enters
the idle state TI and awaits an internal request for a
bus cycle. If a bus cycle is requested while there is
no hold request from the system, a bus cycle begins,
advancing to state T1. On the next cycle, the state
machine automatically advances to state T11. If
READY# is active in state T11, the bus control logic
returns either to TI, if no new cycle is started, or to
T1, if a new cycle request is pending internally. In
fact, if an internal bus request is pending each time
READY# is active, the state machine continues to
cycle between T11 and T1.


However, if READY# is not active but the next address
request is pending (as indicated by an active
NA#), the state machine advances either to state
T2 (if an internal bus request is pending, signifying
that two bus cycles are now outstanding), or to state
T12 (if no internal bus request is pending, signifying
NA# has been found active). Transitions from state
T12 are similar to those from T11.


NOTES:
READY#   Once READY# has been sampled active, it is
         not sampled again until two clocks later
NA#      Not sampled during ADS# active clock
ADS#     Active in T1, T2 and T3
HLDA     Active in TH
HOLD     HOLD in this figure is the internally synchronized
         version of the external signal HOLD
REQUEST  Internal Bus Request Pending (BREQ asserted)


Figure 4.1. Bus State Machine


If two bus cycles are already outstanding (as indicated
by T2k, for k = {1,2}) and NA# is latched active
but READY# is not active, one more bus request
causes entry into state T3. Transitions from this
state are similar to those from T2.

In general, if there is an internal bus request each
time both READY# and NA# are active, the state
machine continues to oscillate between Tj1 and Tj,
for j = {2,3}.

When NA# is sampled active while there is a pending
bus request, ADS# is activated in the next clock
period (provided no more than two cycles are already
outstanding).

Internal pending bus requests start new bus cycles
only if no HOLD request has been recognized. TH is
entered from the idle state TI, T11, and T12. HLDA is
active in this state. There is a one clock delay to
synchronize the HOLD input when the signal meets
the respective minimum setup and hold time requirements.
The state machine uses the synchronized
HOLD to move from state to state.
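
The transitions described in this section for a single outstanding cycle can be sketched as a next-state function. This is a partial model under stated assumptions, not the complete state machine of Figure 4.1: it covers only TI, T1, T11, T12 and TH, the name `next_state` and its boolean arguments are hypothetical, and entering TH from T11 or T12 only when READY# is active is an assumption based on the rule that the bus is not released until outstanding cycles complete.

```python
def next_state(state, ready, na, request, hold):
    """Next bus state for the one-outstanding-cycle part of the machine.

    All inputs are booleans meaning "asserted"; hold is the internally
    synchronized HOLD. States T2 and beyond are not modeled here.
    """
    if state in ("TI", "TH"):
        if hold:
            return "TH"                    # HLDA is active in TH
        return "T1" if request else "TI"   # start a cycle or stay idle
    if state == "T1":
        return "T11"                       # T1 always advances to T11
    if state in ("T11", "T12"):
        na_seen = na or state == "T12"     # T12 remembers an earlier NA#
        if ready:                          # the outstanding cycle completes
            if hold:
                return "TH"
            return "T1" if request else "TI"
        if na_seen:
            # A second cycle starts only with a request and no HOLD.
            return "T2" if (request and not hold) else "T12"
        return "T11"                       # keep waiting for READY# or NA#
    raise ValueError("state not covered by this sketch: " + state)


# Back-to-back nonpipelined cycles alternate between T1 and T11.
state = "TI"
trace = []
for ready in (False, False, True, False, True):
    state = next_state(state, ready, na=False, request=True, hold=False)
    trace.append(state)
print(trace)
```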


4.3 Bus Cycles 


Figures 4.2 through 4.10 illustrate combinations of 
bus cycles. 


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4.3.1 NONPIPELINED READ CYCLES 


A read cycle begins with the clock in which ADS # is 
asserted. The i860 microprocessor begins driving 
the address during this clock. It samples READY # 
for active state every clock after the first clock. A 
minimum of two clocks is required per cycle. Data is 
latched when READY # is found active when sam- 
pled at the end of a clock period. Figure 4.2 illus- 
trates nonpipelined read cycles with zero wait 
states. 


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Figure 4.2. Fastest Read Cycles 
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Figure 4.3. Fastest Write Cycles 


4.3.2 NONPIPELINED WRITE CYCLES


The ADS# and READY# activity for write cycles
follows the same logic as that for read cycles, as
Figure 4.3 illustrates for back-to-back, nonpipelined
write cycles with zero wait states.


The fastest write cycle takes only two clocks to complete.
However, when a read cycle immediately precedes
a write cycle, the write cycle must contain a
wait state, as illustrated in Figure 4.4. Because the
device being read might still be driving the data bus
during the first clock of the write cycle, there is a
potential for bus contention. To help avoid such contention,
the i860 microprocessor does not drive the
data bus until the second clock of the write cycle.
The wait state is required to provide the additional
time necessary to terminate the write cycle. In other
read-write combinations, the i860 microprocessor
does not require a wait state.
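
The wait-state rule above reduces to a small calculation of the minimum length of a nonpipelined cycle. The function name and the string encoding of cycle types are hypothetical; the two-clock minimum and the single wait state for a write that immediately follows a read come from the text.

```python
def min_cycle_clocks(kind, previous_kind=None):
    """Minimum clocks for a nonpipelined cycle of the given kind.

    kind and previous_kind are "read" or "write"; previous_kind is the
    cycle that immediately precedes this one, or None if there is none.
    """
    if kind not in ("read", "write"):
        raise ValueError("kind must be 'read' or 'write'")
    clocks = 2                      # fastest possible cycle
    if kind == "write" and previous_kind == "read":
        clocks += 1                 # one wait state avoids bus contention
    return clocks


for prev, kind in (("read", "read"), ("read", "write"),
                   ("write", "read"), ("write", "write")):
    print(prev, "->", kind, min_cycle_clocks(kind, prev))
```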
Figure 4.4. Fastest Read/Write Cycles

Figure 4.5. Pipelined Read Followed by Pipelined Write

Figure 4.6. Pipelined Write Followed by Pipelined Read


4.3.3 PIPELINED READ AND WRITE CYCLES 


Figures 4.5 and 4.6 illustrate combinations of nonpipelined
and pipelined read and write cycles. The
following description applies to both diagrams. While
Cycle 1 is still in progress, two new cycles are initiated.
By the time READY# first becomes active, the
state machine has moved through states T1, T11,
T2, T21, and T3. Cycles 3 and 4 show how activating
READY# terminates the corresponding outstanding
cycle, and yet activating NA# while there is an internal
request pending adds a new outstanding cycle.


In Figure 4.5, Cycle 3 is a write cycle following a read
cycle; therefore, one wait state must be inserted.
The i860 microprocessor does not drive the data
bus until one clock after the read data is returned
from the preceding read cycle. During Cycles 3 and
4, the state machine oscillates between states T3
and T31, maintaining full bus capacity (two levels of
pipelining; three outstanding cycles). Cycles 2, 3,
and 4 in Figure 4.6 are 5-2 cycles; i.e., each requires
a total cycle time of five clocks while the throughput
rate is one cycle every two clocks.


Figure 4.7 illustrates in a more general manner how
the NA# signal controls pipelining. Cycle 1 is a 2-2
cycle, the fastest possible. The next cycle cannot be
started any earlier; therefore, there is no need to
activate NA# to start the next cycle early. Cycle 2, a
3-3 read, is different. Cycle 3 can be started during
the third state (a wait state) of Cycle 2, and NA# is
asserted to accomplish this.


NA# is not activated following the ADS# clock of
Cycle 3, thereby allowing Cycle 3 to terminate before
the start of Cycle 4. As a result, Cycle 4 is a
nonpipelined cycle.


Figure 4.7. Pipelining Driven by NA#

Figure 4.8. NA# Active with No Internal Bus Request

Figure 4.9. Locked Cycles


When there is no internal bus request, activating
NA# does not start a new cycle; the i860 microprocessor,
however, remembers that NA# has been activated.
Figure 4.8 illustrates the situation where
NA# is active but no internal bus request is pending.
NA# is activated when two cycles are outstanding.
Because there is no internal request pending until
after one idle state, no new bus cycle is started during
that period.


4.3.4 LOCKED CYCLES 


The LOCK# signal is asserted when the current bus
cycle is to be locked with the next bus cycle. Assertion
of LOCK# may be initiated by a program's setting
the BL bit of the dirbase register using the lock
instruction (refer to section 2) or by the i860 microprocessor
itself during page table updates.


In Figure 4.9, the first read cycle is to be locked with
the following write cycle. If there were idle states
between the cycles, the LOCK# signal would remain
asserted. This is the case for a read/modify/write
operation. Cycle 3 is not locked because
LOCK# is no longer asserted when Cycle 2 starts.
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4.3.5 HOLD AND BREQ ARBITRATION CYCLES 


The HOLD, HLDA, and BREQ signals permit bus arbitration
between the i860 microprocessor and another
bus master.


See Figure 4.10. When HOLD is asserted, the i860
microprocessor does not relinquish control of the
bus until all outstanding cycles are completed. If
HOLD were asserted one clock earlier, the last i860
microprocessor bus cycle before HLDA would not
be started.


The outputs (except HLDA and BREQ) float when
HLDA is asserted. HOLD is sampled at the end of
the clock in which it is activated. Recommended setup
and hold times must be met to guarantee sampling
one clock after external HOLD activation.
When HOLD is sampled active, a one-clock delay for
internal synchronization follows. Likewise, when
HOLD is deasserted, there is a one-clock delay for
internal synchronization before HLDA is deasserted.
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Figure 4.10. HOLD, HLDA, and BREQ 


If, during a HOLD cycle, an internal bus request is
generated, BREQ is activated even though HLDA is
asserted. It remains active at least until the clock
after ADS# is activated for the requested cycle.


4.4 Bus States During RESET


Figure 4.11 shows how INT/CS8 is sampled during
the clock period just before the falling edge of RESET.
If INT/CS8 is sampled active, the i860 microprocessor
enters CS8 mode. No inputs (except for
HOLD and INT/CS8) are sampled during RESET.

Note that, because HOLD is recognized even while
RESET is active, the HLDA output signal may also
become active during RESET. Refer to Table 3.4
"Output Pin Status during Reset".

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Figure 4.11. Reset Activities 
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5.0 MECHANICAL DATA 


Figures 5.1 and 5.2 show the locations of pins; Tables 5.1 and 5.2 help to locate pin identifiers.
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Figure 5.1. Pin Configuration—View from Top Side 
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Figure 5.2. Pin Configuration—View from Pin Side 




Table 5.1. Pin Cross Reference by Location
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Table 5.2. Pin Cross Reference by Pin Name 


Signal Location Signal Location Signal Location Signal Location 
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Table 5.3. Ceramic PGA Package Dimension Symbols 


Letter or
Symbol    Description of Dimensions

A         Distance from seating plane to highest point of body
A1        Distance between seating plane and base plane (lid)
A2        Distance from base plane to highest point of body
A3        Distance from seating plane to bottom of body
B         Diameter of terminal lead pin
D         Largest overall package dimension of length
D1        A body length dimension, outer lead center to outer lead center
e1        Linear spacing between true lead position centerlines
L         Distance from seating plane to end of lead
S1        Other body dimension, outer lead center to edge of body

NOTES:

1. Controlling dimension: millimeter.
2. Dimension "e1" ("e") is non-cumulative.
3. Seating plane (standoff) is defined by P.C. board hole size: 0.0415-0.0430 inch.
4. Dimensions "B", "B1" and "C" are nominal.
5. Details of Pin 1 identifier are optional.
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Figure 5.3. 168 Lead Ceramic PGA Package Dimensions 


6.0 PACKAGE THERMAL 
SPECIFICATIONS 


For this section, let:

P    = maximum power consumption
T_C  = case temperature
T_A  = ambient air temperature
θ_CA = thermal resistance from case to ambient air
θ_JC = thermal resistance from junction to case
θ_JA = thermal resistance from junction to ambient air


The i860 microprocessor is specified for operation
when T_C is within the range of 0°C to 85°C. T_C may be
measured in any environment to determine whether
the i860 microprocessor is within specified operating
range. The case temperature should be measured at
the center of the top surface opposite the pins.


T_A can be calculated from θ_CA (thermal resistance
from case to ambient) with the following equation:


T_A = T_C - P * θ_CA


Typical values for θ_CA and θ_JC at various airflows
are given in Table 6.1 for the 1.75 sq. in., 168 pin,
ceramic PGA. θ_JC is also shown so that θ_JA can be
calculated by:


θ_CA = θ_JA - θ_JC


Note that θ_JC with a heatsink differs from θ_JC without
a heatsink because case temperature is measured
differently.


Table 6.2 shows the maximum T_A allowable (without
exceeding T_C) at various airflows and operating
frequencies (f_CLK).

Note that T_A is greatly improved by attaching "fins"
or a "heat sink" to the package. P (the maximum
power consumption) is calculated by using the maximum
I_CC at 5V as tabulated in the DC Characteristics
of section 7.


Figure 6.1 gives typical I_CC derating with case temperature.
For more information on heat sinks, measurement
techniques, or package characteristics, refer
to Intel Packaging Handbook, order number
240800.
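
The two thermal relations in this section can be combined into a short worked calculation. The function names are hypothetical, and the numeric inputs below (a 0.6 A supply current and 12.0 and 2.0 °C/Watt resistances) are illustrative placeholders, not values from Tables 6.1 or 7.1; only the 85°C case limit and the 5V supply come from the text.

```python
T_C_MAX = 85.0   # maximum case temperature in degrees C, from the text
V_CC = 5.0       # supply voltage in volts, from the text


def theta_ca(theta_ja, theta_jc):
    """Case-to-ambient resistance: theta_CA = theta_JA - theta_JC."""
    return theta_ja - theta_jc


def max_power(icc_max_amps, vcc=V_CC):
    """Maximum power consumption P from the maximum supply current."""
    return icc_max_amps * vcc


def max_ambient(power_watts, theta_ca_value, t_case=T_C_MAX):
    """Ambient temperature limit: T_A = T_C - P * theta_CA."""
    return t_case - power_watts * theta_ca_value


# Hypothetical example values, chosen only to show the arithmetic.
p = max_power(0.6)                 # 3.0 W
resistance = theta_ca(12.0, 2.0)   # 10.0 degrees C per watt
print(max_ambient(p, resistance))  # 85 - 3.0 * 10.0
```

Lowering θ_CA, for example with a heat sink or more airflow, raises the allowable T_A for the same power.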


[Graph: I_CC (mA), 460 to 580, versus T_C (°C), 0 to 85. Typical part at 5V with maximum load.]


Figure 6.1. I_CC vs Case Temperature


Table 6.1. θ_CA at Various Airflows and θ_JC


In °C/Watt


Oca with - 

Heat Sink* 

9ca without |: 15 
Heat Sink 


*Nine-fin, unidirectional heat sink (fin 


3] 9 


17.5) 13 


| Airflow-ft/min (m/sec) 


| o | 200 | 400 | 600 | 800 | 1000 
| (0) | (2.03) | (3.04) | (4.06) | (5.07) 


| 5 | a9 | 34 | 
jes fas |e 


dimensions: 0.350” height, 0.040 . 


width, 0.115” center-to-center spacing, 1.530” length). 
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Table 6.2. Maximum Ta, at Various Airflows 


Ta without 


Heat Sink 


Airflow-ft/min (m/sec) 


*Nine-fin unidirectional heat sink (fin dimensions: 0.350” height, 0.040” width,
0.115” center-to-center spacing, 1.530” length).


7.0 ELECTRICAL DATA 


Inputs and outputs are TTL compatible, except for 
CLK. All input and output timings are specified rela- 
tive to the 1.5 volt level of the rising edge of CLK 
and refer to the point that the signals reach 1.5V. 


7.1 Absolute Maximum Ratings


Case Temperature T_C under Bias ...... 0°C to 85°C
Storage Temperature .......... -65°C to +150°C
Voltage on Any Pin
with Respect to Ground.............. -0.5 to 6.5V


NOTICE: This data sheet contains preliminary information
on new products in production. The specifications
are subject to change without notice.


*WARNING: Stressing the device beyond the "Absolute
Maximum Ratings" may cause permanent damage.
These are stress ratings only. Operation beyond the
"Operating Conditions" is not recommended and extended
exposure beyond the "Operating Conditions"
may affect device reliability.


7.2 D.C. Characteristics


Table 7.1. DC Characteristics
T_C = 0°C to 85°C, V_CC = 5V ±5%


Input LOW Voltage 
Input HIGH Voltage 
CLK Input LOW Voltage 
CLK Input HIGH Voltage 
Output LOW Voltage 
Output HIGH Voltage | 
Power Supply Current 
CLK = 33.3 MHz 
CLK = 40.0 MHz 
Input Leakage Current 


Output Leakage Current 
Input Capacitance 

1/O or Output Capacitance 
Clock Capacitance 


NOTES: 


Tsyabot [Parameter | win 


(Note 1) 
(Note 2) 


Vcc @5V 
Voc @5V 
No pullup 
or pulldown 


(Note 3). 
_ (Note 3) 
(Note 3) 


1. This parameter is measured at 4.0 mA for A31-A3, D63-D0, BE7#-BE0#; at 5.0 mA for all other outputs.
2. This parameter is measured at 1.0 mA for A31-A3, D63-D0, BE7#-BE0#; at 0.9 mA for all other outputs.
3. These are not tested. They are guaranteed by design characterization.
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7.3 A.C. Characteristics 


Table 7.2. A.C. Characteristics
T_C = 0°C to 85°C, V_CC = 5V ±5%
All timings measured at 1.5V unless otherwise specified.


rm) 
oe) 
= 
a |N 
< |. 


| Parameter 


| Symbol 


oe 
me 


om, 
=) 
” 

~~ 


CLK High Time: 
CLK Low Time 


t4 CLK Fall Time 
t5 CLK Rise Time 


t6a A31—A3, PTB, W/R#, NENE# 
Valid Delay 


BEn#* Valid Delay 
Float Time, All 


ADS#, BREQ, LOCK#, HLDA. 
Valid Delay 


D63-D0 Valid Delay 


Setup Time, All Inputs except 11 
DATA 


t11 Hold Time All Inputs except DATA 4 
Data Setup Time tt 


tts | DATAHold Time 4 


os 


ND 


50 pF Load 


50 pF Load 


s| 
st 


= 
oO; 


G> {Po | 
oO; oO Go 


+ ~ 
oO 


NOTES: 

1. Float condition occurs when maximum output current becomes less than I_LO in magnitude. Float delay is not tested.

2. INT and HOLD are asynchronous inputs. The setup and hold specifications are given for test purposes or to assure
recognition on a specific rising edge of CLK. INT should remain asserted until software acknowledges the interrupt.

*n = 0, 1, ..., 7


Figure 7.1. CLK, Input, and Output Timings 
[Graph: typical* output delay (ns) at 1.5V versus load capacitance, C_L (pF).]

NOTES:
Graphs are not linear outside the C_L range shown.
nom = nominal value given in the AC timing table.
*Typical part under worst-case conditions.


Figure 7.2. Typical Output Delay vs Load Capacitance under Worst-Case Conditions


[Graph: typical* output slew time (ns), 0.8V to 2.0V, versus load capacitance, C_L (pF).]

NOTES:
Graphs are not linear outside the C_L range shown.
*Typical part under worst-case conditions.


Figure 7.3. Typical Slew Time vs Load Capacitance under Worst-Case Conditions


[Graph: I_CC (mA)* versus frequency (MHz), 8 to 40.]

NOTES:
Graphs are not linear outside the frequency range shown.
*Worst-case supply current at 5V.


Figure 7.4. Typical I_CC vs Frequency




8.0 INSTRUCTION SET 


Key to abbreviations: 


For register operands, the abbreviations that describe the operands are composed of two parts. The first part 
describes the type of register: 


c    One of the control registers fir, psr, epsr, dirbase, db, or fsr
f    One of the floating-point registers: f0 through f31
i    One of the integer registers: r0 through r31


The second part identifies the field of the machine instruction into which the operand is to be placed: 


src1      The first of the two source-register designators, which may be either a register or a 16-bit
          immediate constant or address offset. The immediate value is zero-extended for logical
          operations and is sign-extended for add and subtract operations (including addu and subu)
          and for all addressing calculations.

src1ni    Same as src1 except that no immediate constant or address offset value is permitted.

src1s     Same as src1 except that the immediate constant is a 5-bit value that is zero-extended to 32
          bits.

src2      The second of the two source-register designators.

dest      The destination register designator.


Thus, the operand specifier isrc2, for example, means that an integer register is used and that the encoding of
that register must be placed in the src2 field of the machine instruction.
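
The extension rule for a 16-bit src1 immediate can be sketched as follows. The function name and the `logical` flag are hypothetical; the result is returned as a 32-bit unsigned bit pattern.

```python
def extend_imm16(imm, logical):
    """Extend a 16-bit src1 immediate to a 32-bit pattern.

    logical=True  -> zero-extend (logical operations)
    logical=False -> sign-extend (add, subtract, addressing)
    """
    imm &= 0xFFFF                 # keep only the 16-bit field
    if logical or imm < 0x8000:
        return imm                # upper 16 bits are zero
    return imm | 0xFFFF0000       # copy the sign bit into the upper half


print(hex(extend_imm16(0x8000, logical=True)))
print(hex(extend_imm16(0x8000, logical=False)))
```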


Other (nonregister) operands are specified by a one-part abbreviation that represents both the type of operand 
required and the instruction field into which the value of the operand is placed: 


#const    A 16-bit immediate constant or address offset that the i860 microprocessor sign-extends to
          32 bits when computing the effective address.

lbroff    A signed, 26-bit, immediate, relative branch offset.

sbroff    A signed, 16-bit, immediate, relative branch offset.

brx       A function that computes the target address by shifting the offset (either lbroff or sbroff) left
          by two bits, sign-extending it to 32 bits, and adding the result to the current instruction pointer
          plus four. The resulting target address may lie anywhere within the address space.


Unless otherwise specified, floating-point operations accept single- or double-precision
source operands and produce a result of equal or greater precision. Both input operands
must have the same precision. The source and result precision are specified by a two-letter
suffix to the mnemonic of the operation.


Other abbreviations include:

.p    Precision specification .ss, .sd, or .dd (.ds not permitted). Refer to Table 8.1.
.r    Precision specification .ss, .sd, .ds, or .dd. Refer to Table 8.1.
.v    .sd or .dd. Refer to Table 8.1.
.w    .ss or .dd. Refer to Table 8.1.
.x    .b (8 bits), .s (16 bits), or .l (32 bits)
.y    .l (32 bits), .d (64 bits), or .q (128 bits)
.z    .l (32 bits), or .d (64 bits)


Table 8.1. Precision Specification

Suffix    Source Precision    Result Precision
.ss       single              single
.sd       single              double
.dd       double              double
.ds       double              single


mem.x(address)    The contents of the memory location indicated by address with a size of x.

PM                The pixel mask, which is considered as an array of eight bits PM[7]..PM[0], where PM[0] is
                  the least significant bit.
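
The brx function defined above can be sketched directly from its description: shift the offset field left by two bits, sign-extend, and add the instruction pointer plus four. The Python signature is hypothetical; `width` is 26 for lbroff and 16 for sbroff, and the result is wrapped to 32 bits.

```python
MASK32 = 0xFFFFFFFF


def brx(offset, width, ip):
    """Compute a branch target from a raw offset field of the given width."""
    shifted = (offset & ((1 << width) - 1)) << 2   # word offset to bytes
    sign_bit = 1 << (width + 1)                    # sign position after shift
    if shifted & sign_bit:
        shifted -= 1 << (width + 2)                # sign-extend
    return (ip + 4 + shifted) & MASK32             # relative to ip + 4


print(hex(brx(1, 16, 0x1000)))        # forward by one instruction
print(hex(brx(0xFFFF, 16, 0x1000)))   # offset -1 branches to itself
```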


8.1 Instruction Definitions in Alphabetical Order _ 


adds      isrc1, isrc2, idest ................................ Add Signed
    idest ← isrc1 + isrc2
    OF ← (bit 31 carry ≠ bit 30 carry)
    CC set if isrc2 < -isrc1 (signed)
    CC clear if isrc2 ≥ -isrc1 (signed)


addu      isrc1, isrc2, idest ................................ Add Unsigned
    idest ← isrc1 + isrc2
    OF ← bit 31 carry
    CC ← bit 31 carry

and       isrc1, isrc2, idest ................................ Logical AND
    idest ← isrc1 and isrc2
    CC set if result is zero, cleared otherwise


andh      #const, isrc2, idest ............................... Logical AND High
    idest ← (#const shifted left 16 bits) and isrc2
    CC set if result is zero, cleared otherwise


andnot    isrc1, isrc2, idest ................................ Logical AND NOT
    idest ← not isrc1 and isrc2
    CC set if result is zero, cleared otherwise

andnoth   #const, isrc2, idest ............................... Logical AND NOT High
    idest ← not (#const shifted left 16 bits) and isrc2
    CC set if result is zero, cleared otherwise
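
The flag definitions given for adds and addu can be checked with a small model. This is a sketch with hypothetical function names; operands and results are 32-bit unsigned bit patterns, and "bit 30 carry" is taken to mean the carry out of bit 30 into bit 31.

```python
MASK32 = 0xFFFFFFFF


def to_signed(x):
    """Interpret a 32-bit pattern as a two's-complement integer."""
    return x - (1 << 32) if x & 0x80000000 else x


def adds(isrc1, isrc2):
    """Return (idest, OF, CC) for the signed add."""
    total = isrc1 + isrc2
    carry31 = (total >> 32) & 1
    carry30 = (((isrc1 & 0x7FFFFFFF) + (isrc2 & 0x7FFFFFFF)) >> 31) & 1
    of = carry31 != carry30                       # signed overflow
    cc = to_signed(isrc2) < -to_signed(isrc1)     # true sum is negative
    return total & MASK32, of, cc


def addu(isrc1, isrc2):
    """Return (idest, OF, CC) for the unsigned add."""
    total = isrc1 + isrc2
    carry31 = bool((total >> 32) & 1)
    return total & MASK32, carry31, carry31


print(adds(0x7FFFFFFF, 1))
print(addu(0xFFFFFFFF, 1))
```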


bc        lbroff ............................................. Branch on CC
    IF    CC = 1
    THEN  continue execution at brx(lbroff)
    FI

bc.t      lbroff ............................................. Branch on CC, Taken
    IF    CC = 1
    THEN  execute one more sequential instruction
          continue execution at brx(lbroff)
    ELSE  skip next sequential instruction
    FI


bla       isrc1ni, isrc2, sbroff ............................. Branch on LCC and Add
    LCC-temp clear if isrc2 < -isrc1ni (signed)
    LCC-temp set if isrc2 ≥ -isrc1ni (signed)
    isrc2 ← isrc1ni + isrc2
    Execute one more sequential instruction
    IF    LCC
    THEN  LCC ← LCC-temp
          continue execution at brx(sbroff)
    ELSE  LCC ← LCC-temp
    FI

bnc       lbroff ............................................. Branch on Not CC
    IF    CC = 0
    THEN  continue execution at brx(lbroff)
    FI

bnc.t     lbroff ............................................. Branch on Not CC, Taken
    IF    CC = 0
    THEN  execute one more sequential instruction
          continue execution at brx(lbroff)
    ELSE  skip next sequential instruction
    FI


br        lbroff ............................................. Branch Direct Unconditionally
    Execute one more sequential instruction.
    Continue execution at brx(lbroff).


bri       isrc1ni ............................................ Branch Indirect Unconditionally
    Execute one more sequential instruction
    IF    any trap bit in psr is set
    THEN  copy PU to U, PIM to IM in psr
          clear trap bits
          IF    DS is set and DIM is reset
          THEN  enter dual-instruction mode after executing one
                instruction in single-instruction mode
          ELSE  IF    DS is set and DIM is set
                THEN  enter single-instruction mode after executing one
                      instruction in dual-instruction mode
                ELSE  IF    DIM is set
                      THEN  enter dual-instruction mode
                            for next two instructions
                      ELSE  enter single-instruction mode
                            for next two instructions
                      FI
                FI
          FI
    FI
    Continue execution at address in isrc1ni
    (The original contents of isrc1ni is used even if the next instruction
    modifies isrc1ni. Does not trap if isrc1ni is misaligned.)


bte       isrc1s, isrc2, sbroff .............................. Branch If Equal
    IF    isrc1s = isrc2
    THEN  continue execution at brx(sbroff)
    FI

btne      isrc1s, isrc2, sbroff .............................. Branch If Not Equal
    IF    isrc1s ≠ isrc2
    THEN  continue execution at brx(sbroff)
    FI

call      lbroff ............................................. Subroutine Call
    r1 ← address of next sequential instruction + 4 (+8 in dual mode)
    Execute one more sequential instruction
    Continue execution at brx(lbroff)


calli     isrc1ni ............................................ Indirect Subroutine Call
    r1 ← address of next sequential instruction + 4 (+8 in dual mode)
    Execute one more sequential instruction
    Continue execution at address in isrc1ni
    (The original contents of isrc1ni is used even if the next instruction
    modifies isrc1ni. Does not trap if isrc1ni is misaligned.
    The register isrc1ni must not be r1.)


fadd.p    fsrc1, fsrc2, fdest ................................ Floating-Point Add
    fdest ← fsrc1 + fsrc2

faddp     fsrc1, fsrc2, fdest ................................ Add with Pixel Merge
    fdest ← fsrc1 + fsrc2
    Shift and load MERGE register as defined in Table 8.2


faddz     fsrc1, fsrc2, fdest ................................ Add with Z Merge
    fdest ← fsrc1 + fsrc2
    Shift MERGE right 16 and load fields 31..16 and 63..48


famov.r   fsrc1, fdest ....................................... Floating-Point Adder Move
    fdest ← fsrc1
    Send fsrc1 through the floating-point adder. (Preserves -0 (minus zero) when fsrc1 is -0. fsrc2
    must be coded as f0 by the assembler.)

fiadd.w   fsrc1, fsrc2, fdest ................................ Long-Integer Add
    fdest ← fsrc1 + fsrc2

fisub.w   fsrc1, fsrc2, fdest ................................ Long-Integer Subtract
    fdest ← fsrc1 - fsrc2

fix.v     fsrc1, fdest ....................................... Floating-Point to Integer Conversion
    fdest ← 64-bit value with low-order 32 bits equal to integer part of fsrc1 rounded

                                                               Floating-Point Load
fld.y     isrc1(isrc2), fdest ................................ (Normal)
fld.y     isrc1(isrc2)++, fdest .............................. (Autoincrement)
    fdest ← mem.y (isrc1 + isrc2)
    IF    autoincrement
    THEN  isrc2 ← isrc1 + isrc2
    FI


                                                               Cache Flush
flush     #const(isrc2) ...................................... (Normal)
flush     #const(isrc2)++ .................................... (Autoincrement)
    Replace block in data cache with address (#const + isrc2).
    Contents of block undefined.
    IF    autoincrement
    THEN  isrc2 ← #const + isrc2
    FI


fmiow.dd fsrc LASICE TACSE oo x6 356304 hh Ea eae ee SoS kcumiseueduacnoane Floating-Point Multiply Low 
fdest <— low-order 53 bits of fsrc7 mantissa x< fsrc2 mantissa 
fdest bit 53 <— most significant bit of mantissa | 


fmov.r VSIOCT OOS? obi Sows pug ete ed ew asad WER ERO SERIE? Floating-Point Reg-Reg Move 
Assembler pseudo-operation | | 
fmov.ss fsrc7, fdest = fiadd.ss fsrc7, f0, fdest 
fmov.dd fsrc7, fdest = fiadd.dd fsrc7, f0, fdest 
fmov.sd /fsrc?, fdest = famov.sd fsrc7, fdest 
fmov.ds /fsrc7, fdest = famov.ds fsrc7, fdest 


fmul.p fsrc1, fsrc2, fdest ........ SG antes WL. ni tare a AD eet yer sieera ives Yale a 7 .Floating-Point Multiply 


fdest <— fsrc?t X fsrc2 
fnop........ ee eee ee eee Te eee ee re eT ee cee ere eee Floating-Point No Operation 


Assembler pseudo-operation 
fnop = shrd r0, r0, r0 




form      fsrc1, fdest ........................................ OR with MERGE Register
    fdest ← fsrc1 OR MERGE
    MERGE ← 0
frcp.p    fsrc2, fdest ........................................ Floating-Point Reciprocal
    fdest ← 1/fsrc2 with maximum mantissa error < 2^-7
frsqr.p   fsrc2, fdest ........................................ Floating-Point Reciprocal Square Root
    fdest ← 1/SQRT (fsrc2) with maximum mantissa error < 2^-7
                                                                Floating-Point Store
fst.y     fdest, isrc1(isrc2) ................................. (Normal)
fst.y     fdest, isrc1(isrc2)++ ............................... (Autoincrement)
    mem.y (isrc2 + isrc1) ← fdest
    IF autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI
fsub.p    fsrc1, fsrc2, fdest ................................. Floating-Point Subtract
    fdest ← fsrc1 - fsrc2
ftrunc.v  fsrc1, fdest ........................................ Floating-Point to Integer Conversion
    fdest ← 64-bit value with low-order 32 bits equal to integer part of fsrc1
fxfr      fsrc1, idest ........................................ Transfer F-P to Integer Register
    idest ← fsrc1
fzchkl    fsrc1, fsrc2, fdest ................................. 32-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of two 32-bit
    fields fsrc1(0)..fsrc1(1), fsrc2(0)..fsrc2(1), and fdest(0)..fdest(1)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 2 bits
    FOR i = 0 to 1
    DO
        PM [i + 6] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0
fzchks    fsrc1, fsrc2, fdest ................................. 16-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of four 16-bit
    fields fsrc1(0)..fsrc1(3), fsrc2(0)..fsrc2(3), and fdest(0)..fdest(3)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 4 bits
    FOR i = 0 to 3
    DO
        PM [i + 4] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0
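
The Z-buffer check pseudocode above can be sketched in Python. This is only an illustrative model, not Intel's implementation: the function name fzchk is hypothetical, PM is assumed to be an 8-bit field, and the comparison is assumed to be "less than or equal", as in the listing.

```python
def fzchk(fsrc1, fsrc2, pm, field_bits):
    """Model fzchkl (field_bits=32) or fzchks (field_bits=16) on 64-bit values.

    Returns (fdest, new_pm). PM is assumed to be 8 bits wide.
    """
    n = 64 // field_bits                  # 2 fields for fzchkl, 4 for fzchks
    mask = (1 << field_bits) - 1
    pm = (pm & 0xFF) >> n                 # PM shifted right by 2 or 4 bits
    fdest = 0
    for i in range(n):
        a = (fsrc1 >> (i * field_bits)) & mask
        b = (fsrc2 >> (i * field_bits)) & mask
        if b <= a:                        # unsigned compare of the two fields
            pm |= 1 << (i + 8 - n)        # PM[i + 6] or PM[i + 4]
        fdest |= min(a, b) << (i * field_bits)
    return fdest, pm
```

Each field of fdest receives the smaller Z value, and the new high-order PM bits record which fields of fsrc2 passed the test.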

intovr ........................................................ Software Trap on Integer Overflow
    If OF in epsr = 1, generate trap with IT set in psr.

ixfr      isrc1ni, fdest ...................................... Transfer Integer to F-P Register
    fdest ← isrc1ni

ld.c      csrc2, idest ........................................ Load from Control Register
    idest ← csrc2

ld.x      isrc1(isrc2), idest ................................. Load Integer
    idest ← mem.x (isrc1 + isrc2)

lock .......................................................... Begin Interlocked Sequence
    Set BL in dirbase. The next load or store that misses the cache locks that location.
    Disable interrupts until the bus is unlocked.

mov       isrc2, idest ........................................ Register-Register Move
    Assembler pseudo-operation
    mov isrc2, idest = shl r0, isrc2, idest

mov       const32, idest ...................................... Constant-to-Register Move
    Assembler pseudo-operation
    adds l%const32, r0, idest
        .. when const32 < 0x8000

    orh h%const32, r0, idest
    or l%const32, idest, idest
        .. when const32 ≥ 0x8000
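
As a hedged illustration of the constant-to-register pseudo-operation, the sketch below expands a constant into the one- or two-instruction sequence described above. The function name expand_mov is hypothetical, and it assumes that h% selects the high-order 16 bits, that l% selects the low-order 16 bits, and that const32 is treated as an unsigned 32-bit value.

```python
def expand_mov(const32):
    """Return the instruction sequence an assembler might emit for
    'mov const32, idest', as (mnemonic, immediate, source, dest) tuples."""
    const32 &= 0xFFFFFFFF                 # assumption: unsigned 32-bit constant
    if const32 < 0x8000:
        # Small constant: a single adds with the low half is enough.
        return [("adds", const32, "r0", "idest")]
    high = const32 >> 16                  # assumed meaning of h%const32
    low = const32 & 0xFFFF                # assumed meaning of l%const32
    return [
        ("orh", high, "r0", "idest"),     # idest = high << 16
        ("or", low, "idest", "idest"),    # idest |= low
    ]
```

For example, expand_mov(0x12348765) yields an orh with 0x1234 followed by an or with 0x8765.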


nop ........................................................... Core-Unit No Operation
    Assembler pseudo-operation
    nop = shl r0, r0, r0
or        isrc1, isrc2, idest ................................. Logical OR
    idest ← isrc1 OR isrc2
    CC set if result is zero, cleared otherwise

orh       #const, isrc2, idest ................................ Logical OR High
    idest ← (#const shifted left 16 bits) OR isrc2
    CC set if result is zero, cleared otherwise
pfadd.p   fsrc1, fsrc2, fdest ................................. Pipelined Floating-Point Add
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1 + fsrc2

pfaddp    fsrc1, fsrc2, fdest ................................. Pipelined Add with Pixel Merge
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 + fsrc2
    Shift and load MERGE register from last stage Graphics result as defined in Table 8.2

pfaddz    fsrc1, fsrc2, fdest ................................. Pipelined Add with Z Merge
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 + fsrc2
    Shift MERGE right 16 and load fields 31..16 and 63..48 from last stage Graphics result

pfam.p    fsrc1, fsrc2, fdest ................................. Pipelined Floating-Point Add and Multiply
    fdest ← last stage Adder result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 + A-op2
    M pipeline first stage ← M-op1 × M-op2

pfamov.r  fsrc1, fdest ........................................ Pipelined Floating-Point Adder Move
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1
pfeq.p    fsrc1, fsrc2, fdest ................................. Pipelined Floating-Point Equal Compare
    fdest ← last stage Adder result
    CC set if fsrc1 = fsrc2, else cleared
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfgt.p    fsrc1, fsrc2, fdest ................................. Pipelined Floating-Point Greater-Than Compare
    (Assembler clears R-bit of instruction)
    fdest ← last stage Adder result
    CC set if fsrc1 > fsrc2, else cleared
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs

pfiadd.w  fsrc1, fsrc2, fdest ................................. Pipelined Long-Integer Add
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 + fsrc2

pfisub.w  fsrc1, fsrc2, fdest ................................. Pipelined Long-Integer Subtract
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 - fsrc2

pfix.v    fsrc1, fdest ........................................ Pipelined Floating-Point to Integer Conversion
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← 64-bit value with low-order 32 bits
        equal to integer part of fsrc1 rounded

                                                                Pipelined Floating-Point Load
pfld.z    isrc1(isrc2), fdest ................................. (Normal)
pfld.z    isrc1(isrc2)++, fdest ............................... (Autoincrement)
    fdest ← mem.z (third previous pfld's (isrc1 + isrc2))
    (where .z is precision of third previous pfld.z)
    IF autoincrement
    THEN isrc2 ← isrc1 + isrc2
    FI
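
The "third previous pfld" behavior can be pictured as a three-entry queue. The sketch below is a simplified model under stated assumptions: the class name PfldPipeline is hypothetical, memory is a plain dictionary, and data is sampled when the load is issued, which the listing above does not specify.

```python
from collections import deque

class PfldPipeline:
    """Toy model: each pfld returns the data requested three pfld's earlier."""

    def __init__(self, depth=3):
        # Entries start empty; the first three pfld's return no useful data.
        self.pending = deque([None] * depth)

    def pfld(self, memory, address):
        result = self.pending.popleft()            # data of third previous pfld
        self.pending.append(memory.get(address))   # start the new load
        return result
```

Issuing four loads in a row shows the latency: only the fourth pfld delivers the data addressed by the first.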


pfle.p    fsrc1, fsrc2, fdest ................................. Pipelined F-P Less-Than or Equal Compare
    Assembler pseudo-operation, identical to pfgt.p except that
    assembler sets R-bit of instruction.
    fdest ← last stage Adder result
    CC clear if fsrc1 ≤ fsrc2, else set
    Advance A pipeline one stage
    A pipeline first stage is undefined, but no result exception occurs
pfmam.p   fsrc1, fsrc2, fdest ................................. Pipelined Floating-Point Add and Multiply
    fdest ← last stage Multiplier result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 + A-op2
    M pipeline first stage ← M-op1 × M-op2

pfmov.r   fsrc1, fdest ........................................ Pipelined Floating-Point Reg-Reg Move
    Assembler pseudo-operation
    pfmov.ss fsrc1, fdest = pfiadd.ss fsrc1, f0, fdest
    pfmov.dd fsrc1, fdest = pfiadd.dd fsrc1, f0, fdest
    pfmov.sd fsrc1, fdest = pfamov.sd fsrc1, fdest
    pfmov.ds fsrc1, fdest = pfamov.ds fsrc1, fdest

pfmsm.p   fsrc1, fsrc2, fdest ................................. Pipelined Floating-Point Subtract and Multiply
    fdest ← last stage Multiplier result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 - A-op2
    M pipeline first stage ← M-op1 × M-op2
pfmul.p   fsrc1, fsrc2, fdest ................................. Pipelined Floating-Point Multiply
    fdest ← last stage Multiplier result
    Advance M pipeline one stage
    M pipeline first stage ← fsrc1 × fsrc2

pfmul3.dd fsrc1, fsrc2, fdest ................................. Three-Stage Pipelined Multiply
    fdest ← last stage Multiplier result
    Advance 3-Stage M pipeline one stage
    M pipeline first stage ← fsrc1 × fsrc2

pform     fsrc1, fdest ........................................ Pipelined OR to MERGE Register
    fdest ← last stage Graphics result
    last stage Graphics result ← fsrc1 OR MERGE
    MERGE ← 0

pfsm.p    fsrc1, fsrc2, fdest ................................. Pipelined Floating-Point Subtract and Multiply
    fdest ← last stage Adder result
    Advance A and M pipeline one stage (operands accessed before advancing pipeline)
    A pipeline first stage ← A-op1 - A-op2
    M pipeline first stage ← M-op1 × M-op2

pfsub.p   fsrc1, fsrc2, fdest ................................. Pipelined Floating-Point Subtract
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← fsrc1 - fsrc2

pftrunc.v fsrc1, fdest ........................................ Pipelined Floating-Point to Integer Conversion
    fdest ← last stage Adder result
    Advance A pipeline one stage
    A pipeline first stage ← 64-bit value with low-order 32 bits
        equal to integer part of fsrc1

pfzchkl   fsrc1, fsrc2, fdest ................................. Pipelined 32-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of two 32-bit
    fields fsrc1(0)..fsrc1(1), fsrc2(0)..fsrc2(1), and fdest(0)..fdest(1)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 2 bits
    FOR i = 0 to 1
    DO
        PM [i + 6] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← last stage Graphics result
        last stage Graphics result ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0
pfzchks   fsrc1, fsrc2, fdest ................................. Pipelined 16-Bit Z-Buffer Check
    Consider fsrc1, fsrc2, and fdest as arrays of four 16-bit
    fields fsrc1(0)..fsrc1(3), fsrc2(0)..fsrc2(3), and fdest(0)..fdest(3)
    where zero denotes the least-significant field.
    PM ← PM shifted right by 4 bits
    FOR i = 0 to 3
    DO
        PM [i + 4] ← fsrc2(i) ≤ fsrc1(i) (unsigned)
        fdest(i) ← last stage Graphics result
        last stage Graphics result ← smaller of fsrc2(i) and fsrc1(i)
    OD
    MERGE ← 0
pst.d     fdest, #const(isrc2) ................................ Pixel Store
pst.d     fdest, #const(isrc2)++ .............................. Pixel Store Autoincrement
    Pixels enabled by PM in mem.d (isrc2 + #const) ← fdest
    Shift PM right by 8/pixel size (in bytes) bits
    IF autoincrement
    THEN isrc2 ← #const + isrc2
    FI

shl       isrc1, isrc2, idest ................................. Shift Left
    idest ← isrc2 shifted left by isrc1 bits

shr       isrc1, isrc2, idest ................................. Shift Right
    SC (in psr) ← isrc1
    idest ← isrc2 shifted right by isrc1 bits

shra      isrc1, isrc2, idest ................................. Shift Right Arithmetic
    idest ← isrc2 arithmetically shifted right by isrc1 bits

shrd      isrc1, isrc2, idest ................................. Shift Right Double
    idest ← low-order 32 bits of isrc1:isrc2 shifted right by SC bits
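
A minimal Python sketch of the double shift follows. It assumes SC is a five-bit count (0 to 31) previously loaded by shr; the function name shrd merely mirrors the mnemonic and is not an existing library call.

```python
MASK32 = 0xFFFFFFFF

def shrd(isrc1, isrc2, sc):
    """Low-order 32 bits of the 64-bit value isrc1:isrc2 shifted right by SC."""
    combined = ((isrc1 & MASK32) << 32) | (isrc2 & MASK32)
    return (combined >> (sc & 0x1F)) & MASK32   # assumption: SC is 5 bits wide
```

One consequence of this definition is that passing the same value as both sources rotates it right by SC bits.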

st.c      isrc1ni, csrc2 ...................................... Store to Control Register
    csrc2 ← isrc1ni

st.x      isrc1ni, #const(isrc2) .............................. Store Integer
    mem.x (isrc2 + #const) ← isrc1ni

subs      isrc1, isrc2, idest ................................. Subtract Signed
    idest ← isrc1 - isrc2
    OF ← (bit 31 carry ≠ bit 30 carry)
    CC set if isrc2 > isrc1 (signed)
    CC clear if isrc2 ≤ isrc1 (signed)

subu      isrc1, isrc2, idest ................................. Subtract Unsigned
    idest ← isrc1 - isrc2
    OF ← NOT (bit 31 carry)
    CC ← bit 31 carry
    (i.e. CC set if isrc2 ≤ isrc1 (unsigned)
    CC clear if isrc2 > isrc1 (unsigned))
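
The flag rules for subs and subu can be checked with a small model. This sketch assumes the subtraction is carried out as isrc1 + NOT(isrc2) + 1, so that "bit 31 carry" is the carry out of that addition; the function names mirror the mnemonics and the dictionary layout is purely illustrative.

```python
MASK32 = 0xFFFFFFFF

def _carries(a, b):
    """Return (sum, carry out of bit 31, carry out of bit 30) for a + ~b + 1."""
    nb = ~b & MASK32
    total = a + nb + 1
    carry31 = (total >> 32) & 1
    carry30 = (((a & 0x7FFFFFFF) + (nb & 0x7FFFFFFF) + 1) >> 31) & 1
    return total & MASK32, carry31, carry30

def subu(isrc1, isrc2):
    result, c31, _ = _carries(isrc1 & MASK32, isrc2 & MASK32)
    return result, {"CC": c31, "OF": c31 ^ 1}    # OF = NOT(bit 31 carry)

def subs(isrc1, isrc2):
    a, b = isrc1 & MASK32, isrc2 & MASK32
    result, c31, c30 = _carries(a, b)
    signed = lambda v: v - (1 << 32) if v & 0x80000000 else v
    cc = 1 if signed(b) > signed(a) else 0       # CC set if isrc2 > isrc1
    return result, {"CC": cc, "OF": c31 ^ c30}   # OF = carries differ
```

With these definitions, subu(5, 3) sets CC because isrc2 does not exceed isrc1, while subu(3, 5) clears CC and sets OF.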


trap      isrc1ni, isrc2, idest ............................... Software Trap
    Generate trap with IT set in psr

unlock ........................................................ End Interlocked Sequence
    Clear BL in dirbase. The next load or store unlocks the bus.
    Enable interrupts after bus is unlocked.

xor       isrc1, isrc2, idest ................................. Logical Exclusive OR
    idest ← isrc1 XOR isrc2
    CC set if result is zero, cleared otherwise

xorh      #const, isrc2, idest ................................ Logical Exclusive OR High
    idest ← (#const shifted left 16 bits) XOR isrc2
    CC set if result is zero, cleared otherwise
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Table 8.2. FADDP MERGE Update 


Fields Loaded From Result into MERGE        Right Shift Amount (Field Size)

63..56, 47..40, 31..24, 15..8
63..58, 47..42, 31..26, 15..10
63..56, 31..24


8.2 Instruction Format and Encoding 


All instructions are 32 bits long and begin on a four-byte boundary. When operands are registers, the
register encodings shown in Table 8.3 are used. There are two general core-instruction formats,
REG-format and CTRL-format, as well as a separate format for floating-point instructions.


8.2.1 REG-FORMAT INSTRUCTIONS 


Within the REG-format are several variations as shown in Figure 8.1. Table 8.4 gives the encodings
for these instructions. One encoding is an escape code that defines yet another variation: the core
escape instructions. Figure 8.2 shows the format of this group, and Table 8.5 shows the encodings.


In these instructions, the src2 field selects one of the 32 integer registers (most instructions) or five
control registers (st.c and ld.c). Dest selects one of the 32 integer registers (most instructions) or
floating-point registers (fld, fst, pfld, pst, ixfr). For instructions where src1 is optionally an
immediate value, bit 26 of the opcode (I-bit) indicates whether src1 is an immediate. If bit 26 is clear,
an integer register is used; if bit 26 is set, src1 is contained in the low-order 16 bits, except for bte
and btne instructions. For bte and btne, the five-bit immediate value is contained in the src1 field.
For st, bte, btne, and bla, the upper five bits of the offset or broffset are contained in the dest field
instead of src1, and the lower 11 bits of offset are the lower 11 bits of the instruction.
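
The field rules in this paragraph can be illustrated with a short decoder. This is a sketch only: the function names are hypothetical, and the field positions (src2 in bits 25..21, dest in bits 20..16, src1 in bits 15..11) are assumptions taken from the usual i860 layout, since they come from Figure 8.1 rather than from the text.

```python
def decode_reg_format(word):
    """Split a 32-bit REG-format instruction word into its fields."""
    opcode = (word >> 26) & 0x3F
    return {
        "opcode": opcode,
        "i_bit": opcode & 1,            # bit 26: 1 means src1 is an immediate
        "src2": (word >> 21) & 0x1F,    # assumed position, bits 25..21
        "dest": (word >> 16) & 0x1F,    # assumed position, bits 20..16
        "src1": (word >> 11) & 0x1F,    # assumed position, bits 15..11
        "imm16": word & 0xFFFF,         # low-order 16 bits when I-bit is set
    }

def split_offset(word):
    """Rebuild the 16-bit offset of st, bte, btne, and bla: the upper five
    bits come from the dest field, the lower 11 bits from the instruction."""
    high5 = (word >> 16) & 0x1F
    low11 = word & 0x7FF
    return (high5 << 11) | low11
```

The caller still has to consult the opcode to decide which of src1, imm16, or the split offset is meaningful for a given instruction.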
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Table 8.3. Register Encoding 


Register                     Encoding

r0                           0
 .                           .
 .                           .
r31                          31
f0                           0
 .                           .
 .                           .
f31                          31
Fault Instruction            0
Processor Status             1
Directory Base               2
Data Breakpoint              3
Floating-Point Status        4
Extended Process Status      5


For ld and st, bits 28 and zero determine operand size as follows:

    Bit 28    Bit 0    Operand Size
      0         0      8-bits
      0         1      8-bits
      1         0      16-bits
      1         1      32-bits

When src1 is an immediate and bit 28 is set, bit zero of the immediate value is forced to zero.


For fld, fst, pfld, pst, and flush, bit 0 selects autoincrement addressing if set. For fld, fst, pfld,
and pst, bits one and two select the operand size as follows:

    Bit 1    Bit 2    Operand Size
      0        0      64-bits
      0        1      128-bits
      1        0      32-bits
      1        1      32-bits

When src1 is an immediate value, bits zero and one of the immediate value are forced to zero to
maintain alignment. When bit one of the immediate value is clear, bit two is also forced to zero.


For flush, bits one and two must be zero.


[Figure: bit-field layouts of the REG-format variations: General Format; 16-Bit Immediate Variant
(except bte and btne); st, bla, bte, and btne; bte and btne with 5-Bit Immediate]

Figure 8.1. REG-Format Variations




Table 8.4. REG-Format Opcodes 


                                                     31                     26
ld.x                 Load Integer                     0   0   0   L   0   I
st.x                 Store Integer                    0   0   0   L   1   1
ixfr                 Integer to F-P Reg Transfer      0   0   0   0   1   0
                     (reserved)                       0   0   0   1   1   0
fld.x, fst.x         Load/Store F-P                   0   0   1   0   LS  I
flush                Flush                            0   0   1   1   0   1
pst.d                Pixel Store                      0   0   1   1   1   1
ld.c, st.c           Load/Store Control Register      0   0   1   1   LS  0
bri                  Branch Indirect                  0   1   0   0   0   0
trap                 Trap                             0   1   0   0   0   1
                     (Escape for F-P Unit)            0   1   0   0   1   0
                     (Escape for Core Unit)           0   1   0   0   1   1
bte, btne            Branch Equal or Not Equal        0   1   0   1   E   I
pfld.y               Pipelined F-P Load               0   1   1   0   0   I
                     (CTRL-Format Instructions)       0   1   1   x   x   x
addu, -s, subu, -s   Add/Subtract                     1   0   0   SO  AS  I
shl, shr             Logical Shift                    1   0   1   0   LR  I
shrd                 Double Shift                     1   0   1   1   0   0
bla                  Branch LCC Set and Add           1   0   1   1   0   1
shra                 Arithmetic Shift                 1   0   1   1   1   I
and(h)                                                1   1   0   0   H   I
andnot(h)                                             1   1   0   1   H   I
or(h)                                                 1   1   1   0   H   I
xor(h)                                                1   1   1   1   H   I
                     (reserved)                       1   1   x   x   1   0

L   Integer Length                              AS  Add/Subtract
    0: 8 bits                                       0: Add
    1: 16 or 32 bits (selected by bit 0)            1: Subtract
LS  Load/Store                                  LR  Left/Right
    0: Load                                         0: Left Shift
    1: Store                                        1: Right Shift
SO  Signed/Ordinal                              E   Equal
    0: Ordinal                                      0: Branch on Not Equal
    1: Signed                                       1: Branch on Equal
H   High                                        I   Immediate
    0: and, or, andnot, xor                         0: src1 is register
    1: andh, orh, andnoth, xorh                     1: src1 is immediate


[Figure: bit-field layout of the core escape instructions, bits 31..0]


Figure 8.2. Core Escape Instruction Format 
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Table 8.5. Core Escape Opcodes 
           (reserved)
lock       Begin Interlocked Sequence
calli      Indirect Subroutine Call
           (reserved)
intovr     Trap on Integer Overflow
           (reserved)
           (reserved)
unlock     End Interlocked Sequence
           (reserved)
           (reserved)
           (reserved)


8.2.2 CTRL-FORMAT INSTRUCTIONS 


The CTRL instructions do not refer to registers, so instead of the register fields, they have a 26-bit relative
branch offset. Figure 8.3 shows the format of these instructions and Table 8.6 defines the encodings.


[Figure: CTRL instruction bit-field layout, opcode in the high-order bits and BROFFSET in bits 25..0]


BROFFSET is a signed 26-bit relative branch offset. 


Figure 8.3. CTRL Instruction Format 
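

Because BROFFSET is a signed 26-bit field, a decoder has to sign-extend it before use. The following is a minimal sketch; the function name is hypothetical, and how the offset is scaled and added to the instruction address is deliberately left out because this section does not state it.

```python
def sign_extend_broffset(word):
    """Extract the 26-bit BROFFSET field (bits 25..0) as a signed integer."""
    broffset = word & 0x03FFFFFF
    if broffset & 0x02000000:        # bit 25 is the sign bit
        broffset -= 1 << 26
    return broffset
```

For instance, a field of all ones decodes to -1, and the largest positive offset is 0x01FFFFFF.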


Table 8.6. CTRL-Format Opcodes 
           (reserved)
           (reserved)
br         Branch Direct
call       Call
bc(.t)     Branch on CC Set
bnc(.t)    Branch on CC Clear

T   Taken
    0: bc or bnc
    1: bc.t or bnc.t
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8.2.3 FLOATING-POINT INSTRUCTIONS 


The floating-point instructions also constitute an escape series. All these instructions begin with the bit
sequence 010010. Figure 8.4 shows the format of the floating-point instructions, and Table 8.7 gives the
encodings. Within the dual-operation instructions is a subcode DPC whose values are given in Table 8.8
along with the mnemonic that corresponds to each.
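
A decoder can recognize the two escape series from the top six bits of the instruction word. The sketch below uses the bit sequence 010010 quoted above for the floating-point unit and 010011 from Table 8.4 for the core escape; the function name classify_escape is hypothetical.

```python
FP_ESCAPE = 0b010010      # floating-point unit escape (bits 31..26)
CORE_ESCAPE = 0b010011    # core unit escape, per Table 8.4

def classify_escape(word):
    """Return 'fp', 'core', or None depending on the opcode in bits 31..26."""
    opcode = (word >> 26) & 0x3F
    if opcode == FP_ESCAPE:
        return "fp"
    if opcode == CORE_ESCAPE:
        return "core"
    return None
```

All remaining decoding of a floating-point instruction then depends on the lower-order fields shown in Figure 8.4.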


[Figure: floating-point instruction bit-field layout, bits 31..0, with the SRC2, DEST, and SRC1 register
fields and the P, D, S, and R bits]


SRC1, SRC2   Source; one of 32 floating-point registers
DEST         Destination register
             (instructions other than fxfr) one of 32 floating-point registers
             (fxfr) one of 32 integer registers


P   Pipelining                                  S   Source Precision
    1: Pipelined instruction mode                   1: Double-precision source operands
    0: Scalar instruction mode                      0: Single-precision source operands
D   Dual-Instruction Mode                       R   Result Precision
    1: Dual-instruction mode                        1: Double-precision result
    0: Single-instruction mode                      0: Single-precision result


Figure 8.4. Floating-Point Instruction Encoding 


Table 8.7. Floating-Point Opcodes 
pfam           Add and Multiply*
pfmam          Multiply with Add*
pfsm           Subtract and Multiply*
pfmsm          Multiply with Subtract*
(p)fmul        Multiply
fmlow          Multiply Low
frcp           Reciprocal
frsqr          Reciprocal Square Root
pfmul3.dd      3-Stage Pipelined Multiply
(p)fadd        Add
(p)fsub        Subtract
(p)fix         Fix
(p)famov       Adder Move
pfgt/pfle**    Greater Than
pfeq           Equal
(p)ftrunc      Truncate
fxfr           Transfer to Integer Register
(p)fiadd       Long-Integer Add
(p)fisub       Long-Integer Subtract
(p)fzchkl      Z-Check Long
(p)fzchks      Z-Check Short
(p)faddp       Add with Pixel Merge
(p)faddz       Add with Z Merge
(p)form        OR with MERGE Register

*pfam and pfsm have P-bit set; pfmam and pfmsm have P-bit clear.
**pfgt has R bit cleared; pfle has R bit set.

NOTE:
All opcodes not shown are reserved.
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The following table shows the opcode mnemonics that generate the various encodings of DPC and explains 
each encoding. 


Table 8.8. DPC Encoding

[Table: for each DPC value, the mnemonic, the operands routed to the multiplier and adder units
(src1, src2, T, A result, or M result; columns include A-Unit op1 and A-Unit op2), and the Yes/No
load columns, including K Load*]

Mnemonics:

rat1p2     m12apm     ra1p2      m12ttpa
mrmt1p2    mm12mpm    mrm1p2     mm12ttpm
mimt1p2    mm12tpm    mim1p2
rat1s2     m12asm     ra1s2      m12ttsa
iat1s2     m12tsm     ia1s2      m12tsa
mr2s1      mr2st      mr2ms1     mr2mst
mi2s1      mi2st      mi2ms1     mi2mst
mrmt1s2    mm12msm    mrm1s2     mm12ttsm
mimt1s2    mm12tsm    mim1s2

Intel-Reserved
*If K-load is set, KR is loaded when operand-1 of the multiplier is KR; KI is loaded when operand-1 of the multiplier is KI.
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i860 microprocessor instructions take one clock to 
execute unless a freeze condition is invoked. Freeze 
conditions and their associated delays are shown in the table below. Freezes due to multiple
simultaneous cache misses result in a delay that is the sum of the delays for processing each miss
by itself. Other multiple freeze conditions usually add only the delay of the longest individual freeze.


Freeze Condition, followed by its Delay:

Instruction-cache miss
    Delay: Number of clocks to read instruction (from ADS# clock to first READY# clock) plus time
    to last READY# of block when jump or freeze occurs during miss processing plus two clocks if
    data-cache being accessed when instruction-cache miss occurs.

Reference to destination of ld instruction that misses
    Delay: One plus number of clocks to read data (from ADS# clock to first READY# clock) minus
    number of instructions executed since load (not counting instruction that references load
    destination)

fld miss
    Delay: One plus number of clocks until first READY# returned (for 32- or 64-bit read cycles)
    or until second READY# returned (for 128-bit fld.q read cycles)

call, calli, ixfr, fxfr, ld.c, or st.c and data cache load miss processing in progress
    Delay: One plus number of clocks until first READY# returned (for 64-bit read cycles) or until
    second READY# returned (for 128-bit fld.q read cycles)

ld/st/pfld/fld/fst and data cache load miss processing in progress
    Delay: One plus number of clocks until last READY# returned

Reference to dest of ld, call, calli, fxfr, or ld.c in the next instruction. (Dest of call and calli is r1.)
    Delay: One clock


Reference to dest of fld/pfld/ixfr in the next two instructions
    Delay: Two clocks in the first instruction; one in the second instruction

bc/bnc/bc.t/bnc.t following fadd/fsub/pfeq/pfgt
    Delay: One clock

Fsrc1 of multiplier operation refers to result of previous operation
    Delay: One clock

Floating-point operation or graphics-unit instruction or fst, and scalar operation in progress other
than frcp or frsqr
    Delay: If the scalar operation is fadd, fix, fmlow, fmul.ss, fmul.sd, ftrunc, or fsub, two minus
    the number of instructions (or dual-mode pairs) already executed after the scalar operation. If
    the scalar operation is fmul.dd, three minus the number of instructions (or dual-mode pairs)
    executed after it. Add one if either or both of these two situations occur:
    1. There is an overlap between the result register of the previous scalar operation and the
       source of the floating-point operation, and the destination precision of the scalar operation
       is different than the source precision of the floating-point operation.
    2. The floating-point operation is pipelined and its destination is not f0.
    There is no delay if the result is negative.

Multiplier operation preceded by a double-precision multiply
    Delay: One clock

Multiplier operation with data pattern requiring an extra rounding operation
    Delay: One clock

TLB miss
    Delay: Five plus the number of clocks to finish two reads plus the number of clocks to set
    A-bits (if necessary)

pfld when three pfld's are outstanding
    Delay: One plus the number of clocks to return data from first pfld

pfld hits in the data cache
    Delay: Two plus the number of clocks to finish all outstanding accesses

st, pst or fst miss, ld miss, or flush with modified block when store path full (two stores or one
256-bit write-back internally waiting for bus plus external bus pipeline full)
    Delay: One plus the number of clocks until READY# active on next 64-bit write cycle or second
    READY# of next 128-bit write cycle.

ld, fld, pfld, st, pst, or fst when address path full (one address internally waiting for bus plus
external bus pipeline full)
    Delay: Number of clocks until next nonrepeated address can be issued (i.e., an address that is
    not the 2nd-4th cycle of a cache fill, the 2nd-8th cycle of a CS8 mode instruction fetch, nor
    the 2nd cycle of a 128-bit write)

ld/fld following st/fst hit
    Delay: One clock
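
The delay rule for a floating-point operation issued while a scalar operation is still in progress reads like a small formula, so a sketch may help. The function and parameter names below are hypothetical, and the code assumes the "add one" adjustment is applied before the result is clamped at zero, which the table does not spell out.

```python
def scalar_in_progress_delay(scalar_op, executed_since,
                             precision_overlap=False,
                             pipelined_nonzero_dest=False):
    """Freeze clocks when an F-P operation follows a scalar operation.

    scalar_op: mnemonic of the scalar operation in progress, e.g. "fadd".
    executed_since: instructions (or dual-mode pairs) executed after it.
    """
    base = 3 if scalar_op == "fmul.dd" else 2
    delay = base - executed_since
    if precision_overlap or pipelined_nonzero_dest:
        delay += 1                   # added once, even if both situations occur
    return max(delay, 0)             # no delay if the result is negative
```

For example, an operation issued immediately after an fadd freezes for two clocks, while one issued five instructions later does not freeze at all.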
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Delayed branch not taken One clock 


Nondelayed branch taken:
    bc, bnc                                              One clock
    bte, btne                                            Two clocks

Indirect branch bri or calli                             One clock
st.c                                                     Two clocks

Result of graphics-unit instruction (other than          One clock
fmov.dd) used in next instruction when the next
instruction is an adder- or multiplier-unit instruction

Result of graphics-unit instruction used in next         One clock
instruction when the next instruction is a
graphics-unit instruction

flush followed by flush                                  Three clocks minus the number of instructions
                                                         between the two flush instructions. There is no
                                                         delay if the result is negative.

fst or pst followed by pipelined floating-point          One clock
operation that overwrites the register being stored
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8.4 Instruction Characteristics 


The following table lists some of the characteristics of each instruction. The characteristics are:

• What processing unit executes the instruction. The codes for processing units are:
      A    Floating-point adder unit
      E    Core execution unit
      G    Graphics unit
      M    Floating-point multiplier unit

• Whether the instruction is pipelined or not. A P indicates that the instruction is pipelined.

• Whether the instruction is a delayed branch instruction. A D marks the delayed branches.

• Whether the instruction changes the condition code CC. A CC marks those instructions that
  change CC.

• Which faults can be caused by the instruction. The codes used for exceptions are:
      IT   Instruction Fault
      SE   Floating-Point Source Exception
      RE   Floating-Point Result Exception, including overflow, underflow, inexact result
      DAT  Data Access Fault

  Note that this is not the same as specifying at which instructions faults may be reported. A
  result exception is reported on the subsequent floating-point instruction, pst, fst, or sometimes
  fld, pfld, and ixfr.

  The instruction access fault IAT and the interrupt trap IN are not shown in the table because they
  can occur for any instruction.

• Performance notes. These comments regarding optimum performance are recommendations only. If
  these recommendations are not followed, the i860 microprocessor automatically waits the necessary
  number of clocks to satisfy internal hardware requirements. The following notes define the numeric
  codes that appear in the instruction table:

  1. The following instruction should not be a conditional branch (bc, bnc, bc.t, or bnc.t).

  2. The destination should not be a source operand of the next two instructions.

  3. A load should not directly follow a store that is expected to hit in the data cache.

  4. When the prior instruction is scalar, fsrc1 should not be the same as the fdest of the prior
     operation.

  5. The fdest should not reference the destination of the next instruction if that instruction is a
     pipelined floating-point operation.

  6. The destination should not be a source operand of the next instruction. (For call and calli,
     the destination is r1.)

  7. When the prior operation is scalar and multiplier op1 is fsrc1, fsrc2 should not be the same
     as the fdest of the prior operation.

  8. When the prior operation is scalar, fsrc1 and fsrc2 of the current operation should not be the
     same as fdest of the prior operation.

  9. A pfld should not immediately follow a pfld.

• Programming restrictions. These indicate combinations of conditions that must be avoided by
  programmers, assemblers, and compilers. The following notes define the alphabetic codes that
  appear in the instruction table:

  a. The sequential instruction following a delayed control-transfer instruction may not be another
     control-transfer instruction (except in the case of external interrupts), nor a trap instruction,
     nor the target of a control-transfer instruction.

  b. When using a bri to return from a trap handler, programmers should take care to prevent traps
     from occurring on that or on the next sequential instruction. IM should be zero (interrupts
     disabled) when the bri is executed.

  c. If fdest is not zero, fsrc1 must not be the same as fdest.

  d. When fsrc1 goes to the multiplier, KR, or KI, fsrc1 must not be the same as fdest.

  e. If fdest is not zero, fsrc1 and fsrc2 must not be the same as fdest.

  f. isrc1 must not be the same as isrc2 for the autoincrementing form of this instruction.

  g. isrc1 must not be the same as isrc2.
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Table 8.9 Instruction Characteristics 


[Table: for each instruction, its execution unit, Pipelined?, Delayed?, CC, faults, performance notes,
and programming restrictions]

Instructions listed (CC marks those that change the condition code):

adds         addu         and (CC)     andh (CC)    andnot (CC)  andnoth (CC)
bc           bc.t         bla          bnc
frsqr.p      fst.y        fsub.p       ftrunc.p     fxfr
fzchkl       fzchks       intovr       ixfr         ld.c

pfaddz       pfam.p       pfamov.r     pfeq.p       pfgt.p
pfiadd.z     pfisub.z     pfix.p       pfld.z       pfmam.p
pfmsm.p      pfmul.p      pfmul3.dd    pform        pfsm.p
pfsub.p      pftrunc.p    pfzchkl      pfzchks      pst.d
shl
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DATA SHEET REVISION REVIEW 


The following list represents the key differences between version 002 and version 001 of the i860
Microprocessor Data Sheet.

1. Big-endian description in section 2.3 has been expanded.

2. Bit 17 of the Extended Processor Status Register (EPSR) is the INT bit which reflects the value
   on the interrupt pin (INT), as described in section 2.2.4 entitled "EXTENDED PROCESSOR
   STATUS REGISTER". This is a documentation update only.

3. The cacheability of a page is controlled by NOR'ing the value of the CD, WT bits and the
   KEN# input pin, as described in section 2.5 entitled "Caching and Cache Flushing" and section
   3.1.14 entitled "Cache Enable (KEN#)". This is a documentation update only.

4. The NOTE section in section 2.5 entitled "Caching and Cache Flushing" has been updated to
   clarify the paging requirement on changing the DTB field in the dirbase register.

5. Information on register encoding is added in section 8.2 entitled "Instruction Format and
   Encoding". This is a documentation update only.


The following list represents the key differences be- 
tween version 003 and version 002 of the i860 Mi- 
croprocessor Data Sheet. 


Specification Changes: 


1. Specification changes for improved AC performance 
   are in section 7.3. 

2. HOLD is acknowledged during locked bus cycles. 
   See section 3.1.8. 

3. Additional paths have been added to the bus 
   state diagram to allow direct transitions from 
   states T12 and T11 to state TH. See Figures 4.1 
   and 4.10. 

4. Two new instructions, (p)famov.r, have been 
   added. These replace (p)fadd.ds and (p)fadd.sd 
   in the assembler pseudo-ops (p)fmov.r. These 
   changes are in section 8.1 and tables 2.7, 8.7, 
   and 8.9. 


Documentation Changes: 


1. Big and little endian description has been 
   expanded in sections 2.2.2, 2.3, and Figure 2.8. 

2. The actions and explanations of the lock, unlock, 
   and st.c dirbase changing the BL bit have been 
   updated in sections 2.2.4, 3.1.5, 3.1.8, 4.3.4, 
   4.3.5, and 8.1. 

3. The explanation of the AA and MA bits of the 
   fpsr has been expanded in section 2.2.8. 

4. The explanation of the WT bit of the Page Table 
   Entries has been expanded in sections 2.4.4.4 
   and 2.5. 

5. A change concerning the locking of the bus during 
   address translation is explained in sections 
   2.4.5 and 2.8.5. 

6. A further explanation on when to flush the data 
   cache is given in section 2.5. 

7. The explanation of the floating point multiplier 
   pipeline has been expanded in section 2.6.1. 

8. The explanation of BREQ has been expanded in 
   section 3.1.4 and Figure 4.1. 

9. The explanation of result exceptions has been 
   expanded in sections 2.8 and 3.2. 

10. Instruction fetch identification has been clarified 
    in section 3.1.6 and table 3.2. 

11. Bus cycle diagrams in Figures 4.7, 4.8, and 4.10 
    have been clarified/corrected. 

12. Precision specification .r has been added to 
    section 8.0 and table 8.1. 

13. In section 8.4, performance note 9 has been 
    added, programming restriction d has been 
    changed, and programming restriction f has 
    been added. Table 8.9 has been updated to 
    reflect these changes. 

14. The description of testability has changed in 
    sections 3.3. and 3.3.2. RESET and HOLD must 
    be asserted by the tester to force the chip 
    outputs to float (tri-state). 


The following list represents the major differences 
between this version and version 003 of the i860 
Microprocessor Data Sheet: 

Section 2.2.4   The explanation of the WP bit of the 
                epsr has been expanded. 

Section 2.8.2   More information on the instruction 
                trap has been added. 

Section 2.8.4   The instruction access trap has 
                been clarified. 

Section 2.8.7   The values of registers after a reset 
                trap have been specified. 

Section 3.1.4   BREQ timing has been clarified. 

Section 3.1.5   The calculation of interrupt latency 
                has been corrected. 

Section 3.1.6   The description of the byte-enable 
                signals has been expanded. 

Section 3.1.8   The relation between the lock 
                instruction and the LOCK# signal has 
                been clarified. The BL bit should no 
                longer be changed by writing to the 
                dirbase register. 

Section 6.0     The thermal specifications have 
                been updated. 

Section 7.3     The A.C. characteristics for CLK 
                have changed. 

Section 7.3     Advance timing information for the 
                50 MHz clock rate has been added. 
                These timings are subject to change 
                without notice. Contact Intel 
                Corporation for design-in information. 

Section 8.0     The operand naming conventions 
                have improved. 

Section 8.2.1   The encoding of the flush instruction 
                has been corrected. 

Section 8.3     The data-dependent multiplier freeze 
                has been eliminated. Other freeze 
                conditions have been corrected or 
                clarified. 
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Introduction 


The i860™ 64-bit microprocessor is a general-purpose 
CPU with on-chip integer unit, floating point, memory 
management, caches, and graphics. The i860 microprocessor 
supports 3-D graphics software with the following 
functions: 


1. Hidden surface elimination 
2. Distance interpolation 
3. Intensity interpolation for 3-D shading 


The fzchks (Z-buffer Check) and pst (Pixel Store) in- 
structions expedite hidden surface elimination. Dis- 
tance interpolation is accomplished with faddz (Add 
with Z merge), and intensity interpolation occurs with 
faddp (Add with Pixel Merge). The purpose of this ap- 
plication note is to illustrate the intended use of these 
instructions in a manner independent of any graphics 
environment in which the instructions might be used. It 
is not the purpose of this application note to present the 
most efficient instruction sequences. While the inner 
loop of Example 7 has as few instructions as logically 
possible, the other examples are intended to present 
general concepts, not optimum implementations. Tuning 
for maximum performance depends on the specific 
environment. 


This application note assumes familiarity with the 
i860™ 64-bit Microprocessor Programmer’s Reference 
Manual (Intel order number 240329); the i860 micro- 
processor instructions for graphics are detailed in sec- 
tion 6.6. 


1.0 3-D RENDERING 


This series of examples are routines that might be used 
at the lowest level of a graphics software system to con- 
vert a machine-independent description of a 3-D image 
into values for the frame buffer of a color video display. 
Typically, higher-level graphics routines represent an 
object as a set of polygons that together roughly de- 
scribe the surfaces of the objects to be displayed. The 
graphics system maintains a database that describes 
these polygons in terms of their colors, properties of 
reflectance or translucence, and the locations in 3-D 
space of their vertices. Due to the roughness of the 
representation, the amount of information in the data- 
base is considerably less than that which must be deliv- 
ered to the video display. A rendering procedure, such 
as Example 7, uses interpolation to derive the detailed 
information needed for each pixel in the graphics frame 
buffer. The rendering procedure also performs pixel-by- 
pixel hidden-surface elimination. 


The focus of this series of examples is Example 7, 
which operates on a segment of a scan line. The segment 
is bounded by two points of given location and 
color: from point (X1, Y0, Z1) with color intensities 
Red1, Grn1, Blu1 to point (X2, Y0, Z2) with color 
intensities Red2, Grn2, Blu2. The points and color 
intensities are determined by higher-level graphics software. 
The points represent the intersection of the scan line 
with two edges of the projected image of a polygon. For 
a given scan line, the rendering procedure is executed 
once for each polygon that projects onto that scan line. 
The higher-level graphics software is responsible for 
orienting the objects with respect to the viewer, for 
making perspective calculations, for scaling, and for 
determining the amount of light that falls on each 
polygon vertex. 


The 16-bit pixel format is used, giving ample resolution 
for color shading: 2^6 (64) intensity values for red, 
2^6 (64) intensity values for green, and 2^4 (16) 
intensity values for blue. Example 1 shows how to set 
the pixel size. For hidden-surface elimination, the 
Z-buffer (or depth buffer) technique is employed, each 
Z value having a resolution of 16 bits. 
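
As a rough illustration of the 6/6/4 split, the following Python sketch 
packs and unpacks one 16-bit pixel. The bit positions (red in bits 15..10, 
green in bits 9..4, blue in bits 3..0) are an assumption chosen to agree 
with the blue, green, red order of the faddp instructions described in 
section 3.0; the function names are hypothetical and are not part of any 
Intel software. 

```python
def pack_pixel(red, green, blue):
    """Pack 6-bit red, 6-bit green and 4-bit blue into one 16-bit pixel.

    Assumed layout: red = bits 15..10, green = bits 9..4, blue = bits 3..0.
    """
    if not (0 <= red < 64 and 0 <= green < 64 and 0 <= blue < 16):
        raise ValueError("intensity out of range for the 6/6/4 format")
    return (red << 10) | (green << 4) | blue


def unpack_pixel(pixel):
    """Split a 16-bit pixel back into (red, green, blue)."""
    return (pixel >> 10) & 0x3F, (pixel >> 4) & 0x3F, pixel & 0xF
```
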


Because the examples presented here use almost all of 
the registers of the i860 microprocessor, the registers 
are given symbolic names, as defined by Example 2. In 
a real application, it is likely that some of the inputs to 
the rendering procedure would be passed in floating- 
point registers instead of the integer registers employed 
here. The register allocation shown in Example 2 sim- 
plifies the examples by avoiding the need to use any 
register for multiple purposes. 


// SET PIXEL SIZE TO 16 
    ld.c     psr, Ra          // Work on psr 
    andnoth  0x00C0, Ra, Ra   // Clear PS 
    orh      0x0040, Ra, Ra   // PS = 16-bit pixels 
    st.c     Ra, psr          // 


Example 1. Setting Pixel Size 
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// REGISTER DEFINITIONS FOR RENDERING PROCEDURE 
// INTEGER LOCALS 
Ra      = r4    // Temporary 
Rb      = r5    // Temporary 
Rc      = r6    // Temporary 
Rd      = r7    // Temporary 

// INTEGER INPUTS 
X1      = r16   // X coordinate of starting point of line segment in pixels 
dX      = r17   // Width of scan line segment in number of pixels 
ZBP     = r18   // Z-buffer pointer to the current line segment 
Z1      = r19   // Initial Z value, fixed-point 16.16 format 
mZ      = r20   // Z Slope, fixed-point 16.16 format 
FBP     = r21   // Graphics frame buffer pointer to the current line segment 
Red1    = r22   // Initial red intensity, fixed-point 6.10 format, plus .5 
Grn1    = r23   // Initial green intensity, fixed-point 6.10 format, plus .5 
Blu1    = r24   // Initial blue intensity, fixed-point 6.10 format, plus .5 
mR      = r25   // Red Slope, fixed-point 6.10 format 
mG      = r26   // Green Slope, fixed-point 6.10 format 
mB      = r27   // Blue Slope, fixed-point 6.10 format 

// REAL LOCALS 
aZ      = f2    // Accumulated Z values 
aZh     = f3    // 
iZ1     = f4    // Z interpolant, coefficient 1.0 
iZ1h    = f5    // 
iZ3     = f6    // Z interpolant, coefficient 3.0 
iZ3h    = f7    // 
oldz    = f8    // Original values from the Z-buffer 
newz    = f10   // New Z-buffer values 
newzh   = f11   // 
newi    = f12   // New pixel values 
iR      = f14   // Red interpolant, coefficient 4.0 
iRh     = f15   // 
aR      = f16   // Accumulated red intensities 
aRh     = f17   // 
iG      = f18   // Green interpolant, coefficient 4.0 
iGh     = f19   // 
aG      = f20   // Accumulated green intensities 
aGh     = f21   // 
iB      = f22   // Blue interpolant, coefficient 4.0 
iBh     = f23   // 
aB      = f24   // Accumulated blue intensities 
aBh     = f25   // 
lZmask  = f26   // left-end Z mask 
lZmaskh = f27   // 
rZmask  = f28   // right-end Z mask 
rZmaskh = f29   // 


Example 2. Register Assignments 
2.0 DISTANCE INTERPOLATION 


To perform hidden surface elimination at each pixel, 
the rendering routine first interpolates the value of Z at 
each pixel. Distance interpolation consists of calculating 
the slope of Z over the given line segment, then 
increasing the Z value of each successive pixel by that 
amount, starting from X1. The width of the line segment 
in pixels is... 

    dX = X2 - X1 

Calculate the reciprocal of dX: 

    RdX = 1/dX 

The value of dX is used several times as a divisor. It is 
most efficient to calculate its reciprocal once, then, 
instead of dividing by dX, multiply by RdX. The slope of 
Z is... 

    mZ = (Z2 - Z1)*RdX 

Because each polygon is a plane, the value of mZ is 
constant for all scan lines that intersect the polygon; 
therefore mZ needs to be calculated only once for each 
polygon. Example 7 assumes that dX and mZ have already 
been calculated, and all that remains is to apply 
mZ to successive pixels. Let Z(Xn) be the Z value at 
pixel Xn. Then... 

    Z(X1)      = Z1 
    Z(X1 + 1)  = Z1 + mZ 
    Z(X1 + 2)  = Z1 + 2*mZ 
    ... 
    Z(X1 + N)  = Z1 + N*mZ 
    ... 
    Z(X1 + dX) = Z1 + dX*mZ = Z(X2) 

Figure 1 illustrates this Z-value interpolation. 
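
The recurrence above can be sketched in Python as follows. This is only a 
model of the arithmetic, not code for the i860 microprocessor: it assumes 
the 16.16 fixed-point format listed for Z1 and mZ in Example 2, it divides 
by dX directly rather than multiplying by RdX, and the function names are 
hypothetical. 

```python
FRAC_BITS = 16  # 16.16 fixed point, as used for Z1 and mZ in Example 2


def z_slope(z1, z2, dx):
    """Slope mZ = (Z2 - Z1)/dX as a 16.16 fixed-point integer."""
    return ((z2 - z1) << FRAC_BITS) // dx


def interpolate_z(z1, z2, dx):
    """Return the integer Z value at each pixel X1, X1+1, ..., X1+dX."""
    mz = z_slope(z1, z2, dx)
    acc = z1 << FRAC_BITS          # accumulator starts at Z(X1) = Z1
    values = []
    for _ in range(dx + 1):
        values.append(acc >> FRAC_BITS)
        acc += mz                  # Z(X1 + N + 1) = Z(X1 + N) + mZ
    return values
```

With the numbers that appear in Figure 1 (Z1 = 2400, Z2 = 3000, 12 pixels), 
the slope is 50 per pixel and the last value equals Z2. 
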


[Figure 1. Z-Buffer Interpolation: a triangle with vertices 
(r, g, b, x, y, z = 4000) and (r', g', b', x', y', z' = 800) 
and a third vertex; a 12-pixel scan-line segment runs from 
Z1 = 2400 to Z2 = 3000, so mZ = (3000 - 2400)/12 pixels.] 


The faddz instruction helps to perform the above calculations 
64 bits at a time. Because a Z value is 16 bits 
wide, Example 7 operates on the Z buffer in groups of 
four. The faddz instruction, however, treats the interpolation 
values (N*mZ) as 32-bit fixed-point numbers; 
therefore, two faddz instructions are executed for each 
group of four pixels. Because of the way the faddz shifts 
the MERGE register, the first faddz corresponds to 
even-numbered pixels, while the second corresponds to 
odd-numbered pixels. Instead of starting with the value 
for the first pixel (Z(X1)) and adding mZ to each pixel 
to produce the value for the next pixel, the example 
procedure starts with the values for the first two 
even-numbered pixels and adds 1*mZ to each of these values 
to produce the values for the adjacent odd-numbered 
pair. Adding 3*mZ to each of the Z values of an 
odd-numbered pair produces the values for the next 
even-numbered pair. Figure 2 shows one way of constructing 
the operands before starting the distance interpolations. 
(The initial value given to src1 depends on the alignment 
of the first pixel.) Table 1 helps to visualize the 
process. 


After two faddz instructions, the MERGE register 
holds the Z values for four adjacent pixels (in the cor- 
rect order). The form instruction copies MERGE into 
one of the 64-bit floating-point registers. 
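
The even/odd scheme can be modeled in a few lines of Python. The sketch 
below is a simplified model under stated assumptions, not a description of 
the hardware: the two 32-bit halves of the accumulator are treated as 
independent lanes (no carry between them), MERGE is assumed to be shifted 
right by 16 bits with the integer parts of the two sums placed in bits 
63..48 and 31..16, and the names faddz, form, and z_groups are 
illustrative Python functions. 

```python
LANE32 = 0xFFFFFFFF


def faddz(acc, interp, merge):
    """One modeled faddz: add the interpolant to both 32-bit lanes and
    merge the integer (upper 16) bits of each sum into MERGE."""
    lo = (acc[0] + interp) & LANE32
    hi = (acc[1] + interp) & LANE32
    merge = (merge >> 16) & 0x0000FFFF0000FFFF   # make room for new fields
    merge |= ((hi >> 16) << 48) | ((lo >> 16) << 16)
    return (lo, hi), merge


def form(merge):
    """Modeled form: return MERGE as four 16-bit values and clear it."""
    return [(merge >> (16 * i)) & 0xFFFF for i in range(4)], 0


def z_groups(z1, mz, groups):
    """Z values for `groups` aligned sets of four pixels.
    z1 is an integer; mz is the slope in 16.16 fixed point."""
    acc = (((z1 << 16) - 3 * mz) & LANE32,   # low lane:  Z1 - 3*mZ
           ((z1 << 16) - mz) & LANE32)       # high lane: Z1 - 1*mZ
    merge = 0
    out = []
    for _ in range(groups):
        acc, merge = faddz(acc, 3 * mz, merge)   # two even-numbered pixels
        acc, merge = faddz(acc, mz, merge)       # two odd-numbered pixels
        zs, merge = form(merge)
        out.append(zs)
    return out
```

Each call to form yields the four Z values in pixel order, lowest 
address first, which is the property the text relies on. 
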


[Figure 2. faddz Operands: the 64-bit accumulator (the initial 
src1) holds two 32-bit fixed-point values, each with an integer 
part and a fraction, one of which is Z1 - 3.0*mZ; the 
interpolants are held in the same format, the second being 
3.0*mZ.] 


Table 1. faddz Visualization 

(Columns: rdest/src1, src2.) 

Because the values of Z1 and mZ are constant for each loop through the rendering routine, the numbers shown here are 
the values of the coefficient N, where the actual operands have the values Z1 + N*mZ. For each execution of faddz, src1 
is the same as rdest of the prior faddz. After every two faddz instructions, a form instruction empties the MERGE register. 
// CONSTRUCT INTERPOLANTS iZ1 AND iZ3 GIVEN mZ 
    ixfr     mZ, iZ1        // Transfer mZ to floating-point register 
    shl      1, mZ, Ra      // Ra = 2*mZ 
    adds     Ra, mZ, Ra     // Ra = 3*mZ 
    ixfr     Ra, iZ3        // Transfer 3*mZ to floating-point register 
    fmov.ss  iZ1, iZ1h      // Join each half in 64-bit register 
    fmov.ss  iZ3, iZ3h      // Join each half in 64-bit register 

Example 3. Construction of Z Interpolants 

[Figure 3. Pixel Interpolation for Gouraud Shading: a triangle 
whose vertices have red intensities (range 0-63) of r = 20, 
r' = 40, and r'' = 40; the red value is interpolated across 
a 12-pixel scan-line segment.] 



The same register is used as both src1 and rdest in all 
faddz instructions. This register serves to accumulate Z 
values for successive pixels; therefore, it is called an 
accumulator. The registers used as src2 are called inter- 
polants. The code in Example 3 constructs the interpo- 
lants; it needs to be executed only once for each poly- 
gon. 


3.0 COLOR INTERPOLATION 


To determine the RGB color intensities at each pixel, 
the rendering routine interpolates between the color 
intensities at the end points. (This rendering technique is 
called “Gouraud shading” after H. Gouraud, “Continuous 
Shading of Curved Surfaces,” IEEE Transactions 
on Computers, C-20(6), June 1971, pp. 623-628.) Let 
the symbol C (color) represent either R (red), G 
(green), or B (blue). Color interpolation consists of 
calculating the slope of C over the given line segment, then 
increasing the C values of each successive pixel by that 
amount, starting from the values for X1. This must be 
done for C=R, C=G, and C=B. The slope of C is... 

    mC = (C2 - C1)*RdX 

where RdX = 1/dX 
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The value of mC is constant for all scan lines that inter- 
sect a given pair of polygon edges; therefore mC needs 
to be calculated only once for each such pair. Example 
7 assumes that mC has already been calculated for all 
colors, and all that remains is to apply mC to successive 
pixels. Let C(Xn) be a C value at pixel Xn. Then... 


    C(X1)      = C1 
    C(X1 + 1)  = C1 + mC 
    C(X1 + 2)  = C1 + 2*mC 
    ... 
    C(X1 + N)  = C1 + N*mC 
    ... 
    C(X1 + dX) = C1 + dX*mC = C(X2) 


Figure 3 illustrates Gouraud shading of a triangle. 


The faddp instruction performs the above calculations 
64 bits at a time. Because a pixel is 16 bits wide, 
Example 7 operates on pixels in groups of four. Instead of 
starting with the value for the first pixel (C(X1)) and 
adding mC to each pixel to produce the value for the 
next pixel, the example procedure starts with the values 
for the first four pixels and adds 4*mC to each group of 
four to produce the values for the next four. Three 
faddp instructions are executed for each group of four 
pixels. The first increments the blue values; the second, 
green; the third, red. Figure 4 shows one way of 
constructing the operands for each color before starting the 
color interpolations. (The initial value given to src1 
depends on the alignment of the first pixel.) 


Setup of the accumulator and interpolants is similar to 
that of the Z-buffer. The code in Example 4 constructs 
the interpolants; it needs to be executed only once for 
each pair of edges in each polygon. 


4.0 BOUNDARY CONDITIONS 


The i860 microprocessor operates on 64-bit quantities 
that are aligned on 8-byte boundaries. The code in this 
example takes full advantage of this design, handling 
four 16-bit pixels in each loop. However, if the first or 
last pixel of a line segment is not on an 8-byte boundary, 
two kinds of special considerations are required: 


1. Masking of Z values near the end points. 
2. Initialization of the accumulators. 


4.1 Z-Buffer Masking 


When either the first or last pixel of the line segment is 
not at an 8-byte boundary, the rendering procedure 
must mask the first or last set of new Z-buffer values 
(newz) so that the Z-buffer and the frame buffer are not 
erroneously updated. Sometimes both the first and last 
pixels are in the same 4-pixel set, in which case either 
one may not be on an 8-byte boundary. A function that 
looks up and calculates masks is shown in Example 5. 


Because the value 0xFFFF is used for masking, the 
Z-buffer is initialized with 0xFFFE, so that the fzchks 
instruction always finds the mask to be greater than 
any Z-buffer contents. 


Accumulator (initial src1) 
 63            47            31            15            0 
 | C1+3*mC frac | C1+2*mC frac | C1+mC   frac | C1      frac | 

Interpolant 
 63            47            31            15            0 
 | 4*mC    frac | 4*mC    frac | 4*mC    frac | 4*mC    frac | 


Figure 4. faddp Operands 
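
The way three faddp instructions build four pixels can be modeled in 
Python. This sketch rests on assumptions that go beyond the text: each 
16-bit lane is treated independently (no carry between lanes), MERGE is 
assumed to be shifted right by 6 bits before the upper 6 bits of each 
sum are placed in bits 15..10 of its lane, and the accumulators are 
started at C1 + (n - 4)*mC for lane n so that the first addition of 
4*mC yields C1 + n*mC, as in Figure 4 for an aligned first pixel. The 
names faddp16 and shade are hypothetical. 

```python
def faddp16(acc, interp, merge):
    """One modeled faddp on four 16-bit lanes (6.10 fixed point)."""
    acc = [(a + i) & 0xFFFF for a, i in zip(acc, interp)]
    # Shift earlier colors down by 6 bits, then insert the new top 6 bits.
    merge = [(m >> 6) | (a & 0xFC00) for m, a in zip(merge, acc)]
    return acc, merge


def shade(start, slope, groups):
    """start, slope: (red, green, blue) in 6.10 fixed point.
    Returns 4*groups pixels, assuming an aligned first pixel."""
    accs = [[(c + (n - 4) * m) & 0xFFFF for n in range(4)]
            for c, m in zip(start, slope)]
    interps = [[(4 * m) & 0xFFFF] * 4 for m in slope]
    pixels = []
    for _ in range(groups):
        merge = [0, 0, 0, 0]          # MERGE is empty after form
        for color in (2, 1, 0):       # blue first, then green, then red
            accs[color], merge = faddp16(accs[color], interps[color], merge)
        pixels.extend(merge)          # what form would copy out
    return pixels
```

In this model blue, merged first, is shifted twice and keeps only its 
upper 4 bits, while green and red keep 6 bits each, which matches the 
6/6/4 resolution stated in section 1.0. 
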


// CONSTRUCT INTERPOLANTS iR, iG, iB GIVEN mR, mG, mB 
    shl      mR,        // Multiply each color slope by four, then 
    shl                 // shift by 16 to put the significant 
    shl                 // bits into the high-order half 
    shr                 // Return significant 16 bits 
    shr                 // to low-order half. Any sign bits 
    shr                 // in high-order half are gone. 
    or                  // Join 16-bit quarters 
    or                  // in 32-bit register 
    or 
    ixfr                // Join 32-bit halves 
    ixfr                // in 64-bit register 
    ixfr 
    fmov.ss 
    fmov.ss 
    fmov.ss 


Example 4. Construction of Color Interpolants 
.macro zmask l_align, r_align, Rx, Ry 
// l_align, r_align - left- and right-end alignment (0..3) in 2-byte units 
// Rx, Ry           - Scratch registers 
    .data 
    .align 8 
left_mask::                         // low        high 
    .long 0x00000000, 0x00000000    // 0 mod 4 
    .long 0x0000FFFF, 0x00000000    // 1 mod 4 
    .long 0xFFFFFFFF, 0x00000000    // 2 mod 4 
    .long 0xFFFFFFFF, 0x0000FFFF    // 3 mod 4 
right_mask::                        // low        high 
    .long 0xFFFF0000, 0xFFFFFFFF    // 0 mod 4 
    .long 0x00000000, 0xFFFFFFFF    // 1 mod 4 
    .long 0x00000000, 0xFFFF0000    // 2 mod 4 
    .long 0x00000000, 0x00000000    // 3 mod 4 
    .text 
    shl      3, l_align, l_align    // Multiply by 8 
    mov      left_mask, Rx          // 
    fld.d    l_align(Rx), lZmask    // Load 8-byte mask 
    shl      3, r_align, r_align    // Multiply by 8 
    mov      right_mask, Rx         // 
    fld.d    r_align(Rx), rZmask    // Load 8-byte mask 
// If the first and last pixels are contained in the same 64-bit 
// aligned set, then lZmask = lZmask OR rZmask. 
    andh     0x8000, dX, r0         // Is dX negative? 
    bc       L2                     // If not, right end is in other set 
    fxfr     lZmask, Rx             // 
    fxfr     rZmask, Ry             // 
    or       Rx, Ry, Rx             // OR low-order half 
    ixfr     Rx, lZmask             // 
    fxfr     lZmaskh, Rx            // 
    fxfr     rZmaskh, Ry            // 
    or       Rx, Ry, Rx             // OR high-order half 
    ixfr     Rx, lZmaskh            // 
L2: nop                             // 


Example 5. Z Mask Procedure 
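
The two lookup tables in Example 5 follow a simple rule, which the Python 
sketch below reproduces: a left-end mask sets 0xFFFF in every 16-bit lane 
that lies before the first pixel, and a right-end mask sets 0xFFFF in 
every lane that lies after the last pixel. The function names are 
hypothetical, and the OR applied in masked_z is an assumption about how 
the mask is combined with the new Z values by the form instruction. 

```python
def left_mask(align):
    """64-bit mask for a first pixel in lane `align` (0..3)."""
    mask = 0
    for lane in range(align):            # lanes before the first pixel
        mask |= 0xFFFF << (16 * lane)
    return mask


def right_mask(align):
    """64-bit mask for a last pixel in lane `align` (0..3)."""
    mask = 0
    for lane in range(align + 1, 4):     # lanes after the last pixel
        mask |= 0xFFFF << (16 * lane)
    return mask


def split(mask):
    """Return (low, high) 32-bit halves, the order used by .long."""
    return mask & 0xFFFFFFFF, mask >> 32


def masked_z(new_z, mask):
    """Force masked lanes to 0xFFFF so they never pass the Z check."""
    return [z | ((mask >> (16 * i)) & 0xFFFF) for i, z in enumerate(new_z)]
```

When the first and last pixels fall in the same set of four, the two 
masks are ORed together, as in the tail of the macro. 
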
Table 2. Accumulator Initial Values 

Alignment    Z accumulator           Color accumulator 
of first     63..32     31..0        63..48     47..32     31..16     15..0 
pixel 
    0        Z1 - 1*mZ  Z1 - 3*mZ    C1 - 1*mC  C1 - 2*mC  C1 - 3*mC  C1 - 4*mC 
    1        Z1 - 2*mZ  Z1 - 4*mZ    C1 - 2*mC  C1 - 3*mC  C1 - 4*mC  C1 - 5*mC 
    2        Z1 - 3*mZ  Z1 - 5*mZ    C1 - 3*mC  C1 - 4*mC  C1 - 5*mC  C1 - 6*mC 
    3        Z1 - 4*mZ  Z1 - 6*mZ    C1 - 4*mC  C1 - 5*mC  C1 - 6*mC  C1 - 7*mC 


Table 3. Accumulator Initialization Table 

4.2 Accumulator Initialization 


When the first pixel of the line segment is not at an 8- 
byte boundary, initial values placed in the accumulators 
(aZ, aB, aG, and aR) must be selected so that Z1, 
Red1, Grn1, and Blu1 correspond to the correct pixel. 
The desired result is that shown by Table 2. However, 
each value is a composite of two terms: one that is 
constant for each edge pair (n*mZ, n*mR, n*mG, 
n*mB) and one that can vary with each scan line (Z1, 
Red1, Grn1, Blu1). The example assumes that the con- 
stant values have all been calculated and stored in a 
memory table of the format shown by Table 3. At the 
beginning of each line segment the values appropriate 
to the alignment of the line segment are retrieved from 
the table and added to the initial Z and color values, as 
shown in Example 6. 
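
The split into a constant term and a per-scan-line term can be sketched 
in Python. The coefficients below are read from Table 2 as understood 
here (for alignment a, the Z accumulator holds Z1 - (a+1)*mZ and 
Z1 - (a+3)*mZ, and color lane k holds C1 - (a+4-k)*mC); that reading, the 
lane numbering (lane 0 = bits 15..0), and the function names are 
assumptions made for illustration. 

```python
def table_row(align, mz, mc):
    """Constant terms for one alignment: what the memory table would hold."""
    z_const = (-(align + 3) * mz, -(align + 1) * mz)      # (low, high)
    c_const = [-(align + 4 - lane) * mc for lane in range(4)]
    return z_const, c_const


def init_accumulators(align, z1, mz, c1, mc):
    """Add the per-scan-line values Z1 and C1 to the constant terms."""
    z_const, c_const = table_row(align, mz, mc)
    a_z = (z1 + z_const[0], z1 + z_const[1])
    a_c = [c1 + c for c in c_const]
    return a_z, a_c


def first_group(a_z, a_c, mz, mc):
    """Lane values after the first two faddz and the first faddp."""
    lo, hi = a_z
    z_lanes = [lo + 3 * mz, lo + 4 * mz, hi + 3 * mz, hi + 4 * mz]
    c_lanes = [c + 4 * mc for c in a_c]
    return z_lanes, c_lanes
```

Whatever the alignment, the lane that holds the first pixel ends up with 
exactly Z1 and C1 after the first pass, which is the stated goal of the 
initialization. 
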


5.0 THE INNER LOOP 


Once the proper preparations have been made, only a 
minimal amount of code is needed to render each scan- 
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line segment of a polygon. The code shown in Example 
7 operates on four pixels in each loop. The left and 
right ends of the line segment go through different logic 
paths so that the Z-buffer masks can be applied by the 
form instruction. All the interior points are handled by 
the tight inner loop. 


The controlling variable dX is zero-relative and is ex- 
pressed as a number of pixels. The value of dX also 
indicates alignment of the end-points with respect to 
the 4-pixel groups. Unaligned left-end pixels are sub- 
tracted from dX before entering the inner loop; there- 
fore, subsequent values of dX indicate the alignment of 
the right end. A value that is 3 mod 4 indicates that the 
right end is aligned, which explains the test for a value 
of —5 near the end of the loop (—5 mod 4 = 3). The 
fact that the value —5 is loaded into register Rb on 
every execution of the loop does not represent a pro- 
gramming inefficiency, because there is nothing else for 
the core unit to do at that point anyway. 
// ACCUMULATOR INITIALIZATION TABLE 
    .data; .align .double 
acc_init_tab:: .double [16] 0 

    .dsect 
aBi:    .double         // Four initial 16-bit blue values 
aGi:    .double         // Four initial 16-bit green values 
aRi:    .double         // Four initial 16-bit red values 
aZi:    .double         // Two initial 32-bit Z values 
    .end 
    .text 

// INITIALIZE ACCUMULATORS 
.macro acc_init Lalign, Rtab, Rx, Ry, Fx, Fxh 
// Lalign - left-end alignment (0..3) in two-byte units 
// Rtab - register to use for addressing the table 
// Rx, Ry, Fx, Fxh - scratch registers 
    mov      acc_init_tab, Rtab     // 
    shl      5, Lalign, Lalign      // Multiply by row width 
    adds     Lalign, Rtab, Rtab     // Index row corresponding to alignment 
    fld.d    aZi(Rtab), aZ          // Z 
    ixfr     Z1, Fx                 // Z 
    fld.d    aRi(Rtab), aR          // R-Load constant values 
    shl      16, Red1, Rx           // R-Shift starting value to hi-order 
    fmov.ss  Fx, Fxh                // Z 
    shr      16, Rx, Ry             // R-Red1 stripped of sign bits 
    fiadd.dd Fx, aZ, aZ             // Z 
    or       Rx, Ry, Ry             // R-Form (Red1,Red1) 
    ixfr     Ry, Fx                 // R-Put in 64-bit register 
    fld.d    aGi(Rtab), aG          // G 
    shl      16, Grn1, Rx           // G 
    fmov.ss  Fx, Fxh                // R-Form (Red1,Red1,Red1,Red1) 
    shr      16, Rx, Ry             // G 
    fiadd.dd Fx, aR, aR             // R-Add variables to constants 
    or       Rx, Ry, Ry             // G 
    ixfr     Ry, Fx                 // G 
    fld.d    aBi(Rtab), aB          // B 
    shl      16, Blu1, Rx           // B 
    fmov.ss  Fx, Fxh                // G 
    shr      16, Rx, Ry             // B 
    fiadd.dd Fx, aG, aG             // G 
    or       Rx, Ry, Ry             // B 
    ixfr     Ry, Fx                 // B 
    fmov.ss  Fx, Fxh                // B 
    fiadd.dd Fx, aB, aB             // B 
.endm 


Example 6. Accumulator Initialization 
// RENDERING PROCEDURE 
// 16-bit pixels, 16-bit Z-buffer 

    and      3, X1, Ra              // Determine alignment of starting-point 
    acc_init Ra, Rb, Rc, Rd, Fa, Fah // Initialize accumulators 
    subs     4, Ra, Rb              // 4 - alignment 
    subs     dX, Rb, dX             // Adjust dX by X1 alignment 
    // If dX <= 0, then right end is in same set as left end 
    and      3, dX, Rb              // Determine alignment of right end 
    zmask    Ra, Rb, Rc, Rd         // Prepare both left- and right-end masks 
left_end::                          // Handle boundary conditions 
    d.faddz  aZ, iZ3, aZ            // Interpolate 2 even Z values 
    adds     -8, FBP, FBP           // Anticipate autoincrement 
    d.faddz  aZ, iZ1, aZ            // Interpolate 2 odd Z values 
    adds     -8, ZBP, ZBP           // Anticipate autoincrement 
    d.form   lZmask, newz           // Mask 4 new Z values 
    fld.d    8(ZBP), oldz           // Fetch 4 old Z values 
    d.faddp  aB, iB, aB             // Interpolate 4 blue intensities 
    mov      -4, Ra                 // Loop increment: 4 pixels 
    d.faddp  aG, iG, aG             // Interpolate 4 green intensities 
    adds     -4, dX, dX             // Prepare dX for bla at end of loop 
    d.faddp  aR, iR, aR             // Interpolate 4 red intensities 
    bla      Ra, dX, L1             // Initialize LCC 
    d.form   f0, newi               // Move 4 new pixels to 64-bit reg 
    adds     5, dX, r0              // Are there any whole sets (dX < -5)? 
L1: d.fzchks oldz, newz, newz       // Mark closer points in PM[7..4] 
    bc       short_segment          // Get out now if no whole set 
    d.fnop                          // 
    fld.d    16(ZBP), oldz          // Fetch 4 old Z values 
inner_loop::                        // Handle all interior points 
    d.faddz  aZ, iZ3, aZ            // Interpolate 2 even Z values 
    nop                             // 
    d.faddz  aZ, iZ1, aZ            // Interpolate 2 odd Z values 
    fst.d    newz, 8(ZBP)++         // Update Z buf from prior loop 
    d.form   f0, newz               // Move 4 new Z values to 64-bit reg 
    nop                             // 
    d.fzchks f0, f0, f0             // Shift PM[7..4] to PM[3..0] 
    mov      -5, Rb                 // -5 mod 4 = 3, aligned right end 
    d.faddp  aB, iB, aB             // Interpolate 4 blue intensities 
    pst.d    newi, 8(FBP)++         // Store pixels indicated by PM[3..0] 
    d.faddp  aG, iG, aG             // Interpolate 4 green intensities 
    xor      Rb, dX, r0             // Are we at an aligned right end? 
    d.faddp  aR, iR, aR             // Interpolate 4 red intensities 
    bc       aligned_end            // Taken if at an aligned right end 
    d.form   f0, newi               // Move 4 new pixels to 64-bit reg 
    bla      Ra, dX, inner_loop     // Loop if not at end of line segment 
    d.fzchks oldz, newz, newz       // Mark closer points in PM[7..4] 
    fld.d    16(ZBP), oldz          // Fetch 4 old Z values for next loop 
// End of inner_loop. Right end not aligned 


Example 7. 3-D Rendering (1 of 2) 
right_end::                         // Handle boundary conditions 
    d.faddz  aZ, iZ3, aZ            // Interpolate 2 even Z values 
    nop                             // 
    d.faddz  aZ, iZ1, aZ            // Interpolate 2 odd Z values 
    fst.d    newz, 8(ZBP)++         // Update Z buf from prior loop 
    d.form   rZmask, newz           // Mask 4 new Z values 
    nop                             // 
    d.fzchks f0, f0, f0             // Shift PM[7..4] to PM[3..0] 
    nop                             // 
    d.faddp  aB, iB, aB             // Interpolate 4 blue intensities 
    pst.d    newi, 8(FBP)++         // Store pixels indicated by PM[3..0] 
    d.faddp  aG, iG, aG             // Interpolate 4 green intensities 
    nop                             // 
    d.faddp  aR, iR, aR             // Interpolate 4 red intensities 
    nop                             // 

aligned_end::                       // No special boundary conditions 
    d.form   f0, newi               // Move 4 new pixels to 64-bit reg 
    br       wrap_up                // 
    d.fzchks oldz, newz, newz       // Mark closer points in PM[7..4] 
    nop                             // 

short_segment:: 
    d.fnop                          // 
    adds     8, dX, r0              // Is right end in same set as left? 
    d.fnop                          // 
    bnc.t    right_end              // Branch taken if no. 
    d.fnop                          // 
    fld.d    16(ZBP), oldz          // Fetch 4 old Z values 

wrap_up::                           // Store the unstored and leave dual mode. 
    fzchks   f0, f0, f0             // Shift PM[7..4] to PM[3..0] 
    fst.d    newz, 8(ZBP)++         // Update Z buf from prior loop 
    fnop                            // 
    pst.d    newi, 8(FBP)++         // Store pixels indicated by PM[3..0] 


Example 7. 3-D Rendering (2 of 2) 
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6.0 ALTERNATIVE IMPLEMENTATIONS 


Example 8 contrasts the inner loop of the 16-bit pixel rendering procedure with that of an 8-bit procedure. For 8-bit 
pixels, two faddp instructions accomplish 64 bits of pixel intensity interpolation; there is no need to maintain three 
separate color accumulators. Four faddz instructions (rather than two) are required, because eight Z values are 
created for the eight pixels per loop. 


// 8=-bit Pixels, 16-Bit Zbuffer = 8 Pixels in 15 Clocks 

// G-Unit Core Unit 
inner_loop:: : 

d.faddz aZ,deltaZl,aZ 

d.faddz aZ,deltaZ2,aZ 

d.form f0,newZ_A 

d.faddz aZ,deltaZl,aZ 

d.faddzz aZ,deltaZ2,aZ 

ad.form f0,newZ_B 

d.fzchks o0l1dZ_A,newZ_A,newZ_A 

ad.fzchks 01dZ_B,newZ_B,newZ_B 

d.faddp intens,dI,intens 

ad.faddp intens,dI2,intens 

fO,newi 


fld.q 16(ZBP) ,oldZ_A 
nop 

nop 

andh 0x8000,daxX, rO 
bne rightend 

nop 

nop 


newZ_A ,16(ZBP) ++ 
0,dX,end 
neg8,dX,inner_loop 
newi,8(FBP) ++ 


r 
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// 16-Bit Pixels, 16-Bit Zbuffer = 4 Pixels in 10 Clocks 
// G-Unit Core Unit 
inner_loop:: 
d.faddz aZ,iz3,aZ 
d.faddz aZ,izl,aZ 
' ad. form fO ,newz 
d.fzchks f0,f0,f0 
( aB,iB,aB 
aG,iG,aG 
aR,iRk,aR 
f0,newi 
oldz,newz,newz 


nop 
newz,8(ZBP) ++ 


-5,Rb 

newi,8 (FBP) ++ 

Rb, aX, r0 | 
aligned_end 
neg4,dX,inner_loop . 
16(ZBP) ,oldz 


we we we we 08 we we we we we 


Example 8. Inner Loop of Renderers for Two Pixel Sizes 
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ABSTRACT 


The i860 Processor computes floating-point results rap- 
idly, lending itself to DSP (digital signal processing) as 
well as general-purpose computing. With this high per- 
formance, DSP functions can be added to any system 
containing an i860 CPU. A Fast Fourier Transform 
(FFT) illustrates this DSP power. Complete code for 
the FFT is presented in this application note, as well as 
performance measurements. Both complex and real in- 
put data FFTs are included, as well as both Decimation 
in Time and Decimation in Frequency. 


1.0 INTRODUCTION TO FAST 
FOURIER TRANSFORMS 


Discrete Fourier Transforms (DFTs) change time-do- 
main data samples into a frequency-domain profile of 
the sampled signal. The frequency-domain representa- 
tion consists of the magnitudes of sine waves at various 
frequencies, which would recreate the original data if 
superimposed. To accomplish the transform, a DFT 
adds combinations of the input data samples, after mul- 
tiplying some of those inputs with weighting factors. 
The number of samples, "N", is usually a power of two. 


Each result in the frequency domain comes from a 
weighted sum of all data samples. The weighting ("W") 
factors are called "twiddles", and are complex cosine/ 
sine values for each particular frequency. 


The FFT (Fast Fourier Transform) is an efficient im- 
plementation of the DFT, defined by: 


x(n) = time-domain samples of the signal, n = 0,1,...N-1 

X(k) = the Discrete Fourier Transform of x(n), k = 0,1,...N-1 
     = a "frequency domain" equivalent of x(n) 

$$X(k) = \sum_{n=0}^{N-1} x(n)\, W_N^{nk}, \qquad W_N^{nk} = e^{-j 2\pi nk/N}, \quad j = \sqrt{-1}$$

$$X(k) = \sum_{n=0}^{N-1} x(n)\left(\cos(2\pi nk/N) - j\,\sin(2\pi nk/N)\right)$$


The (N-1) complex adds and (N-1) complex multiplications 
required for each X(k) make the DFT an Order (N^2) 
computation. Fortunately, the FFT decomposes this to an 
Order (N * log2 N) algorithm by splitting the N-sum into 
units of 2-sums. These units are called "butterflies" 
because they produce 2 output values from 2 inputs, with 
the butterfly-shaped dataflow shown below. (Some FFT 
algorithms, called Radix-4, use 4-input, 4-output 
butterflies.) The butterfly calculations are executed in 
stages, with log2 N stages and N/2 butterflies per stage. 
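
As a point of comparison for the operation counts above, the following is a minimal Python sketch (not part of the original listings) of the direct Order (N^2) DFT, together with the number of butterflies a radix-2 FFT needs. The function names `dft` and `butterfly_count` are illustrative assumptions.

```python
import cmath
import math

def dft(x):
    """Direct DFT: X(k) = sum over n of x(n) * exp(-j*2*pi*n*k/N).

    Every output needs a weighted sum of all N inputs, so the
    total work grows as N * N.
    """
    n_pts = len(x)
    result = []
    for k in range(n_pts):
        acc = 0j
        for n in range(n_pts):
            acc += x[n] * cmath.exp(-2j * math.pi * n * k / n_pts)
        result.append(acc)
    return result

def butterfly_count(n_pts):
    """Radix-2 FFT work: log2(N) stages of N/2 butterflies each.

    Assumes n_pts is a power of two.
    """
    stages = n_pts.bit_length() - 1
    return stages * (n_pts // 2)
```

For 1024 points the direct method forms 1024 * 1024 weighted terms, while `butterfly_count(1024)` gives 10 stages of 512 butterflies.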

The subdivision, or decimation, of the N-sum into 
butterflies can be done via two different methods: 
"Decimation in Time" (DIT) or "Decimation in Frequency" 
(DIF). The methods differ in the ordering of twiddles 
and the form of the butterfly arithmetic, but they yield 
the same answer. They are based on different 
mathematical derivations of the FFT: DIT results from 
recursively splitting the input time-domain samples into 
an even-indexed group and an odd-indexed group, while 
DIF comes from splitting the DFT output frequency-domain 
points into odd/even groups. 


2.0 BUTTERFLY DEFINED 


Let A    = the first input to the butterfly (complex 
           number, composed of Real part AR and 
           Imaginary part AI) 

    B    = the second input to the butterfly (complex, 
           BR and BI) 

    W    = twiddle factor (also complex, WR and WI) 

    Anew = complex result #1, which overwrites A 
    Bnew = result #2, which overwrites B 


For a "Decimation-in-Frequency" butterfly, 
Anew = A + B 
Bnew = (A - B) * W 


The complex add, subtract, and multiply of a butterfly 
decompose into 4 real multiplies, 3 real adds, and 3 real 
subtracts: 


AnewR = AR + BR     tempR = AR - BR 
AnewI = AI + BI     tempI = AI - BI 


BnewR = (tempR * WR) - (tempI * WI) 
BnewI = (tempR * WI) + (tempI * WR) 


For a "Decimation-in-Time" butterfly, 
Anew = A + (B * W) 
Bnew = A - (B * W) 


The number of real operations remains 4 multiplies and 
6 add/subtracts, but the equations differ and the 
multiplies must be done first: 


tempR = (WR * BR) - (WI * BI) 
tempI = (WR * BI) + (WI * BR) 

AnewR = AR + tempR     BnewR = AR - tempR 
AnewI = AI + tempI     BnewI = AI - tempI 
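
The two butterfly forms can be checked against ordinary complex arithmetic. The Python sketch below is an illustration only (the function names are assumptions, not routines from the listings); it spells out the real-valued equations given above, using 4 multiplies and 6 add/subtracts per butterfly.

```python
def dif_butterfly(a, b, w):
    """Decimation-in-Frequency: Anew = A + B, Bnew = (A - B) * W."""
    ar, ai, br, bi, wr, wi = a.real, a.imag, b.real, b.imag, w.real, w.imag
    anew_r = ar + br
    anew_i = ai + bi
    temp_r = ar - br
    temp_i = ai - bi
    bnew_r = (temp_r * wr) - (temp_i * wi)
    bnew_i = (temp_r * wi) + (temp_i * wr)
    return complex(anew_r, anew_i), complex(bnew_r, bnew_i)

def dit_butterfly(a, b, w):
    """Decimation-in-Time: Anew = A + (B * W), Bnew = A - (B * W)."""
    ar, ai, br, bi, wr, wi = a.real, a.imag, b.real, b.imag, w.real, w.imag
    # The multiplies come first in the DIT form.
    temp_r = (wr * br) - (wi * bi)
    temp_i = (wr * bi) + (wi * br)
    anew = complex(ar + temp_r, ai + temp_i)
    bnew = complex(ar - temp_r, ai - temp_i)
    return anew, bnew
```

Both functions should agree with the complex expressions `a + b`, `(a - b) * w`, `a + b * w`, and `a - b * w` to within floating-point rounding.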

Butterfly Dataflow: 

(Decimation in Frequency)   Anew = A + B           Bnew = (A - B) * W 
(Decimation in Time)        Anew = A + (B * W)     Bnew = A - (B * W) 
240658-1 


The stages, twiddles, and butterflies for 8-point FFTs 
are shown in Figures 1 and 2. For larger values of N, 
the dataflow patterns are very similar, with N/2 
butterflies executed at each stage, and a greater number 
of stages. Refer to a text on Digital Signal Processing 
for a complete discussion of FFT design, such as chapter 
6 of Theory and Application of Digital Signal Processing 
(see the Bibliography at the end of this note). 
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Figure 2. Decimation-In-Time FFT for 8 points 
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3.0 BIT REVERSAL 


Due to their structure, FFT algorithms have the 
side-effect of scrambling the ordering of output data. 
For radix-2 FFTs, the output is in "bit-reversed" order: 
for example, the value for frequency one is NOT at 
location one in the output array, but at location N/2. 
Time to unscramble the output is often NOT included 
in FFT benchmarking, because scrambled output is fine 
for some signal-processing uses such as convolution. In 
any event, unscrambling consists of swapping the 
locations of pairs of output values. Alternatively, input 
values can be shuffled, as Decimation in Time usually 
does before the first stage (as shown in Figure 2). 
Otherwise, to avoid the shuffling of input in DIT, the 
twiddles must be accessed in bit-reversed order. As an 
example of bit-reversal, for 256 points the reordering 
involves: 


SWAP X(i) and X(j), where i = 'klmnopqr'b and j = 
'rqponmlk'b. The second index (j) contains the same 
bits as (i), but in opposite order. 
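
A minimal Python sketch of the swap rule just described follows. It is an illustration under the assumption that the array length is a power of two; the names `bit_reverse_index` and `bit_reverse_in_place` are hypothetical and are not the bitrev.ss routine from the listings.

```python
def bit_reverse_index(i, bits):
    """Return the index whose 'bits'-wide binary form is i reversed."""
    j = 0
    for _ in range(bits):
        j = (j << 1) | (i & 1)   # shift the lowest bit of i into j
        i >>= 1
    return j

def bit_reverse_in_place(x):
    """Swap X(i) and X(j) for every bit-reversed pair, once per pair."""
    n_pts = len(x)
    bits = n_pts.bit_length() - 1
    for i in range(n_pts):
        j = bit_reverse_index(i, bits)
        if i < j:                # avoid swapping a pair back again
            x[i], x[j] = x[j], x[i]
    return x
```

For 256 points, `bit_reverse_index(1, 8)` gives 128, which matches the statement that frequency one lands at location N/2.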

4.0 FFT IMPLEMENTATION ON THE 
i860 CPU 


Several features of the i860 CPU contribute to FFT 
performance. The floating-point multiplier and adder 
can simultaneously produce 1 product and 1 sum per 
cycle, using Dual-Operation FP instructions. To fetch 
the butterfly inputs and store outputs, 
Dual-Instruction-Mode allows a memory fetch or store 
simultaneous with the multiply and add. Four 
floating-point numbers can be stored by one instruction, 
using the 16-byte-operand "fst.q" instruction. Likewise, 
16 bytes can be fetched from the data cache in one 
fld.q op. 


The floating-point arithmetic of the i860 CPU conforms 
to IEEE 754 format, which some DSPs fail to do. 
Shown below is code for the crucial inner loop of the 
FFT: 


// inner_loop: do 2 Decimation-In-Frequency FFT butterflies. 
// Twelve clocks for 2 butterflies - 12 FP add/subs, 8 multiplies, 
// 6 8-byte loads, 4 8-byte stores. 
// 
inner_loop: 
// FP op                            Core-op 
d.r2pt.ss   WR,DI,BnewR             pfld.d wind(wstart),WRo 
d.pfsub.ss  AR,BR,AnewRo            fld.d  8(fetch)++,ARo 
d.rat1s2.ss AI,BI,AnewIo            fld.d  offset(fetch),BRo 
d.i2st.ss   WI,DR,BnewI             fst.q  AnewR,16(store)++ 
d.rat1p2.ss AR,BR,DR                adds   wincr,wind,wind 
d.ia1p2.ss  AI,BI,DI                pfld.d wind(wstart),WR 

d.r2pt.ss   WRo,DI,BnewRo           adds   wincr,wind,wind 
d.pfsub.ss  ARo,BRo,AnewR           fld.d  8(fetch)++,AR 
d.rat1s2.ss AIo,BIo,AnewI           fld.d  offset(fetch),BR 
d.i2st.ss   WIo,DR,BnewIo           fst.q  BnewR,offset(store) 
d.rat1p2.ss ARo,BRo,DR              bla    decrem,count,inner_loop 
d.ia1p2.ss  AIo,BIo,DI              and    wlimit,wind,wind //modulo 
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5.0 CODE DESIGN 


Refer to the inner_loop above and code listings at the 
end of this application note for the discussions that 
follow. Refer to the "i860™ 64-bit Microprocessor 
Programmer's Reference Manual" (Intel order number 
240329) for details on instructions and formats. 


The programs include both assembly and Fortran com- 
ponents. Input data can number any power of 2 from 
16 to 1024 points. The algorithms are radix-2, floating- 
point, in-place. Included in the listing are both Decima- 
tion-in-Time and Frequency, and both complex-input 
and real-input FFTs. 


5.1 Cache Utilization 


Because the instruction cache contains 4 Kbytes, all 
required code easily fits in cache. However, a 1024-point 
complex FFT fills the 8-Kbyte data cache with the 
input X() array. Thus the more rarely-used twiddle W() 
array is intentionally kept out of cache, as described in 
the "pfld" section. 


A subroutine ("fetch.ss") is used to move the input data 
array efficiently into cache for the 1024-point FFT. 
"Fetch" allows all data to be brought into cache using 
the next-near (NENE#) accesses to DRAM. Without 
that routine, getting A and B from locations separated 
by 4 Kbytes (NOT the same DRAM page) makes 
fetches and writebacks from DRAM for the first stage 
slower, and adds 30% to overall execution time. 


For larger FFTs (2048 points = 16 kB), straightforward 
expansion of the present algorithm would cause 
increased cache misses. Thus a larger FFT should be 
broken into multiple FFTs of 1024 points so that all 10 
stages of each can achieve high cache hits. The 
algorithm becomes (assuming 2048 points, 
Decimation-In-Time): 

1) Bit-reverse the entire input array. 

2) Do a 10-stage FFT on the second set of 1024 points. 
   Cache hits should be high on those, since they were 
   most recently accessed by the bit-reversal. 

3) Do a 10-stage FFT on the first 1024 points. Prefetch 
   before the first stage to ensure cache hits. 

4) Combine the 2 separate 1024-point results with a 
   final stage of butterflies, where A is offset from B by 
   8 Kbytes. 
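
The four steps above can be sketched in Python as follows. This is only a model of the data flow, under the assumption that twiddles are computed on the fly for brevity (the application note streams a precomputed table instead). The names `bit_reverse_copy`, `dit_stages`, and `split_fft` are hypothetical; the sketch works for any power-of-two length, not just 2048.

```python
import cmath
import math

def bit_reverse_copy(x):
    """Step 1: return a copy of x in bit-reversed order."""
    n_pts = len(x)
    bits = n_pts.bit_length() - 1
    out = [0j] * n_pts
    for i in range(n_pts):
        j, t = 0, i
        for _ in range(bits):
            j = (j << 1) | (t & 1)
            t >>= 1
        out[j] = x[i]
    return out

def dit_stages(data, start, length):
    """In-place radix-2 DIT stages on data[start:start+length].

    Assumes that slice already holds its samples in bit-reversed order.
    """
    half = 1
    while half < length:
        for group in range(start, start + length, 2 * half):
            for k in range(half):
                w = cmath.exp(-2j * math.pi * k / (2 * half))
                a = data[group + k]
                t = data[group + k + half] * w
                data[group + k] = a + t
                data[group + k + half] = a - t
        half *= 2

def split_fft(x):
    """FFT of x done as two half-size FFTs plus one combining stage."""
    n_pts = len(x)
    half = n_pts // 2
    data = bit_reverse_copy(x)       # step 1
    dit_stages(data, half, half)     # step 2: second half (odd samples)
    dit_stages(data, 0, half)        # step 3: first half (even samples)
    for k in range(half):            # step 4: final stage, A and B half apart
        w = cmath.exp(-2j * math.pi * k / n_pts)
        a = data[k]
        t = data[k + half] * w
        data[k] = a + t
        data[k + half] = a - t
    return data
```

After the full bit-reversal, the first half of the array holds the even-indexed samples and the second half the odd-indexed samples, which is why each half can be transformed independently before the combining stage.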

5.2 Pfld 


Twiddle factors (W) are fetched with pfld (Pipelined 
Floating-Point Load), to avoid caching them. Only in 
the first stage are all the W() elements used; successive 
stages use fewer and fewer elements, which are separat- 
ed by larger and larger strides. Thus placing W() in 
cache would be inefficient. The streaming of W() from 
main memory actually yields better performance than 
caching W(), for 512 and 1024 points. With the i860 
CPU’s 8-byte external data bus, a complex W() value 
can be transferred in a single bus cycle. Some FFT rou- 
tines calculate W() on the fly, rather than fetching pre- 
calculated values; however, performance decreases due 
to the added run-time calculations. 
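
To illustrate why later stages touch fewer W() elements at larger strides, the Python sketch below builds a precomputed twiddle table and lists the table indices a Decimation-in-Frequency stage would use. The stage parameters mirror the `groups`, `offset`, and `wincr` variables of the diff.f listing; the function names are assumptions made for this example.

```python
import cmath
import math

def twiddle_table(n_pts):
    """W(k) = (cos, -sin) of 2*pi*k/N for k = 0 .. N/2 - 1."""
    return [cmath.exp(-2j * math.pi * k / n_pts) for k in range(n_pts // 2)]

def twiddle_indices(n_pts, stage):
    """Indices into W() used by DIF stage 'stage' (1-based).

    groups = 2**(stage-1) sub-DFTs, each doing 'offset' butterflies,
    stepping through the table with stride wincr = groups.
    """
    m = n_pts.bit_length() - 1
    groups = 1 << (stage - 1)
    offset = 1 << (m - stage)
    wincr = groups
    return [k * wincr for k in range(offset)]
```

For an 8-point transform, stage 1 uses indices 0, 1, 2, 3 (the whole table), stage 2 uses 0 and 2, and stage 3 uses only index 0.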


5.3 Fst.q 


Quad-word (16-byte) stores allow 4 floating-point regis- 
ter values to update the cache in one cycle. Likewise, 
fld.q (Quad Floating Point Load) transfers 4 values to 
the registers in a cycle. However, in some FFT stages, 
double-word fetches (fld.d) are used instead of fld.q; 
that allows the "background" fetch of a set of operands 
concurrent with arithmetic on the other set. For the 
same reason, the inner loop does 2 butterflies, rather 
than one. 


5.4 Bit Reversal Code 


The code for bit-reversal fetches the indices of 2 
elements to be swapped from a pre-allocated array of 
indices, and swaps the data elements. Again, pfld.d keeps 
the indices out of cache, for the 1024 point case. That 
assembly version of bit-reversal is approximately 7 
times faster than the standard Fortran routine. The 
array of indices was generated by printing out the values 
generated during operation of the standard Fortran 
version; similarly, the twiddle W() values can be 
pre-allocated and generated using a high-level-language 
program. 


6.0 PIPELINE SCHEDULING 


The adder pipeline is 3 stages, as is the multiplier; for 
the calculation of 


BnewR = (AR - BR) * WR - (AI - BI) * WI 


the adder result is fed back into the multiplier, and the 
product again feeds into the adder. The adder and mul- 
tiplier pipes each advance one stage for each floating- 
point instruction issued. 




The butterfly decomposes into 6 real add/subtracts and 
4 real multiplies. Thus the best possible performance 
would be 6 clocks per butterfly, with the multiplies to- 
tally overlapping the adds. The overlap is accomplished 
with the Dual-Operation instructions: 


r2pt   (KR * src2, Treg + Mout, load KR <- src1) 
rat1s2 (KR * Aout, src1 - src2, load T <- Mout) 
i2st   (KI * src2, Treg - Mout, load KI <- src1) 
rat1p2 (KR * Aout, src1 + src2, load T <- Mout) 
ia1p2  (KI * Aout, src1 + src2, load KI <- src1) 


KR, KI, and T are operand registers feeding the 
multiplier and adder, separate from the floating-point 
register file. They permit the 4 inputs for multiply and 
add, even though the instruction format holds only 2 
registers. "Aout" and "Mout" are adder and multiplier 
outputs. 


The data path arrangements of some of these ops are 
illustrated in Figures 3 and 4. Fetching and storing of 
butterfly operands is overlapped with the calculations, 
using Dual Instruction Mode: the integer core op 
(such as a load or branch) and FP op are fetched 
simultaneously from the instruction cache and executed 
simultaneously. 


Scheduling of instructions was done with a pipeline 
diagram, as illustrated in the comments of the code 
listing of difstep.ss in the Appendix. (The comments show 
the machine state after the instruction is processed.) 
Begin by placing the desired results in the rightmost 
column, then tracing progress backwards through the 
adder. When adder inputs are products (of the 
multiplier), one product is kept in the Treg for a cycle 
while the other propagates through the multiplier final 
stage. Those products can be traced back on the 
multiplier pipeline, to determine at what instruction the 
multiplier inputs must be provided. 


Figure 3. Datapath for r2pt op (diagram of the src1, src2, and rdest connections to the multiplier and adder units, for r2pt and r2st) 


For example, place the BnewR label in the "Write" 
stage of the pipe (the output of the Adder). Now 


BnewR = WR * DR - WI * DI 


Three instructions earlier, the adder inputs for BnewR 
must be fed to the adder; those inputs are products, one 
of which comes directly from the multiplier output, and 
the other from the Treg. The multiplier output and 
Treg value must then be traced back through multiplier 
stages, requiring the following instructions: 


i2st.ss   WIo,DR,BnewIo   as the 10th op of 12, to start (T - Mout) 
rat1s2.ss AIo,BIo,AnewI   as the 9th instruction, to update the Treg 
ia1p2.ss  AI,BI,DI        as the 6th op, to multiply DI * WI 
rat1p2.ss AR,BR,DR        as the 5th op, to multiply DR * WR 
rat1s2.ss AI,BI,AnewIo    as the 3rd, to start DI into the adder 
pfsub.ss  AR,BR,AnewRo    as the 2nd, to start DR into the adder 


Figure 4. Datapath for rat1p2 op (diagram of the src2 and rdest connections to the multiplier and adder units, for rat1p2 and rat1s2) 
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Some trial-and-error ordering of the desired outputs is 
needed to devise a sequence which keeps the adder 
pipeline full. An op is chosen for each slot for its ability 
to load the KR or KI register, or to initiate an adder 
operation simultaneous with the multiplies required to 
calculate BnewR and BnewI. 


Handy hints to assist dual-operation scheduling 
include: 


1) Feed back the adder result to the multiplier, or vice 
   versa, whenever possible. For example, the rat1p2 
   op feeds adder-out to multiplier. Thus both src1 and 
   src2 fields of the instruction are available to feed the 
   adder-in, and a simultaneous useful add and multiply 
   are initiated. 


2) Freeze one of the pipes, by using a pfadd or pfmul, 
   when appropriate. In the butterfly, where 6 adds are 
   done for every 4 multiplies, freezing of the multiplier 
   does not degrade performance. The freeze allows 
   multiplier results to be held until needed in the 
   adder. 


3) The Treg can hold a multiplier result for several 
   cycles until needed in the adder. 


4) Unroll a loop to do 2 iterations per loop. That 
   provides time to fetch inputs for iteration 2 while 
   calculating iteration 1, and store results of iteration 1 
   (and fetch more inputs) while calculating iteration 2. 


7.0 PERFORMANCE MEASUREMENTS 


The code was run on an evaluation card with DRAM 
memory only, no external cache, 33.33 MHz clock, and 
5 wait-states or more for some accesses. Next-near ac- 
cesses (address falls into the same DRAM page as the 
previous access) are zero wait-state, but far accesses 
take 5 or more wait-states. The code was run under a 
virtual-memory multitasking executive. Shown below 
are measured results: 


System: 33.3 MHz 80860 with a single bank of 
static-column DRAM 


Algorithm: Radix-2 FFT, in-place. Data is IEEE 754 
single-precision floating point. Implemented in assem- 
bly-language and Fortran code. 

Type of FFT                     Time (including bit-reversal) 

1024-point-complex, DIF 
1024-point-real 
512-point-complex, DIF 
512-point-real 
256-point-complex, DIF 
1024-point-complex, DIT 
512-point-complex, DIT 

7.1 Cache Fill and Writeback Time 


Measured times do not include cache-fill and 
writeback. That is, the timings measured 200,000 
executions of the FFT using the same input array. 
(Performance figures offered by other manufacturers for 
DSP chips likewise assume that the data is already in 
on-chip RAM. Of course, the i860 CPU will do that 
fetching automatically into its data cache.) The 
additional time for cache fill and writeback was measured 
as: 


1024-point-complex    0.25 ms (8 Kbytes fetched, 8 Kbytes writeback) 
512-point-complex     0.12 ms (4 Kbytes) 


To quantify the calculations in MFlops (Millions of 
FLoating-point OPerations per Second), consider that 
the 1024-point complex FFT is implemented with 
about 16,400 multiplies and 28,700 adds/subtracts. 
Thus the 1.17 ms translates to a sustained 38.5 MFlops 
rate. For 512 points, the required 20,000 Flops means 
41.6 MFlops. 
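
The MFlops figures can be reproduced with a few lines of arithmetic. The Python sketch below uses only the operation counts and rates quoted in the text. The 512-point time it derives is implied by those figures and is not an independently measured value; the helper name `mflops` is an assumption.

```python
def mflops(flop_count, seconds):
    """Millions of floating-point operations per second."""
    return flop_count / seconds / 1.0e6

# 1024-point complex FFT: about 16,400 multiplies + 28,700 adds/subtracts
flops_1024 = 16400 + 28700
rate_1024 = mflops(flops_1024, 1.17e-3)

# 512 points: 20,000 Flops at the quoted 41.6 MFlops implies this time
time_512_ms = 20000 / 41.6e6 * 1.0e3
```

The first calculation gives roughly 38.5 MFlops, in agreement with the text; the second shows that the quoted 512-point rate corresponds to a little under half a millisecond.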


The overall FFT is about 10 times faster than the 
equivalent Fortran. Inner loop performance was measured 
at 13 cycles for the 24 instructions, which is 6.5 cycles 
per butterfly. 
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8.0 CODE HIERARCHY 
Pictured below are the programs developed for the i860 CPU FFT: 


ffttest.f 


bitrev.ss 


fetch 


The Fortran program ffttest.f is the highest-level pro- 
gram of those listed on the following pages. It calls two 
FFT subroutines, diff.f and fft.f, then compares their 
outputs. Fft.f is a Fortran decimation-in-time algo- 
rithm, while diff.f is the high-speed DIF routine. Diff.f 
is callable by C or Fortran applications. It in turn calls 
difstep, which is implemented in assembly code 
(difstep.ss). Difstep is called once per stage of the FFT. 


A Fortran version (difstepf.f) is shown, for comparison. 


Other assembly routines are the bit-reversal-data-move- 
ment (bitrev.ss) and prefetch (“fetch” inside bitrev.ss). 


Difstep.ss contains approximately 225 assembly in- 
structions, and bitrev.ss contains about 24. The Fortran 
diff.f compiles to about 80 instructions. 


A Decimation-in-Time version of diff.f and difstep.ss 
can be found in ditt.f and ditstep.ss. The DIT version 
performs 5-10% slower than the Decimation-in-Fre- 
quency because the DIT loop takes 7 cycles per butter- 
fly, while DIF takes 6. 


A real-input algorithm is dirr.f, which can be called 
and tested using program real.f. Dirr.f calls difstep to 
do a complex DIF FFT on N real data points, but 
treats them as N/2 complex points. Then realfix.ss is 
called by dirr.f to fix the DIF output, compensating for 
the treatment of the N real points as N/2 complex. The 
derivation of the real-fix can be found in reference 3, 
Numerical Recipes in C. 


The mixture of Fortran, C, and assembly code is ac- 
complished by passing function inputs and outputs in 
registers. Only pointers and integer values were used in 
the above code, but floating point parameters can also 
be exchanged. A calling program feeds arguments to a 
function in r16, r17, and higher-numbered integer reg- 
isters. The callee is permitted to destroy the contents of 
those registers, but r1:r15 must be preserved. For more 
details on parameter-passing conventions see the i860 
64-bit Microprocessor Programmer’s Reference Manual, 
Chapter 8. 


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9.0 CONCLUSION 


The i860 CPU computes very Fast Fourier Transforms, 
quicker than most high-end dedicated DSP chips. Con- 
tributing to the FFT performance are the 8-kByte on- 
chip data cache and 4-kByte instruction cache. Also the 
8-byte external data bus, pfld instruction, and 16-byte 
data cache width provide sufficient bandwidth to keep 
the arithmetic units busy. Dual-Operation instructions 
and Dual-Instruction-Mode allow parallel data move- 
ment and calculations. The 33.3 MHz clock rate allows 
both an add and a multiply every 30 ns, giving a time of 
1.17 ms for a 1024-point complex FFT. A 40 MHz i860 
Microprocessor will yield a time of less than 1 ms. 
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APPENDIX A 
PROGRAM LISTINGS 


1) diff.f: 
   Fortran module to do fast Decimation-In-Frequency (DIF) Radix-2 FFT. 
2) difstep.ss: 
   Assembly code which does all DIF FFT butterflies; called by diff.f. 
3) difstepf.f: 
   Fortran equivalent of difstep.ss. Included here for clarity. 
4) bitrev.ss: 
   Assembly code to do bit-reversal. 
5) ffttest.f: 
   Highest-level Fortran code. Tests diff.f or ditt.f. 
6) ditt.f: 
   Fortran module to do fast Decimation-In-Time (DIT) Radix-2 FFT. 
7) ditstep.ss: 
   Assembly code which does all DIT FFT butterflies; called by ditt.f. 
8) dirr.f: 
   Fortran module for Real-Input Decimation-In-Frequency (DIF) Radix-2 FFT. 
9) realfix.ss: 
   Assembly code required by dirr.f to compensate for Real-Input. 
10) real.f: 
   Highest-level Fortran code, for Real-value input. Tests dirr.f. 
11) fft.f: 
   Fortran FFT algorithm. Generates "correct" answers for comparison against the other code. 
12) makefile: 
   Unix V/386 version of a makefile to maintain the FFT code, using the Unix "make" 
   program-maintenance utility. Note that this makefile uses the Unix macro preprocessor 
   "m4" to convert symbolic names to register numbers. 
13) start.ss: 
   Assembly code preamble for Fortran runtime. 
14) time.c: 
   Dummy routine, used to install breakpoints. 

C 
C File: diff.f 
C FFT - Decimation in Freq, radix-2, inplace, 1-dimen 
C 
C Intel assumes no responsibility for use or misuse of this code. 
C 
C 5/19/89: call fetch8() added for 1024-point caching. 
C 6/01/89: fetch() CRUCIAL - 30% performance loss if removed 
C 
C Inputs: 
C   A = complex array of input, up to 1024 pts, single-prec float 
C   M = log of number of pts 
C     = (number of stages of FFT) 
C   N = number of points. ie, N = 2**M = number of pts 
C   W = complex array of twiddle factors, length N/2. 
C   REV = 0 if bitreversed output ok. 1 = must re-order output 
C 
C Outputs: 
C   A = complex fft of input A 
C 
      Subroutine diff(a,m,N,W,REV) 
      integer m,N,i,j,k,REV,wlimit 
      integer offset,stage,groups,wincr,powers2(0:10) 
      complex a(n),w(N/2),temp 

      data powers2 /1,2,4,8,16,32,64,128,256,512,1024/ 
C Powers2 to avoid calls to POW, DIV. 

C Twiddle factor array w(k) has (cos,-sin) of 2pi*k/N 
C Assume the caller provides w(k) constants ALREADY initialized. 

C Pre-touch data, lock into cache, for 8kByte fft: 
      IF (N .gt. 513) THEN 
        call fetch(a,%VAL(n)) 
      ENDIF 

      wlimit = 8*((N/2) - 1) 

C "DO 20" stage-loop 
      DO 20 stage = 1,m 
        groups = powers2(stage-1) 
C groups=number of times the twiddle factors are used, ie, the number of 
C smaller DFTs the stage is split into. 

C offset gets N/2,N/4,N/8,N/16,... 
        offset = powers2(m-stage) 
        wincr = groups 
        call difstep(a,w,groups,offset,wincr,wlimit) 
   20 CONTINUE 

      IF (REV .ne. 0) THEN 
C REV .ne. 0 means must do bit-reversal reordering of output 
        call bitrev(a,%VAL(M),n) 
      ENDIF 

      RETURN 
      END 
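
The stage loop of diff.f can be modelled in a few lines of Python to check the `groups`, `offset`, and `wincr` bookkeeping. This is a hedged sketch, not a translation of the assembly: it works in element units rather than bytes, omits `wlimit` (which the assembly needs only because it prefetches twiddles ahead with modulo addressing), and the Python `difstep`, `bitrev`, and `diff` here are stand-ins written for illustration.

```python
import cmath
import math

def difstep(a, w, groups, offset, wincr):
    """One DIF stage: 'groups' sub-DFTs of 'offset' butterflies each."""
    for g in range(groups):
        base = g * 2 * offset
        for k in range(offset):
            i = base + k
            j = i + offset              # B is 'offset' elements above A
            d = a[i] - a[j]
            a[i] = a[i] + a[j]          # Anew = A + B
            a[j] = d * w[k * wincr]     # Bnew = (A - B) * W

def bitrev(a, m):
    """Reorder bit-reversed output into natural order, in place."""
    for i in range(1 << m):
        j, t = 0, i
        for _ in range(m):
            j = (j << 1) | (t & 1)
            t >>= 1
        if i < j:
            a[i], a[j] = a[j], a[i]

def diff(a, m, w, rev):
    """In-place radix-2 DIF FFT of 2**m points, modelled on diff.f."""
    for stage in range(1, m + 1):
        groups = 1 << (stage - 1)       # powers2(stage-1)
        offset = 1 << (m - stage)       # N/2, N/4, N/8, ...
        wincr = groups
        difstep(a, w, groups, offset, wincr)
    if rev:
        bitrev(a, m)
    return a
```

A caller would supply `w` as the table of `cmath.exp(-2j * math.pi * k / n)` for k from 0 to n/2 - 1, matching the "(cos,-sin) of 2pi*k/N" convention stated in the listing.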

// difstep.ss: do one stage of fft butterflies 
// DIF = Decimation in Frequency, radix-2, inplace, 1-dimension 
// (C) Copyright 1989 INTEL Corporation. 
// Inner loop developed with assistance from Tricord Systems, Inc. 

// 5/18/89: 1 pm - offset_2 added, as next-to-last stage was slow 
// 5/19/89: 4 pm - fetch8() routine added, for cache miss avoidance. 
// 5/31/89: am - use fst.q (13% perf improvement of inner_loop!) 
//          last_bfly added, for performance. 
// 6/02/89: am - bptr deleted. Modulo-address W (5% perf improved) 

// Do one entire stage (n/2 butterflies). Sample invocation: 
//   call difstep(a,w,groups,offset,wincr,wlimit) 

// Inputs: 
//   A = complex array of input, single-prec float 
//       (complex stored as 4byte real, 4byte imag contiguously) 
//   W = pointer to array of twiddle factors. Assuming W(k) is 
//       CMPLX(cos(2pi*k/N),-sin(2pi*k/N)) for k=0 to (N/2)-1 
//   offset = distance (except for scale-by-8byte sizeof(complex)) between 
//       the 2 input values for each butterfly. 
//       Offset also is the number of butterflies done per "group". 
//   groups = N/(2*offset). The number of sub-DFTs this stage is split into. 
//   wincr = distance (except for scale-by-8byte sizeof(complex)) between 
//       successive w values for successive butterflies 
//   wlimit = max index, in bytes, of W table. 
// 
// Outputs: 
//   A = complex array, butterflied version of input. 

define(astart,r16) //input data base address 
define(wstart,r17) //twiddle array ptr. Because w-contents depend on N, 
                   // we will assume the caller has initialized w() array. 
define(groups,r18) //groups=number of sub-DFTs this stage is split into. 
define(offset,r19) //offset (initially elements, mult by 8 to get bytes) 
                   // between node and its dual (the 2 numbers to butterfly, ie. A and B) 
define(wincr,r20)  //increment between successive W values. Remains constant 
                   // within a given stage. For Decimation in Freq, wincr addressing is: 
                   // +8 for offset=N/2 (W0,W1,W2,W3,...W(n-1)) 
                   // +16 offset=N/4 (W0,W2,W4,...) etc... 
define(wlimit,r21) //max index, in bytes, of W table. 
define(wind,r22)   //current index, in bytes, of W table. 
define(offset2,r23) //offset*2 

define(decrem,r24) //bla decrement 
define(somecount,r25) // bla counter 

define(FEtch,r26) //pointer to 1st component of butterfly (load) 
define(STore,r27) // " " 1st component of butterfly (store) 

// f4:f7 spare 
define(AR,f12)  //element A, real component 
define(AI,f13)  //" ", imag 
define(ARo,f14) // extra A value, for prefetch (o="odd") 
define(AIo,f15) 
define(BR,f16)  //element B, real component 
define(BI,f17) 
define(BRo,f18) // extra B value, for prefetch 
define(BIo,f19) 

define(ER,f20)  // real (ER = AR + BR) 
define(EI,f21)  // imag " 
define(ERo,f22) // real, previous loop's value 
define(EIo,f23) // imag " 

define(FR,f24)  //W*(A-B), real 
define(FI,f25)  // imag " 
define(FRo,f26) 
define(FIo,f27) 

define(DR,f28)  //Difference of A-B, real part 
define(DI,f29)  //" ", imag 
define(WR,f30)  //W (twiddle factor), real part 
define(WI,f31)  //" ", imag 
define(WRo,f10) //W (twiddle factor), real part (EXTRA copy) 
define(WIo,f11) //" ", imag 

.text 
.align .quad 
_difstep_:: 
ld.l 0(groups),groups //fix Fortran call by reference 
ld.l 0(offset),offset // 
shl 3,offset,offset // scale from elements to bytes 
shl 1,offset,offset2 

fst.q f8,-16(sp)++ //save "local" regs 
fst.q f12,-16(sp)++ 

adds -1,groups,groups // predecrement for bnc usage, or bla usage 
adds -16,r0,decrem //bla decrement 
// We code the last 2 stages as special cases: 
//------- 
xor 8,offset,r0 //offset=1, special case, no complex mult, funny addressing 
bc offset_1 // (ASSUMING offset=1 means wincr=0, and no twiddle used) 

xor 16,offset,r0 //offset=2, special case, no complex mult, funny addressing 
bc offset_2 // (ASSUMING offset=2 means wincr=N/4) 
//------- 
ld.l 0(wincr),wincr 
ld.l 0(wlimit),wlimit 

pfadd.ss f0,f0,f0 
pfadd.ss f0,f0,f0 
pfadd.ss f0,f0,f0 // init A1,A2,A3=0 
pfmul.ss f0,f0,f0 
pfmul.ss f0,f0,f0 
pfmul.ss f0,f0,f0 

//------- 
// init pointers: 
shl 3,wincr,wincr //scale for bytes. 
shl 1,wincr,wind //init wind = 2*wincr 

pfld.d 0(wstart),f0 
pfld.d wincr(wstart),f0 

adds -8,astart,FEtch 
pfld.d wind(wstart),f0 
adds wincr,wind,wind //wind now 3*wincr 

// here fetch first set of A,B,W before bla-loop 
pfld.d wind(wstart),WR 
adds wincr,wind,wind 
and wlimit,wind,wind //modulo-wlimit the w index 
// We do modulo-addressing on W(), to keep the pfld pipeline full. We 
// never do a W-fetch beyond the end of the table. 
// And the modulo-check needs to be done only every 4th pfld, as always 
// we use a multiple of 4 W() factors. 

fld.d 8(FEtch)++,AR 
fld.d offset(FEtch),BR 
d.r2ap1.ss f0,f0,f0 //clear Treg. 
adds -32,offset,somecount // bla counter (predecrement by 4 elements) 
// Definitions for pipe diagram: 
// (the complex multiply product, F, broken into 4 real mult and 2 adds): 
// WR = cos(), WI = -sin(). 
// DR = AR - BR; (difference of Real components of A,B) 
// DI = AI - BI; (difference of Imag components) 
// ER = AR + BR; EI = AI + BI; 
// FR = K - L; where K = WR*DR, L = WI*DI 
// FI = N + M; where M = WI*DR, N = WR*DI 

// For 1st time thru inner_loop, don't have correct values to store. 
// Must do 1 loop before the loop, sans the stores. 


first_bfly:: //fill pipe 
// KR...KI...M1...M2...M3 T A1...A2...A3...Write 
d.r2pt.ss WR,f0,f0 // WR0 - 
pfld.d wind(wstart),WRo 
d.pfsub.ss AR,BR,f0 // - - - - DR0 - - 
fld.d 8(FEtch)++,ARo 
d.rat1s2.ss AI,BI,f0 // - - - - DI0 DR0 - - 
fld.d offset(FEtch),BRo 
d.i2st.ss WI,f0,f0 // WI0 - - - - - DI0 DR0 - 
adds wincr,wind,wind 
d.rat1p2.ss AR,BR,DR 
nop 
d.ia1p2.ss AI,BI,DI // L0 K0 - - EI0 ER0 - DI0 
pfld.d wind(wstart),WR 
d.r2pt.ss WRo,DI,f0 // WR1 - N0 L0 K0 - - EI0 ER0 - 
fld.d 8(FEtch)++,AR 
d.pfsub.ss ARo,BRo,ER // N0 L0 K0 - DR1 - EI0 ER0 
fld.d offset(FEtch),BR 
d.rat1s2.ss AIo,BIo,EI // - - N0 L0 K0 DI1 DR1 - EI0 
adds wincr,wind,wind 
d.i2st.ss WIo,DR,f0 // WI1 M0 - N0 K0 K-L DI1 DR1 - 
and wlimit,wind,wind 


// 
quickstart:: 
d.rat1p2.ss ARo,BRo,DR // K1 M0 - N0 ER1 FR0 DI1 DR1 
bla decrem,somecount,inner_loop //init LCC 
d.ia1p2.ss AIo,BIo,DI // L1 K1 M0 N0 EI1 ER1 FR0 DI1 
adds -16,astart,STore // ptrs init 16 low, for fst.q instructions 

//------------------ 
// Each butterfly = 1 complex multiply, 1 complex add, 1 complex subtract 
//                = 4 multiply, 
//                  3 add 
//                  3 subtract 
//                  3 8-byte fetches (A, B, W) 
//                  2 8-byte stores (A, B) 
//                  6 cycles per butterfly 

// inner_loop: iterates "offset/2" times (eg, N/4 for stage 1, N/8 for stage 2), 
// for each group. It does 2 butterflies per iteration 

inner_loop:: 
// KR...KI...M1...M2...M3 T A1...A2...A3...Write 
d.r2pt.ss WR,DI,FR // WR2 - N1 L1 K1 N0 N+M EI1 ER1 FR0 
pfld.d wind(wstart),WRo 
d.pfsub.ss AR,BR,ERo // N1 L1 K1 N0 DR2 FI0 EI1 ER1 
fld.d 8(FEtch)++,ARo 
d.rat1s2.ss AI,BI,EIo // - N1 L1 K1 DI2 DR2 FI0 EI1 
fld.d offset(FEtch),BRo 
d.i2st.ss WI,DR,FI // WI2 M1 - N1 K1 K-L DI2 DR2 FI0 
fst.q ER,16(STore)++ //update ER/EI/ERo/EIo 
d.rat1p2.ss AR,BR,DR // K2 M1 - N1 ER2 FR1 DI2 DR2 
adds wincr,wind,wind 
d.ia1p2.ss AI,BI,DI // L2 K2 M1 N1 EI2 ER2 FR1 DI2 
//no need for modulo-check ("and") here, as odd num of W's have been fetched. 
pfld.d wind(wstart),WR 

//------------------ 
// KR...KI...M1...M2...M3 T A1...A2...A3...Write 
d.r2pt.ss WRo,DI,FRo // WR3 - N2 L2 K2 N1 N+M EI2 ER2 FR1 
adds wincr,wind,wind 
d.pfsub.ss ARo,BRo,ER // N2 L2 K2 N1 DR3 FI1 EI2 ER2 
fld.d 8(FEtch)++,AR 
d.rat1s2.ss AIo,BIo,EI // - N2 L2 K2 DI3 DR3 FI1 EI2 
fld.d offset(FEtch),BR 
d.i2st.ss WIo,DR,FIo // WI3 M2 - N2 K2 K-L DI3 DR3 FI1 
fst.q FR,offset(STore) //update FR/FI/FRo/FIo 
d.rat1p2.ss ARo,BRo,DR // K3 M2 - N2 ER3 FR2 DI3 DR3 
bla decrem,somecount,inner_loop 
d.ia1p2.ss AIo,BIo,DI // L3 K3 M2 N2 EI3 ER3 FR2 DI3 
and wlimit,wind,wind //modulo. 


end_inner_loop:: //KEEP Pipelines full 

// RE-init pointers for fetches 
d.fiadd.ss f0,f0,f0 

adds offset2,astart,astart //bump to next group 

//redo A,B fetches, with proper ptr. 

d.fiadd.ss f0,f0,f0 

fld.d 0(astart),AR //get first AR/AI in next group
d.fiadd.ss f0,f0,f0

fld.d offset(astart),BR
d.fiadd.ss f0,f0,f0

adds 0,astart,FEtch


last_bfly:: //do final 2 butterflies, start next group
// KR...KI...M1...M2...M3 T A1...A2...A3...Write

d.r2pt.ss WR,DI,FR // WR4 - N3 L3 K3 N2 N+M EI3 ER3 FR2
pfld.d wind(wstart),WRo

d.pfsub.ss AR,BR,ERo // N3 L3 K3 N2 DR4 FI2 EI3 ER3
fld.d 8(FEtch)++,ARo

d.rat1s2.ss AI,BI,EIo // - N3 L3 K3 DI4 DR4 FI2 EI3
fld.d offset(FEtch),BRo

d.i2st.ss WI,DR,FI // WI4 M3 - N3 K3 K-L DI4 DR4 FI2
fst.q ER,16(STore)++

d.rat1p2.ss AR,BR,DR // K4 M3 - N3 ER4 FR3 DI4 DR4
adds wincr,wind,wind

d.ia1p2.ss AI,BI,DI // L4 K4 M3 N3 EI4 ER4 FR3 DI4
pfld.d wind(wstart),WR
//----------------------------------------------------------------
// KR...KI...M1...M2...M3 T A1...A2...A3...Write

d.r2pt.ss WRo,DI,FRo // WR5 - N4 L4 K4 N3 N+M EI4 ER4 FR3
fld.d 8(FEtch)++,AR
d.pfsub.ss ARo,BRo,ER // N4 L4 K4 N3 DR5 FI3 EI4 ER4
adds -32,offset,somecount // reset bla counter
d.rat1s2.ss AIo,BIo,EI // - N4 L4 K4 DI5 DR5 FI3 EI4
adds wincr,wind,wind
d.i2st.ss WIo,DR,FIo // WI5 M4 - N4 K4 K-L DI5 DR5 FI3
adds -1,groups,groups
d.fnop

fld.d offset(FEtch),BR
d.fnop

bnc.t quickstart //branch on value of groups
d.fnop

fst.q FR,offset(STore)
end_last_bfly::
d.fnop

br endit
fiadd.ss f0,f0,f0
fst.q FR,offset(STore) //repeated for bnc.t untaken case
.align .quad


offset_1::

// want FEtch=0,2,4,6,8,... elements. ASSUMING wincr=0,
// and that w=(1,0), so that no complex mult needed, and NO W will be fetched.
// E=A+B, F=A-B. (Per double-butterfly loop: 8 pfadd, 4 dword fld, 4 fst,
// 1 bla) (fld.q required, to reduce # flds to avoid pipe stalls)
// Performance = 4 cyc/bfly best case.

//Redefine regs for fld.q,fst.q usage, when A and B adjacent:
define(AR3,f12) //element A, real component
define(AI3,f13) // " ", imag
define(BR3,f14) //element B, real component
define(BI3,f15)
define(AR4,f16) // extra A value, for prefetch
define(AI4,f17)
define(BR4,f18) // extra B value, for prefetch
define(BI4,f19)

define(ER3,f20) //A+B, real (ER = AR + BR)
define(EI3,f21) // " imag "
define(FR3,f22) //(A-B), real
define(FI3,f23) // " imag "

define(ER4,f24) //A+B, real, extra copy
define(EI4,f25) // " imag
define(FR4,f26)
define(FI4,f27)

adds -16,astart,FEtch
fld.q 16(FEtch)++,AR4
adds -1,groups,somecount // bla counter (predecremented already by 1)
//using groups=blacount on the offset_1 loop, intentionally.
adds -16,FEtch,STore
//startup the loop:
// A1......A2......A3..... Write:
d.pfadd.ss AR4,BR4,f0 // ARn+BRn - - -
fld.q 16(FEtch)++,AR3
d.pfadd.ss AI4,BI4,f0 // AIn+BIn ERn - -
adds -2,r0,decrem //2 bflies per loop
d.pfsub.ss AR4,BR4,f0 // ARn-BRn EIn ERn -
bla decrem,somecount,offset1_loop //init LCC
d.pfsub.ss AI4,BI4,ER4 // AIn-BIn FRn EIn ERnext
nop

// A1......A2......A3..... Write:
offset1_loop::
d.pfadd.ss AR3,BR3,EI4 // AR+BR FI- FR EI-
nop

d.pfadd.ss AI3,BI3,FR4 // AI+BI ER FI- FR-
fld.q 16(FEtch)++,AR4

d.pfsub.ss AR3,BR3,FI4 // AR-BR EI ER FI-
fst.q ER4,16(STore)++

d.pfsub.ss AI3,BI3,ER3 // AI-BI FR EI ER
nop

d.pfadd.ss AR4,BR4,EI3 // AR2+BR2 FI FR EI
fld.q 16(FEtch)++,AR3

d.pfadd.ss AI4,BI4,FR3 // AI2+BI2 ER2 FI FR
nop

d.pfsub.ss AR4,BR4,FI3 // AR2-BR2 EI2 ER2 FI
bla decrem,somecount,offset1_loop

d.pfsub.ss AI4,BI4,ER4 // AI2-BI2 FR2 EI2 ERnext
fst.q ER3,16(STore)++
end_offset1_loop::
d.fiadd.ss f0,f0,f0

br endit
fiadd.ss f0,f0,f0
nop

.align .quad

offset_2::

// want FEtch=0,1; 4,5; 8,9; 12,13;... elements.
// ASSUMING wincr=N/4 (W addr=0,N/4,0,N/4,0,...). Trivial W() factors.
// USE bla loop, incrementing FEtch by 16 (2*offset).
// Even-indexed elements identical to offset_1, W=W0, no complex mult.
// So FReven=(AR-BR), FIeven=(AI-BI).
// Odd components have W=(0,-1). So FRodd=(AI-BI), FIodd=(BR-AR).
// Each fld.q fetches AReven,AIeven,ARodd,AIodd.

//Assume ER,EI,ERo,EIo are 4 contiguous regs.
//Assume FR,FI,FRo,FIo are 4 contiguous regs.

adds -16,astart,FEtch
fld.q 16(FEtch)++,AR
fld.q 16(FEtch)++,BR

adds 0,groups,somecount //bla counter
//startup the loop:
// A1......A2......A3..... Write:

pfadd.ss AR ,BR ,f0 // AR+BRe -
pfadd.ss AI ,BI ,f0 // AI+BIe ER -

d.pfadd.ss ARo,BRo,f0 // ARo+BRo EI ER -
nop

d.pfadd.ss AIo,BIo,ER // AIo+BIo ERo EI ER
nop

d.pfsub.ss AR ,BR ,EI // AR-BRe EIo ERo EI
adds -1,r0,decrem //2 bflies per loop, but groups is half desired value.
d.pfsub.ss AI ,BI ,ERo // AI-BIe FR EIo ERo
adds -16,astart,STore

d.pfsub.ss AIo,BIo,EIo // AIo-BIo FI FR EIo
bla decrem,somecount,offset2_loop //init LCC

d.pfsub.ss BRo,ARo,FR // BRo-ARo FRo FI FR
nop
offset2_loop::
fld.q 16(FEtch)++,AR //fetch AR,AI,ARo,AIo
fld.q 16(FEtch)++,BR //fetch BR,BI,BRo,BIo
// A1......A2......A3..... Write:
d.pfadd.ss AR ,BR ,FI // AR+BRe FIo
nop
d.pfadd.ss AI ,BI ,FRo // AI+BIe ER
nop
d.pfadd.ss ARo,BRo,FIo // ARo+BRo EI
fst.q ER ,16(STore)++ //update ER ,EI ,ERo,EIo
d.pfadd.ss AIo,BIo,ER // AIo+BIo ERo
nop
d.pfsub.ss AR ,BR ,EI // AR-BRe EIo
nop
d.pfsub.ss AI ,BI ,ERo // AI-BIe FR
fst.q FR ,16(STore)++
d.pfsub.ss AIo,BIo,EIo // AIo-BIo FI
bla decrem,somecount,offset2_loop
d.pfsub.ss BRo,ARo,FR // BRo-ARo FRo
nop


endit::

// restore regs
fiadd.ss f0,f0,f0 //exit DIM
fld.q 0(sp),f12
fiadd.ss f0,f0,f0 //last DIM pair
fld.q 16(sp),f8
adds 32,sp,sp
bri r1
nop

//----------------------------------------------------------------


c difstepf.f: do one stage of fft (DIF) butterflies
c (C) Copyright 1989 INTEL Corporation. ALL RIGHTS RESERVED.

c Decimation in Freq, radix-2, inplace, 1-dimen
c 6/20/89

c Do one entire stage (n/2 butterflies). Sample invocation:
c call difstep(a,w,groups,offset,wincr)

c Inputs:
c A= complex array of input, single-prec float
c (complex stored as 4byte real, 4byte imag contiguously)
c W= pointer to array of twiddle factors. Assuming W(k) is
c CMPLX(cos(2pi*k/N),-sin(2pi*k/N)) for k=0 to (N/2)-1.
c offset = distance (in "elements") between
c the 2 input values for each butterfly
c groups = number of sub-DFTs this stage is split into.
c (groups*offset*2 = N)
c wincr = distance between successive w values for successive butterflies
c

c Outputs:
c A= complex butterflied version of input.


SUBROUTINE difstep(a,w,groups,offset,wincr) 
integer groups,offset,wincr 
integer i,j,index1,iplus
complex a(groups*offset*2),w(groups*offset),wtemp,temp
c----------------------------------------------------------------
c We implement a...
c Special case for offset=1 (last stage): no complex multiplies, simple add
c (Performance enhancement)
IF (offset .eq. 1) THEN
CVD$ NODEPCHK
DO 8 i = 1,(2*groups),2
iplus = i+1
temp = a(iplus)
a(iplus) = a(i) - temp
8 a(i) = a(i) + temp
ELSE

c Special case for offset=2 (next-to-last stage): no complex multiplies,
c simple add. (Performance enhancement)
c For half the butterflies, W=(1,0). For the other half, W=(0,-1)
IF (offset .eq. 2) THEN
CVD$ NODEPCHK
DO 90 i = 1,(4*groups),4
iplus = i + 2
temp = a(iplus)
a(iplus) = a(i) - temp
90 a(i) = a(i) + temp

c 2nd call to i-loop: w=cmplx(0,-1.)
CVD$ NODEPCHK
CVD$ NOVECTOR
DO 92 i = 2,(4*groups),4
iplus = i + 2
temp = a(i) - a(iplus)
a(i) = a(i) + a(iplus)
92 a(iplus) = CMPLX(AIMAG(temp),-REAL(temp))
ELSE
c----------------------------------------------------------------
c "DO 20" index1-loop is "outer loop"
CVD$ VECTOR
CVD$ NODEPCHK
DO 20 index1 = 1,(2*offset*groups),(2*offset)
j = 1
CVD$ NODEPCHK
CVD$ ALTCODE
DO 10 i = index1,(index1+offset-1)
iplus = i + offset
temp = a(i) - a(iplus)
a(i) = a(i) + a(iplus)
a(iplus) = w(j) * temp
10 j = j + wincr
20 CONTINUE 
ENDIF 
ENDIF 
RETURN 
END 
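
The general-case loops of difstep can be sketched in Python as below. This is only an illustrative 0-based transcription, not the i860 code. The helper names `twiddles` and `dif_fft` are hypothetical, and the stage schedule inside `dif_fft` (offset halving from N/2 while groups and wincr double) is an assumption, because the calling routine diff.f is not part of this listing.

```python
import cmath

def difstep(a, w, groups, offset, wincr):
    """One radix-2 DIF stage, in place (0-based form of the DO 20 / DO 10 loops)."""
    for index1 in range(0, 2 * offset * groups, 2 * offset):
        j = 0  # twiddle index restarts at W(0) for every group
        for i in range(index1, index1 + offset):
            iplus = i + offset
            temp = a[i] - a[iplus]
            a[i] = a[i] + a[iplus]
            a[iplus] = w[j] * temp
            j += wincr

def twiddles(n):
    """W(k) = cos(2*pi*k/N) - i*sin(2*pi*k/N) for k = 0 .. N/2-1."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]

def dif_fft(x):
    """Hypothetical driver: run log2(N) stages; result is in bit-reversed order."""
    a = list(x)
    n = len(a)
    w = twiddles(n)
    groups, offset, wincr = 1, n // 2, 1
    while offset >= 1:
        difstep(a, w, groups, offset, wincr)
        groups, offset, wincr = groups * 2, offset // 2, wincr * 2
    return a
```

With this schedule the last two stages have offset=2 and offset=1, which are the cases the Fortran code handles without complex multiplies.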


cccccccccccccccccccccccccccccccccccccc
subroutine fetch(a,n)
integer n
complex a(n),temp
c Kludge do-nothing prefetch.
temp = a(1)
RETURN 
END 
cccccccccccccccccccccccccccccccccc
subroutine bitrev(a,dummy,n)
C Bit-Reverse
C Inputs:
C A= complex array of input, single-prec float
C dummy = %VAL(m). Probably unusable from Fortran.
C N = number of input points (and output points)
C
C Output:
C A = original A data, but in bit-reversed order from A

integer n,i,j,k,ndiv2
complex a(n),temp

C "DO 7" loop to in-place-bit-reverse-shuffle output
j=1
ndiv2 = n / 2
DO 7 i=1, n-1
IF (i .lt. j) THEN
temp = a(j)
a(j) = a(i)
a(i) = temp
ENDIF
k = ndiv2

C "While (j .gt. k)" /*decrease j by 2**something */
6 IF (j .gt. k) THEN
j = j-k
k=k/2
GOTO 6
ENDIF
C Add next lower power of 2 to j
7 j = j+k
RETURN 
END 
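
The "DO 7" shuffle above can be sketched in Python as follows. This is a 0-based rewrite for illustration, so the Fortran test "j .gt. k" becomes "k <= j"; the function name `bitrev_inplace` is hypothetical, and the extra `0 < k` guard is an addition that only matters for lengths that are not powers of two.

```python
def bitrev_inplace(a):
    """Swap a[i] with a[bitrev(i)] in place; len(a) is assumed to be a power of 2."""
    n = len(a)
    j = 0  # j tracks the bit-reversed value of i
    for i in range(n - 1):
        if i < j:
            a[i], a[j] = a[j], a[i]
        k = n // 2
        # Reversed-binary increment: clear leading 1 bits, then set the next 0 bit.
        while 0 < k <= j:
            j -= k
            k //= 2
        j += k
    return a
```

Each pair is swapped once, when the smaller index of the pair is reached, which is why the `i < j` test is needed.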
// bitrev.ss
// (C) Copyright 1989 INTEL Corporation. ALL RIGHTS RESERVED.

// BIT-reversal of 8byte array elements.
// IN PLACE.
// (Allows arrays of 8,16,32,64,128,256,512, or 1024 elements)

// Invocation: (from Fortran)
// call bitrev(a,%VAL(m))

// a= r16 = pointer to array of 8byte elements
// m= r17 (call by value)= base-2 log of total number of elements
// (2**m = N)

// Outputs:
// a= Bit-reversed ordered version of A
//

// Expected best-can-do performance, and measured performance=
// approx 4*N clocks (0.06 mSec for 512 points)

//----------------------------------------------------------------

define(astart,r16) //initial input data base address
define(m,r17)
define(logN,r17)
define(dest1,r19)
define(dest2,r20)
define(dest3,r21)
define(dest4,r22)
define(iptr,r23) //index-array pointer

define(decrem,r24) //bla decrement
define(count,r25) // bla counter

.text
.align .quad

//fetch base address for index table (rbasetab)
// base-addr-table elements = (baseaddr, number_of_swaps-2)
// base-addr-table indexed by logN.

shl 3,logN,r30 //scale to 8-byte-entry length
mov rbasetab,r29
ld.l r29(r30),iptr
addu 4,r29,r29
ld.l r29(r30),count //number of swaps required for this value N

pfld.d 0(iptr),f0 //initiate fetch of first 2 bit-rev indices
pfld.d 8(iptr)++,f0
adds -2,r0,decrem //2 swaps per loop
pfld.d 8(iptr)++,f0

bla decrem,count,revloop //init LCC
pfld.d 8(iptr)++,f16 //get 2 indices, but don't cache the indices
revloop:: //2 swaps per loop
//7.5 cycles consumed for each swap, best case.
pfld.d 8(iptr)++,f18 //2 more indices
fxfr f16,dest1 //transfer to integer index regs
fxfr f17,dest2
fld.d dest1(astart),f24 //fetch 2 elements to swap
fld.d dest2(astart),f26
fxfr f18,dest3
fst.d f24,dest2(astart)
fst.d f26,dest1(astart)
fxfr f19,dest4
fld.d dest3(astart),f28
fld.d dest4(astart),f30
pfld.d 8(iptr)++,f16 //2 more indices
fst.d f28,dest4(astart)
bla decrem,count,revloop //
fst.d f30,dest3(astart)

bri r1


// fetch8_: Touch all 32-byte lines in the 8k data space, to get them
// into dcache. (ASSUMING .lte. 8Kbytes .and. .gte. 4Kbytes)
//
// Invocation= fetch(astart,num8)
// Inputs=
// astart=r16=pointer to data which is to be touched.
// num8=r17 (passed by VALUE, %VAL(), not by reference)

// Using RC and RB to improve dcache hit rates, FOR FFTs bigger than
// 1024 complex. (8kB).
// RC=10 causes replacement only of block denoted by RB lsbit. RC=11 disables
// replacement.

//----------------------------------------------------------------

define(num8,r17)
define(FEtch,r26)

_fetch8_::
_fetch_::
ld.c dirbase,r30
or 0x800,r30,r30 // Replace Dcache slot 0 only (RC=10,RB=00)
st.c r30,dirbase
// Put 4Kbytes into Dcache slot 0. (The rest after 4kB goes to slot 1).

adds -4,r0,decrem //4 8-byte-groups per cache line
adds 508,r0,count //512, but pre-decremented for bla usage
bla decrem,count,floop
adds -32,astart,FEtch
floop::
bla decrem,count,floop
fld.d 32(FEtch)++,f30 //dummy load.

adds -512,num8,count
bc fdone //if data exhausted, quit

// ld.c dirbase,r30
or 0x900,r30,r30 // Replace Dcache slot 1 only (RC=10, RB=01)
st.c r30,dirbase
adds -8,count,count //predecr for bla
bla decrem,count,floop2 //set LCC
fld.d 32(FEtch)++,f30

floop2::
bla decrem,count,floop2
fld.d 32(FEtch)++,f30 //dummy load.
fdone::

// unlock dcache
andnot 0xF00,r30,r30 //clear RC,RB (dirbase(11:8))
st.c r30,dirbase

bri r1


// rbasetab:: (Table of bit-reversed indices for bitrev subroutine)
// base-addr-table elements = (baseaddr, number_of_swaps-2)
// base-addr-table indexed by logN.

.align .quad
rbasetab::
.long [6]0 //don't bother with log(n)=0,1,2
.long rev8, 0
.long rev16, 4
.long rev32, 10
.long rev64, 26
.long rev128, 54
.long rev256, 118
.long rev512, 238
.long rev1024, 494

//number of swaps=240 for N=512 (ie, 32 symmetrical patterns
// exist between 0 and 511.)

// rev512: array of bit-reversed indices, for N=512.
// Each entry is ("i", and "bit-reversed-i"), shifted left by 3
// to account for 8-byte-elements.

// NOTE: This listing DOES NOT SHOW all the table elements, to save paper. 
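
Because the listing shows only the first few table entries, a short Python sketch of how such a table could be regenerated may help. It assumes (consistent with the entries shown) that each table holds, in increasing order of i, every pair (i, bit-reversed i) with i smaller than its reversal, scaled by the 8-byte element size. The function names `bit_reverse` and `swap_table` are hypothetical and are not part of the original source.

```python
def bit_reverse(i, bits):
    """Reverse the low `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def swap_table(log_n, elem_bytes=8):
    """Byte-offset pairs (i, bitrev(i)) for every i < bitrev(i)."""
    n = 1 << log_n
    pairs = []
    for i in range(n):
        r = bit_reverse(i, log_n)
        if i < r:  # skip self-symmetric indices and pairs already listed
            pairs.append((i * elem_bytes, r * elem_bytes))
    return pairs

# Second column of rbasetab: number of swaps minus 2, for logN = 3..10.
swap_counts_minus_2 = [len(swap_table(m)) - 2 for m in range(3, 11)]
```

For N=512 the 32 self-symmetric indices need no swap, leaving (512-32)/2 = 240 pairs, which agrees with the comment above and with the rbasetab entry of 238.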


.align .quad
rev512::
.long 8, 2048, 16, 1024
.long 24, 3072, 32, 512
.long 40, 2560, 48, 1536
// ETC..., ETC..., ETC.

.align .quad
rev1024::
.long 8, 4096, 16, 2048
.long 24, 6144, 32, 1024
.long 40, 5120, 48, 3072
.long 56, 7168, 64, 512
// ETC..., ETC..., ETC.
//Number of swaps = 496
// N (Number of elements) = 1024

.align .quad
rev16::
.long 1*8,8*8,2*8,4*8
.long 3*8,12*8,5*8,10*8
.long 7*8,14*8,11*8,13*8

rev8::
.long 1*8,4*8,3*8,6*8

.align .quad
rev32::
.long 8, 128, 16, 64, 24, 192, 40, 160, 48, 96, 56, 224
.long 72, 144, 88, 208, 104, 176, 120, 240, 152, 200, 184, 232

.align .quad
rev64::
.long 8, 256, 16, 128
.long 24, 384, 32, 64
.long 40, 320, 48, 192
.long 56, 448, 72, 288
// ETC..., ETC..., ETC.

.align .quad
rev128::
.long 8, 512, 16, 256
.long 24, 768, 32, 128
.long 40, 640, 48, 384
.long 56, 896, 72, 576
// ETC..., ETC..., ETC.
//Number of swaps = 56, N (Number of elements) = 128

.align .quad
rev256::
.long 8, 1024, 16, 512
.long 24, 1536, 32, 256
.long 40, 1280, 48, 768
.long 56, 1792, 64, 128
// ETC..., ETC..., ETC.
//Number of swaps = 120, N (Number of elements) = 256
PROGRAM FFTTEST
C 1-D FFT TEST PROGRAM
C Intel assumes no responsibility for use or misuse of this code.
C 7/20/89

character*8 REALLY
PARAMETER (IREV=0)
PARAMETER (REALLY='complex')
PARAMETER (TIMEIT=1, CACHETIME=0)
DATA IT/200000/
PARAMETER (N=1024,M=10)
C PARAMETER (N=512,M= 9)
C PARAMETER (N=256,M= 8)
C PARAMETER (N=128,M= 7)
C PARAMETER (N=64,M= 6)
C PARAMETER (N=32,M= 5)
C PARAMETER (N=16, M=4)
PARAMETER (PI=3.1415926536)
COMPLEX X(N),X1(N),X2(N),X3(N),W(N/2)
C Fortran complex values stored R,I, R,I for arrays.
Real ASQR(N),ASQR2(N),XR(N)
complex wtemp
real rtemp

PRINT *,' FFT test program (ffttest.f) ....'
print *,'=============================='
IF (IREV .eq. 0) THEN
print *,'NOT counting time for bit-reversal.'
print *,'DO NOT expect matching answers,without bit-rev'
ELSE
print *,'Time for bit-reversal included.'
ENDIF

print *,'Time for cache writeback and fills...'
IF (CACHETIME .eq. 0) THEN
print *,' NOT included, if iterating.'
ELSE
print *,' ... included.'
ENDIF

print *,'If iterating... Number of Iterations =',IT
print *,'Number of Points = ', N
print *,'(', REALLY, ' data) '
C----------------------------------------------------------------
C Init twiddle factor array w(k) with (cos,-sin) of 2pi*k/N
C (Should just declare this as constant, if N is non-variable)
C (OR could have one constant 512-entry W (for N=1024), adjust wincr accordingly
C in diff.f for smaller N)
rtemp = 2.0*pi/N
wtemp = CMPLX(cos(rtemp),-sin(rtemp))
w(1) = (1.0, 0.0)
DO 200 k = 2,N/2
200 w(k) = wtemp * w(k-1)
cc print *,' W (twiddle) initialization completed. .....'
cccccccccccccccccccccccccccccccccccccccc
C INITIALIZE input data
C
PIN = (4*PI)/ N
DO 100 I=1,N
c For testing with sinewave input data:
c Treal = COS( I*PIN)
c Timag = SIN( I*PIN)

c For testing with squarewave input:
cc IF (I .lt. N/2) THEN
cc Treal = 1.0
cc Timag = 0.5
cc ELSE
cc Treal = 0.0
cc Timag = 0.0
cc ENDIF

C For testing with ramp function input data:
Treal = I - 1.0
Timag = Treal + 0.5
X(I) = CMPLX(Treal,Timag)
X1(I) = CMPLX(Treal,Timag)
X2(I) = CMPLX(Treal,Timag)
X3(I) = CMPLX(Treal,Timag)
100 CONTINUE


C
cccccccccccccccccccccccccccccccccccccccccccccc
IF (TIMEIT .ne. 0) THEN
CALL fft(X2, M, N)
cc Subroutine fft is Decimation-In-Time, Fortran version.

c CALL ditt(X, M, N,W,IREV)
CALL diff(X, M, N,W,IREV)
ENDIF

cccccccccccccccccccccccccccccccccccccccc
IF (IREV .ne. 0) THEN 
IF (TIMEIT .eq. 0) THEN 
call vcompare (X,X2,2*N) 
call cmags(X,N,ASQR) 
c cmags to take squared magnitude of complex values
call cmags(X2,N,ASQR2)
C print non-zero results:
J=0
DO 700 I =1,N
IF ((ASQR(I) .GT. 1.0) .OR. (ASQR2(I) .GT. 1.0)) THEN
WRITE (6,22) (I-1), ASQR(I), ASQR2(I)
22 FORMAT (' I-1=',I4,' ASQR(I)= ',F14.2, ' ASQR2(I)= ',F14.2//)
J = J+1
IF (J .GT. 32) GOTO 725 
ENDIF 
700 CONTINUE 


725 CALL TIME 
ENDIF 
ENDIF 


IF (TIMEIT .ne. 0) THEN 
cccccccccccccccccccccccccccccccccccccccc
cc Timing loop follows:

print *,' Start Ass.FFT'
IF (CACHETIME .eq. 0) THEN
DO 500 I = 1, IT,4
C Reuse same array, so cache fill and writeback time NOT included.
CALL diff(X, M, N,W,IREV)
CALL diff(X, M, N,W,IREV)
CALL diff(X, M, N,W,IREV)
500 CALL diff(X, M, N,W,IREV)
ELSE
DO 504 I=1, IT,4
C Alternating between X,X1,X2,X3 should provide cache misses.
CALL diff(X, M, N,W,IREV)
CALL diff(X1, M, N,W,IREV)
CALL diff(X2, M, N,W,IREV)
504 CALL diff(X3, M, N,W,IREV)
ENDIF
print *,' END Ass. FFT'
cccccccccccccccccccccccccccccccccccccc
ENDIF 
STOP 
END 
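
The twiddle-factor initialization in the "DO 200" loop builds W(k) by repeated multiplication. The Python sketch below mirrors that recurrence and compares it with direct evaluation; it runs in double precision, whereas the Fortran code uses single precision, so the error figure it yields is only indicative. The function names are hypothetical.

```python
import cmath
import math

def twiddles_by_recurrence(n):
    """Mirror of the DO 200 loop: w(k) = wtemp * w(k-1), with w(1) = (1,0)."""
    rtemp = 2.0 * math.pi / n
    wtemp = complex(math.cos(rtemp), -math.sin(rtemp))
    w = [complex(1.0, 0.0)]
    for _ in range(1, n // 2):
        w.append(wtemp * w[-1])
    return w

def twiddles_direct(n):
    """Direct evaluation of (cos, -sin) of 2*pi*k/N for k = 0 .. N/2-1."""
    return [cmath.exp(-2j * math.pi * k / n) for k in range(n // 2)]

def max_error(n):
    """Largest difference between the recurrence and direct evaluation."""
    rec = twiddles_by_recurrence(n)
    ref = twiddles_direct(n)
    return max(abs(r - d) for r, d in zip(rec, ref))
```

Rounding error in the recurrence grows with k, which is one reason a precomputed constant table, as the source comment suggests, may be preferable when N is fixed.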




subroutine vcompare(res,exp,n)
c VCOMPARE compares 2 REAL vectors, prints out 1st few miscompares
c
integer n, errcnt
real res(n), exp(n)

write (6,12)
12 format ('*** VCOMPARE: vector comparison beginning ***')

data errcnt/0/
do 30 i=1,n
if (AINT(res(i)) .ne. AINT(exp(i))) then
c {print out error, exit if alot already}
120 print *,'*** Error in compares ***'
write(6,121) i
121 format(' Item number = ',I6)
write(6,124) res(i), exp(i)
124 format(' Res_=',F14.2,' Expected_=',F14.2)
errcnt = errcnt + 1
if (errcnt .gt. 19) then
return
end if
end if
30 continue

if (errcnt .eq. 0) then
190 print *,' *** vector compares SUCCESSFUL ***'
end if

99 return
C File: ditt.f
C 6/15/89

C Intel assumes no responsibility for use or misuse of this code.

C FFT - Decimation in TIME, radix-2, inplace, 1-dimen
C Inputs:
C A= complex array of input, up to 1024 pts, single-prec float
C M= log of number of pts
C = (Number of stages of FFT)
C N = number of points. ie, N= 2**M = number of pts
C W= complex array of twiddle factors, length=N/2.
C REV= ignored parameter.
C
C Outputs:
C A= complex fft of input A. Correct order (bit-reversal done).
cccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccccc

Subroutine ditt(a,m,N,W,REV)
integer m,N, i, REV,wlimit
integer offset, stage, groups, wincr,powers2(0:10)
complex a(n),w(N/2),temp

data powers2 /1,2,4,8,16,32,64,128,256,512,1024/
C Powers2 to avoid calls to POW, DIV

C Twiddle factor array w(i) has (cos,-sin) of 2pi*i/N
CC Assume the caller provides w(i) constants ALREADY initialized
C----------------------------------------------------------------
C Pre-touch data, lock into cache, for 8kByte fft:
IF (N .gt. 513) THEN
call fetch(a,%VAL(n))
ENDIF

call bitrev(a,%VAL(M),n)
C Bitreversal of input needed for in-place decim in time FFT, to avoid
C fetching twiddle-factors in bitrev order.

wlimit = 8*((N/2) - 1)

DO 20 stage = 1,m
groups = powers2(m-stage)
C groups=number of times the twiddle factors are used, ie, the number of
C smaller DFTs the stage is split into.

C offset gets 1,2,4,8,...N/2
offset = powers2(stage-1)
wincr = groups
call ditstep(a,w,groups,offset,wincr,wlimit)

20 CONTINUE 


RETURN 
END 
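
The stage schedule used by ditt (bit-reverse first, then offset = 1,2,4,...,N/2 with groups = N/(2*offset) and wincr = groups) can be checked with a small Python sketch. This is an illustrative model only: the Python `ditstep` below is a plain loop standing in for the assembly routine, it omits the cache pre-touch and the wlimit argument, and `bit_reverse` is a hypothetical helper.

```python
import cmath

def bit_reverse(i, bits):
    """Reverse the low `bits` bits of i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def ditstep(a, w, groups, offset, wincr):
    """One radix-2 DIT stage, in place: E = A + B*W, F = A - B*W."""
    for g in range(groups):
        base = 2 * offset * g
        for j in range(offset):
            p = a[base + j + offset] * w[j * wincr]
            a[base + j + offset] = a[base + j] - p
            a[base + j] = a[base + j] + p

def ditt(x):
    """Bit-reverse the input, then run the m = log2(N) stages as in ditt.f."""
    n = len(x)
    m = n.bit_length() - 1
    a = [x[bit_reverse(i, m)] for i in range(n)]
    w = [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]
    for stage in range(1, m + 1):
        groups = 1 << (m - stage)   # powers2(m-stage)
        offset = 1 << (stage - 1)   # powers2(stage-1)
        wincr = groups
        ditstep(a, w, groups, offset, wincr)
    return a
```

Since offset*groups is always N/2, the largest twiddle index touched in any stage, (offset-1)*wincr, stays inside the N/2-entry W table.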
// ditstep.ss: do one stage of fft butterflies
// DIT = Decimation in Time, radix-2, inplace, 1-dimension

// (C) Copyright 1989 INTEL Corporation. ALL RIGHTS RESERVED.
//

// Do one entire stage (n/2 butterflies). Sample invocation:
// call ditstep(a,w,groups,offset,wincr,wlimit)

// Inputs:
// A= complex array of input, single-prec float
// (complex stored as 4byte real, 4byte imag contiguously)
// W= pointer to array of twiddle factors. Assuming W(k) is
// CMPLX(cos(2pi*k/N),-sin(2pi*k/N)) for k=0 to (N/2)-1.
// offset = distance (except for scale-by-8byte sizeof(complex)) between
// the 2 input values for each butterfly.
// Offset also is the number of butterflies done per "group".

// groups = N/(2*offset). The number of sub-DFTs this stage is split into.
// wincr = distance (except for scale-by-8byte sizeof(complex)) between
// successive w values for successive butterflies
// wlimit = max index, in bytes, of W table.

// Outputs:
// A= complex radix-2 butterflied version of input.


define(astart,r16) // input data base address
define(wstart,r17) //twiddle array ptr. Because w-contents depend on N,
// we will assume the caller has initialized w() array.
define(groups,r18) //groups=number of sub-DFTs this stage is split into.
define(offset,r19) //offset (initially elements, mult by 8 to get bytes)
// between node and its dual (the 2 numbers to butterfly, ie. A and B)
define(wincr,r20) //increment between successive W values. Remains constant
// within a given stage.
define(wlimit,r21) //max index, in bytes, of W table.
define(wind,r22) //current index, in bytes, of W table.
define(offset2,r23) //offset*2

define(decrem,r24) //bla decrement
define(somecount,r25) // bla counter

define(FEtch,r26) //pointer to 1st component of butterfly (load)
define(STore,r27) // " " 1st component of butterfly (store)

define(offsetp8,r28) //offset+8
// f4:f7 spare

define(ARe,f12) //element A, real component
define(AIe,f13) // " ", imag
define(ARo,f14) // extra A value, for prefetch (o="odd")
define(AIo,f15)
define(BRe,f16) //element B, real component
define(BIe,f17)
define(BRo,f18) // extra B value, for prefetch
define(BIo,f19)

define(ERe,f20) //A+(B*W), real (ER = AR + BR)
define(EIe,f21) // " imag "
define(ERo,f22) // previous loop's value
define(EIo,f23) // " imag "

define(FRe,f24) //A-(B*W), real
define(FIe,f25) // " imag "
define(FRo,f26) // previous loop's value
define(FIo,f27) // " imag "

define(PR,f28) //(B*W), real
define(PI,f29) //(B*W), imag

define(WRe,f30) //W (twiddle factor), real part
define(WIe,f31) // " " , imag

define(WRo,f10) //W (twiddle factor), real part (EXTRA copy)
define(WIo,f11) // " " , imag


.text
.align .quad
_ditstep_::
ld.l 0(groups),groups //fix Fortran call-by-ref
ld.l 0(offset),offset //
shl 3,offset,offset // change from elements to bytes
shl 1,offset,offset2
adds 8,offset,offsetp8

fst.q f8 ,-16(sp)++ //save "local" regs
fst.q f12,-16(sp)++ // " "

adds -1,groups,groups // pre-decrement for bnc usage, or bla usage
adds -16,r0,decrem //bla decrement

// We code the last 2 stages as special cases:

//----------------------------------------------------------------
xor 8,offset,r0 //offset=1, special case, no complex mult, funny addressing
bc offset_1 // (ASSUMING offset=1 means wincr=0, and no twiddle used)
xor 16,offset,r0 //offset=2, special case, no complex mult
bc offset_2
//----------------------------------------------------------------
ld.l 0(wincr),wincr
ld.l 0(wlimit),wlimit
pfadd.ss f0,f0,f0 // init A1,A2,A3=0
pfadd.ss f0,f0,f0
pfadd.ss f0,f0,f0
pfmul.ss f0,f0,f0
pfmul.ss f0,f0,f0
pfmul.ss f0,f0,f0

//----------------------------------------------------------------
// init pointers:
shl 3,wincr,wincr //scale for bytes.
shl 1,wincr,wind //init wind =2*wincr

pfld.d 0(wstart),f0
pfld.d wincr(wstart),f0

adds -8,astart,FEtch
pfld.d wind(wstart),f0
adds wincr,wind,wind //wind now 3*wincr

// here fetch first set of B,W before bla-loop
pfld.d wind(wstart),WRe
adds wincr,wind,wind
//first B fetch from offset, then 1st A fetch from 0.
fld.d offsetp8(FEtch),BRe //first B value

and wlimit,wind,wind //modulo-wlimit the w index

// We do modulo-addressing on W(), to keep the pfld pipeline full. We 
// never do a W-fetch beyond the end of the table. 

// And the modulo-check needs to be done only every 4th pfld, as always 
// we use a multiple of 4 W() factors. 
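
// The wrap-around performed by "and wlimit,wind,wind" can be modelled as in
// the Python sketch below. It assumes wlimit = 8*((N/2)-1), as set up by the
// Fortran caller, and for simplicity applies the mask after every increment,
// whereas the assembly applies it only every 4th pfld. The function names
// are hypothetical.

```python
def next_w_index(wind, wincr_bytes, wlimit):
    """Model of 'adds wincr,wind,wind' followed by 'and wlimit,wind,wind'."""
    return (wind + wincr_bytes) & wlimit

def w_byte_offsets(n, wincr, count):
    """First `count` byte offsets into the N/2-entry table of 8-byte twiddles."""
    wlimit = 8 * (n // 2 - 1)   # max byte index of the W table
    wincr_bytes = 8 * wincr     # 'shl 3,wincr,wincr'
    wind, out = 0, []
    for _ in range(count):
        out.append(wind)
        wind = next_w_index(wind, wincr_bytes, wlimit)
    return out
```

// Because N/2 is a power of two and every index is a multiple of 8, the mask
// gives the same result as taking the index modulo the table size in bytes.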


d.r2ap1.ss f0,f0,f0 //clear T reg.
adds -32,offset,somecount // bla counter (predecrement by 4 elements)

// Definitions for pipe diagram:

// Anew = E = A+(B*W)
// Bnew = F = A-(B*W)
// Let P=(B*W).
//
// (the complex multiply product, P, broken into 4 real mult and 2 adds):
// WR = cos(), WI=-sin().
// PR = K - L; where K= WR*BR, L=WI*BI
// PI = N + M; where N= WI*BR, M=WR*BI
// ER = AR + PR (Overwrites AR)
// EI = AI + PI ( " AI)
// FR = AR - PR ( " BR)
// FI = AI - PI ( " BI)
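
// The definitions above translate directly into real arithmetic. The Python
// sketch below is only an illustration of that decomposition (4 multiplies,
// 3 adds, 3 subtracts per butterfly); the function name dit_butterfly is
// hypothetical, and no pipelining or register allocation is modelled.

```python
def dit_butterfly(ar, ai, br, bi, wr, wi):
    """Return (ER, EI, FR, FI) for E = A + B*W and F = A - B*W."""
    k = wr * br          # K
    l = wi * bi          # L
    n = wi * br          # N
    m = wr * bi          # M
    pr = k - l           # real part of P = B*W
    pi = n + m           # imaginary part of P
    er, ei = ar + pr, ai + pi
    fr, fi = ar - pr, ai - pi
    return er, ei, fr, fi
```

// In the assembly code these same products and sums are spread across the
// multiplier stages (M1..M3) and adder stages (A1..A3) shown in the pipe
// diagrams that follow.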


// For 1st time thru inner-loop, don't have correct values to store.
// Must do 1 loop before the loop, sans the stores.

//----------------------------------------------------------------
first_bfly:: //fill pipe
// KR...KI...M1...M2...M3 T A1...A2...A3...Write

d.r2pt.ss WRe,f0,f0 // WRe - - - - - - - - -
pfld.d wind(wstart),WRo

d.i2st.ss WIe,f0,f0 // WIe
adds wincr,wind,wind

d.r2ap1.ss f0 ,BRe,f0 // K0 - - - - - - -
fld.d 8(FEtch)++,ARe //first A value

d.pfmul.ss WIe,BIe,f0 // L0 K0 - - - - - -
pfld.d wind(wstart),WRe

d.r2pt.ss WRo,BIe,f0 // WRo M0 L0 K0 - - - - -
fld.d offsetp8(FEtch),BRo

d.rat1s2.ss f0 ,PR ,f0 // - M0 L0 K0 - - - -
adds wincr,wind,wind

d.i2st.ss WIo,BRe,f0 // WIo N0 - M0 K0 K-L0 - - -
nop

//----------------------------------------------------------------

d.r2ap1.ss f0 ,BRo,f0 // K1 N0 - M0 - PR0
and wlimit,wind,wind

d.pfsub.ss f0 ,PI ,f0 // K1 N0 - M0 - - PR0
fld.d 8(FEtch)++,ARo

d.pfadd.ss ARe,PR ,PR // K1 N0 - M0 ER0 - - PR0
fld.d offsetp8(FEtch),BRe

d.pfmul.ss WIo,BIo,f0 // L1 K1 N0 M0 ER0 - - -
nop

d.r2pt.ss WRe,BIo,f0 // WRe M1 L1 K1 M0 M+N0 ER0 - -
bla decrem,somecount,restart //init LCC

d.rat1s2.ss ARe,PR ,f0 // - M1 L1 K1 FR0 PI0 ER0 -
nop

restart::

d.i2st.ss WIe,BRo,ERe // WIe N1 - M1 K1 K-L1 FR0 PI0 ER0
adds -16,astart,STore // ptrs init 16 low, for fst.q instructions
//----------------------------------------------------------------

// Each butterfly = 1 complx multiply, 1 complx add, 1 complx subtract
// = 4 multiply, 3 add, 3 subtract
// 3 8-byte fetches (A, B, W)
// 2 8-byte stores (A, B)
//
// 7 cycles per butterfly
//
// inner_loop: iterates "offset/2" times
// for each group. It does 2 butterflies per iteration

// AR/AI fetches need to be a cycle behind BR/BI fetches here. So we
// must index with offset+8 into B.
// AR is used 1/2 loop before AI.
// Pattern= AI0,AR1,BR2,BI2;AI1,AR2,BR3,BI3.


inner_loop:: // KR...KI...M1...M2...M3 T A1...A2...A3...Write

d.r2ap1.ss AIe,BRe,PI // K2 N1 - M1 EI0 PR1 FR0 PI0
pfld.d wind(wstart),WRo

d.pfsub.ss AIe,PI ,FRe // K2 N1 - M1 FI0 EI0 PR1 FR0
fld.d 8(FEtch)++,ARe

d.pfadd.ss ARo,PR ,PR // K2 N1 - M1 ER1 FI0 EI0 PR1
fld.d offsetp8(FEtch),BRo

d.pfmul.ss WIe,BIe,f0 // L2 K2 N1 M1 ER1 FI0 EI0 -
adds wincr,wind,wind

d.r2pt.ss WRo,BIe,EIe // WRo M2 L2 K2 M+N1 ER1 FI0 EI0
pfld.d wind(wstart),WRe

d.rat1s2.ss ARo,PR ,FIe // - M2 L2 K2 FR1 PI1 ER1 FI0
adds wincr,wind,wind

d.i2st.ss WIo,BRe,ERo // WIo N2 - M2 K2 K-L2 FR1 PI1 ER1
and wlimit,wind,wind //modulo.

// KR...KI...M1...M2...M3 T A1...A2...A3...Write

d.r2ap1.ss AIo,BRo,PI // K3 N2 - M2 EI1 PR2 FR1 PI1
nop

d.pfsub.ss AIo,PI ,FRo // K3 N2 - M2 FI1 EI1 PR2 FR1
fld.d 8(FEtch)++,ARo

d.pfadd.ss ARe,PR ,PR // K3 N2 - M2 ER2 FI1 EI1 PR2
fld.d offsetp8(FEtch),BRe

d.pfmul.ss WIo,BIo,f0 // L3 K3 N2 M2 ER2 FI1 EI1 -
nop

d.r2pt.ss WRe,BIo,EIo // WRe M3 L3 K3 M+N2 ER2 FI1 EI1
fst.q ERe,16(STore)++ //update ERe/EIe/ERo/EIo

d.rat1s2.ss ARe,PR ,FIo // - M3 L3 K3 FR2 PI2 ER2 FI1
bla decrem,somecount,inner_loop

d.i2st.ss WIe,BRo,ERe // WIe N3 - M3 K3 K-L3 FR2 PI2 ER2
fst.q FRe,offset(STore) //update FRe/FIe/FRo/FIo

end_inner_loop:: //KEEP Pipelines full
// RE-init pointers for fetches
d.fiadd.ss f0,f0,f0
adds offset2,astart,astart //bump to next group

//redo A,B fetches, with proper ptr.
d.fiadd.ss f0,f0,f0
fld.d offset(astart),BRe //get first BR/BI in next group
d.fiadd.ss f0,f0,f0
adds -8,astart,FEtch

last_bfly:: //do final 2 butterflies, start next group

// KR...KI...M1...M2...M3 T A1...A2...A3...Write

d.r2ap1.ss AIe,BRe,PI // K0 N3 - M3 EI2 PR3 FR2 PI2
pfld.d wind(wstart),WRo

d.pfsub.ss AIe,PI ,FRe // K0 N3 - M3 FI2 EI2 PR3 FR2
fld.d 8(FEtch)++,ARe
d.pfadd.ss ARo,PR ,PR // K0 N3 - M3 ER3 FI2 EI2 PR3
fld.d offsetp8(FEtch),BRo

d.pfmul.ss WIe,BIe,f0 // L0 K0 N3 M3 ER3 FI2 EI2 -
adds wincr,wind,wind

d.r2pt.ss WRo,BIe,EIe // WRo M0 L0 K0 M+N3 ER3 FI2 EI2
pfld.d wind(wstart),WRe

d.rat1s2.ss ARo,PR ,FIe // - M0 L0 K0 FR3 PI3 ER3 FI2
adds wincr,wind,wind

d.i2st.ss WIo,BRe,ERo // WIo N0 - M0 K0 K-L0 FR3 PI3 ER3
and wlimit,wind,wind //modulo

//----------------------------------------------------------------
d.r2ap1.ss AIo,BRo,PI // K1 N0 - M0 EI3 PR0 FR3 PI3
adds -32,offset,somecount // reset bla counter

d.pfsub.ss AIo,PI ,FRo // K1 N0 - M0 FI3 EI3 PR0 FR3
fld.d 8(FEtch)++,ARo
d.pfadd.ss ARe,PR ,PR // K1 N0 - M0 ER0 FI3 EI3 PR0
fld.d offsetp8(FEtch),BRe
d.pfmul.ss WIo,BIo,f0 // L1 K1 N0 M0 ER0 FI3 EI3 -
bla decrem,somecount,nowhere //re-init LCC=1
d.r2pt.ss WRe,BIo,EIo // WRe M1 L1 K1 M+N0 ER0 FI3 EI3
adds -1,groups,groups
nowhere::
d.rat1s2.ss ARe,PR ,FIo // - M1 L1 K1 FR0 PI0 ER0 FI3
fst.q ERe,16(STore)++
d.fnop
bnc.t restart //branch on value of groups
d.fnop
fst.q FRe,offset(STore)

end_last_bfly::
d.fnop

br endit
fiadd.ss f0,f0,f0
fst.q FRe,offset(STore) //repeated for bnc.t untaken case
.align .quad


offset_l:: 

// want FEtch=0,2,4,6,8,... elements. ASSUMING wincr=0, 

// and that w=(1,0), so that no complex mult needed. 

// E=A+B, F=A-B. (Per double-butterfly loop: 8 pfadd,4 dword fld, 4 fst, 
// 1 bla) (fld.q used to reduce # flds) 

// Performance = 4 cyc/bfly best case. 


//Redefine regs for fld.q,fst.q usage, when A and B adjacent: 
‘define(ARS,f12) //element A, real component : 
define (AI3,f15) // " ", imag 


define(BR3,f14) //element B, real component 
define (BI3,f15) 

define (AR4,f16) // extra A value, for prefetch 
define (AI4,f17) 

define (BR4,f18) 

define (BI4,f19) 


define (ERS, £20) //A+B, real (ER = AR + BR) 
define(EI3, f21) // " imag " 
define(FR3, £22) //(A=-B), real 
define(FI3, £23) // " imag 


define (ER4,’24) //A+B, real 
define (EI4,f25) // "* imag 
define (FR4,f26) //(A=-B), real 
define (FI4,f27) // " imag 


adds -16,astart,FEtch

fld.q 16(FEtch)++,AR4

adds -1,groups,somecount // bla counter (predecremented already by 1)
//using groups=blacount on the offset_1 loop, intentionally.

adds -16,FEtch,STore

//startup the loop:
//                           A1.....A2.....A3.....Write
d.pfadd.ss AR4,BR4,f0 // ARn+BRn - - -

fld.q 16(FEtch)++,AR3
d.pfadd.ss AI4,BI4,f0 // AIn+BIn ERn - -

adds -2,r0,decrem //2 bflies per loop
d.pfsub.ss AR4,BR4,f0 // ARn-BRn EIn ERn -

bla decrem,somecount,offset1_loop //init LCC
d.pfsub.ss AI4,BI4,ER4 // AIn-BIn FRn EIn ERnext
nop
//                           A1.....A2.....A3.....Write
offset1_loop::
d.pfadd.ss AR3,BR3,EI4 // AR+BR FI- FR EI-
nop
d.pfadd.ss AI3,BI3,FR4 // AI+BI ER FI- FR-
fld.q 16(FEtch)++,AR4
d.pfsub.ss AR3,BR3,FI4 // AR-BR EI ER FI-
fst.q ER4,16(STore)++
d.pfsub.ss AI3,BI3,ER3 // AI-BI FR EI ER

nop
d.pfadd.ss AR4,BR4,EI3 // AR2+BR2 FI FR EI
fld.q 16(FEtch)++,AR3
d.pfadd.ss AI4,BI4,FR3 // AI2+BI2 ER2 FI FR
nop
d.pfsub.ss AR4,BR4,FI3 // AR2-BR2 EI2 ER2 FI

bla decrem,somecount,offset1_loop
d.pfsub.ss AI4,BI4,ER4 // AI2-BI2 FR2 EI2 ERnext
fst.q ER3,16(STore)++

end_offset1_loop::
d.fiadd.ss f0,f0,f0
br endit
fiadd.ss f0,f0,f0
nop

.align .quad
offset_2::
// want FEtch=0,1;4,5;8,9;12,13;... elements.
// ASSUMING wincr=N/4 (W_addr=0,N/4,0,N/4,0,...). Trivial W() factors.
// Even-indexed elements identical to offset_1, W=W0, no complex mult.
// So EReven=(AR+BR), EIeven=(AI+BI).
// So FReven=(AR-BR), FIeven=(AI-BI).

// Odd components have W=(0,-1). So B*W = (BI,-BR).
// So ERodd=Re(A+(B*W)) = (AR+BI)  EIodd=(AI-BR).
// So FRodd=Re(A-(B*W)) = (AR-BI)  FIodd=(AI+BR).
// Each fld.q fetches AReven, AIeven, ARodd, AIodd.

//Assume ERe,EIe,ERo,EIo are 4 contiguous regs.
//Assume FRe,FIe,FRo,FIo are 4 contiguous regs.
//Assume ARe,AIe,ARo,AIo are 4 contiguous regs.
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AP-435 
adds -16,astart,FEtch
fld.q 16(FEtch)++,ARe
fld.q 16(FEtch)++,BRe
adds 0,groups,somecount //bla counter

//startup the loop:
//                           A1.....A2.....A3.....Write
pfadd.ss ARe,BRe,f0 // AR+BRe -
pfadd.ss AIe,BIe,f0 // AI+BIe ER -
d.pfadd.ss ARo,BIo,f0 // ARo+BIo EI ER -
nop
d.pfsub.ss AIo,BRo,ERe // AIo-BRo ERo EI ER
nop
d.pfsub.ss ARe,BRe,EIe // AR-BRe EIo ERo EI
adds -1,r0,decrem //2 bflies per loop, but groups is half desired value.
d.pfsub.ss AIe,BIe,ERo // AI-BIe FR EIo ERo
adds -16,astart,STore
d.pfsub.ss ARo,BIo,EIo // ARo-BIo FI FR EIo
bla decrem,somecount,offset2_loop //init LCC
d.pfadd.ss AIo,BRo,FRe // AIo+BRo FRo FI FR
nop
offset2_loop::
d.fnop
fld.q 16(FEtch)++,ARe //fetch AR,AI,ARo,AIo
d.fnop
fld.q 16(FEtch)++,BRe
//                           A1.....A2.....A3.....Write
d.pfadd.ss ARe,BRe,FIe // AR+BRe FIo FRo FI
nop
d.pfadd.ss AIe,BIe,FRo // AI+BIe ER FIo FRo
nop
d.pfadd.ss ARo,BIo,FIo // ARo+BIo EI ER FIo
fst.q ERe,16(STore)++ //update ER,EI,ERo,EIo
d.pfsub.ss AIo,BRo,ERe // AIo-BRo ERo EI ER
nop
d.pfsub.ss ARe,BRe,EIe // AR-BRe EIo ERo EI
nop
d.pfsub.ss AIe,BIe,ERo // AI-BIe FR EIo ERo
fst.q FRe,16(STore)++
d.pfsub.ss ARo,BIo,EIo // ARo-BIo FI FR EIo
bla decrem,somecount,offset2_loop
d.pfadd.ss AIo,BRo,FRe // AIo+BRo FRo FI FR
nop
endit::
// restore regs
fiadd.ss f0,f0,f0 //exit DIM
fld.q 0(sp),f12
fiadd.ss f0,f0,f0 //last DIM pair

fld.q 16(sp),f8
adds 32,sp,sp
bri r1
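
The offset_1 and offset_2 cases avoid the complex multiply because the twiddle factor is either (1,0) or (0,-1). A minimal C sketch of that arithmetic, following the comments in the listing, is given here. The `cpx` type and both function names are illustrative assumptions and do not appear in the original code.

```c
/* Illustrative only: cpx and both function names are assumptions,
 * not identifiers from the i860 listing. */
typedef struct { float re, im; } cpx;

/* W = (1,0): E = A + B, F = A - B (the offset_1 case and the
 * even-indexed elements of the offset_2 case). */
void bfly_w_one(cpx a, cpx b, cpx *e, cpx *f)
{
    e->re = a.re + b.re;  e->im = a.im + b.im;
    f->re = a.re - b.re;  f->im = a.im - b.im;
}

/* W = (0,-1): B*W = (BI, -BR), so
 * E = (AR + BI, AI - BR) and F = (AR - BI, AI + BR)
 * (the odd-indexed elements of the offset_2 case). */
void bfly_w_minus_j(cpx a, cpx b, cpx *e, cpx *f)
{
    e->re = a.re + b.im;  e->im = a.im - b.re;
    f->re = a.re - b.im;  f->im = a.im + b.re;
}
```

Both routines use only adds and subtracts, which is why these two cases can be scheduled entirely on the adder pipeline.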
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C File: dirr.f 

C FFT - Decimation in Freq, radix-2, inplace, 1-dimen,
C REAL input
C Intel is not responsible for use nor misuse of this code.
C 8/14/89
C Inputs:
C A= REAL array of input, up to 1024 pts, single-prec float
C M= log of number of pts
C  = (Number of stages of FFT)
C N = number of points. ie, N= 2**M = number of pts
C W= complex array of twiddle factors, length N/2.
C REV= 0 if bitreversed output ok. 1=must re-order output
C (REV will be ignored, and output will be properly ordered. Bit
C reversal WILL be done.)
C
C Outputs:
C A= complex fft of input A, but only the positive frequency half.
C Length = N/2+1 complex numbers. A(0:n/2)
C


      subroutine dirr(a,m,N,W,REV)
      integer m,N,i,j,k,REV,wlimit
      integer offset,stage,groups,wincr,powers2(0:10)
      real a(N)
      complex w(N/2),temp

      data powers2 /1,2,4,8,16,32,64,128,256,512,1024/
C Powers2 to avoid calls to POW, DIV

C twiddle factor array w(k) has (cos,-sin) of 2pi*k/N
C Assume the caller provides w(k) constants ALREADY initialized
C Pre-touch data, for 8kByte fft: (2048 points real)
      IF (N .gt. 1025) THEN
        call fetch(a,%VAL(n/2))
      ENDIF

      wlimit = 8*((N/2) - 1)

C "DO 20" stage-loop: doing Complex FFT on length N/2 array. Twiddles are
C for a length N array, so wincr gets scaled by 2.
      DO 20 stage = 1,m-1
        groups = powers2(stage-1)
C groups=number of times the twiddle factors are used, ie, the number of
C smaller DFTs the stage is split into.

C offset gets N/4,N/8,N/16,...
        offset = powers2(m-1-stage)
        wincr = groups * 2
        call difstep(a,w,groups,offset,wincr,wlimit)
   20 CONTINUE

      call bitrev(a,%VAL(M-1),n/2)
      call realfix(a,w,%VAL(n))

      RETURN
      END
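
The stage loop in dirr hands difstep a different (groups, offset, wincr) triple for each of the m-1 stages. A small C sketch of that schedule, using shifts where the Fortran uses the powers2 table, is shown here. The function name and the output arrays are hypothetical and exist only for illustration.

```c
/* Computes the per-stage parameters that dirr passes to difstep.
 * dirr_schedule and its output arrays are illustrative assumptions;
 * the original code calls difstep directly inside the loop. */
int dirr_schedule(int m, int groups[], int offset[], int wincr[])
{
    int count = 0;
    for (int stage = 1; stage <= m - 1; stage++) {
        groups[count] = 1 << (stage - 1);      /* powers2(stage-1)            */
        offset[count] = 1 << (m - 1 - stage);  /* powers2(m-1-stage)          */
        wincr[count]  = groups[count] * 2;     /* twiddles are for a length-N */
        count++;                               /* array, hence the factor 2   */
    }
    return count;                              /* number of stages = m-1      */
}
```

For m = 4 (N = 16) this yields three stages with offsets 4, 2, 1, that is N/4, N/8, N/16, matching the comment in the Fortran source.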
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// realfix.ss: This is i860(tm) CPU assembly code to revise data from an 
// N/2 length Complex FFT. 
// (assumes the input data fed to Complex FFT was N real values) 


// INTEL is not responsible for use nor misuse of this code.
//
// 8/14/89
// This 18-cycle-butterfly loop may be sub-optimal.
//
// output = overwrite the data array used for input. Results are
// complex. Re0,Im0,Re1,Im1,..., Re(N/2),Im(N/2).
// NOTE that output array is 1 element longer than input.
//
// Input is H(k), output is F(k)...
// F(k)=.5*( H(k)+ Hconj(N/2-k) -j*(H(k) -Hconj(N/2-k))*Wconj(k) )
//
// Algorithm from "Numerical Recipes in C", by Flannery, Press, Teukolsky, and
// Vetterling, Cambridge Univ. Press 1988, p. 417.
//----------------------------------------------------------------------


///* The C-version of realfix: */ void realfix_(a,w,n)
///*Input =
// * a(0:n+1): length n/2+1 complex array. Entries 0:n/2-1 are the complex FFT
// * result, in correct (NON BIT REVERSED) order. Entry n/2 is undefined.
// * w: length n/2 complex array of twiddles. (cos,-sin(2pi*k/n))
// * n: call-by-value, number of REAL input samples
// *Output =
// * a(0:n+1): length n/2+1 complex array.
// * Format is Re0,Im0,Re1,Im1,..., Re(N/2),Im(N/2).
// * NOTE: To generate entire N-length complex output spectrum, you can copy
// * conjugate of a(k) to a(N-k).
// */
//float a[], w[]; int n; { int aptr,bptr,wptr; float half=0.5,
// AR,AI,BR,BI, /* input values for A,B*/
// PR,PI,SR,SI,DR,DI, /*temporary differences,sums,products*/
// K,L,M,N, /*temporary products */
// ER,EI,ERD,EID,
// FR,FI,FRD,FID,
// WR,WI;

///*We do first and last elements as special case (need W=(1,0))*/
// AR = a[0]; AI = a[1];
// a[0] = AR + AI; a[1] = 0;
// a[n] = AR - AI; a[n+1] = 0;
//for(aptr=2, bptr=(n-2), wptr=2; aptr < n/2; aptr +=2, bptr -=2, wptr +=2)
//{WR = w[wptr]; WI = w[wptr+1];
// AR = a[aptr]; AI = a[aptr+1];
// BR = a[bptr]; BI = a[bptr+1];
// /* aptr =2,4,6...,14; bptr=30,28,26,...,18 (if n=32) */
// /* Note that there is no need to revise the value at the middle of the
//    list, as it is already correct. (.5*(H(n/4)+Hconj(n/4)) */
// SI = (AI + BI);
// DR = (BR - AR);
// K = WR*SI; L= WI*DR; PR = K-L;
// M = WR*DR; N= WI*SI; PI = M+N;
// SR = (AR + BR);
// DI = (AI - BI);
// ERD = SR+PR; ER = half*ERD;
// a[aptr] = ER;
// EID = DI+PI; EI = half*EID;
// a[aptr+1]= EI;
// FRD = SR-PR; FR = half*FRD;
// a[bptr] = FR;
// FID = PI-DI; FI = half*FID;
// a[bptr+1]= FI; } /*end of for-loop */ }
///******** End of C-code for realfix ********/
.text
.align .quad

define(astart,r16) //input data base address
define(wptr,r17) // pointer to W table. Because w-contents depend on N,
// we will assume the caller has initialized w() array.
define(N,r18) //
define(aptr,r20) //pointer to 1st component of butterfly (load)
define(bptr,r21) //pointer to 2nd component of bfly (load); DOWNCOUNTER

define(decrem,r24) //bla decrement
define(count,r25) // bla counter

define(WR,f18) //W (twiddle factor), real part
define(WI,f19) //" " , imag

define(AR,f12) //element A, real component
define(AI,f13) // " ", imag.
define(ARo,f14) // extra A value, for prefetch (o="odd")
define(AIo,f15)
define(BR,f16) //element B, real component
define(BI,f17)

define(ER,f20) //Result of butterfly which overwrites AR
define(EI,f21) //" " " " AI

define(half,f22) //constant 0.5
define(FR,f24) //Result of butterfly which overwrites BR
define(FI,f25)
define(PR,f26)
define(PI,f27)

define(DR,f28)
define(DI,f29)
define(SR,f30) //Sum of A+B, real part
define(SI,f31) // " ", imag "


.data
.align .double
halfloc:: .float 0.5

.align .quad
_realfix_::

fst.q f12,-16(sp)++ //save "local" regs

adds -4,r0,decrem //bla decrement
// We do not bother to initialize FP pipes to zero here, as we assume
// this routine is called after another,"safe", pipelined FP routine.

pfld.l halfloc,f0
pfld.d 8(wptr)++,f0 //skip W(0) intentionally. Is a trivial (1,0) value
// init pointers:
adds 0,astart,aptr
pfld.d 8(wptr)++,f0
shl 2,N,bptr //bptr=total # bytes of input data
pfld.d 8(wptr)++,half //0.5 into an fpr
adds bptr,astart,bptr // bptr points to a(N)

// here fetch first set of A,B,W before bla-loop
pfld.d 8(wptr)++,WR

fld.d 0(aptr),AR //for 1st and last elements

adds -8,N,count // bla counter (predecrement by 2 butterflies worth)
// Do n/4 butterflies: (computing only N/2 elements of complex output, because
// the second N/2 are just complex conjugates of the 1st N/2)

// Definitions for pipe diagram:
// WR = cos(), WI=-sin().
// DR = BR - AR; (difference of Real components of A,B)
// DI = AI - BI; (difference of Imag components)
// SR,SI = sum of A,B
// PR = K - L; where K= WR*SI, L=WI*DR
// PI = M + N; where M= WR*DR, N=WI*SI
// (ER,EI)=complex result to overwrite A.
// (FR,FI)=" " " " B.

first_fly:: //fill pipe.
// For 0th butterfly:
// AR = a[0]; AI = a[1];
// a[0] = AR + AI; a[1] = 0;
// a[n] = AR - AI; a[n+1] = 0;

// KR..KI..M1....M2....M3   A1....A2....A3....Write
r2pt.ss f0,f0,f0 // 0 0
mrm1p2.ss AR,AI,f0 // 0 0 - ER0 - -
mrm1s2.ss AR,AI,f0 // 0 0 0 FR ER - -
fld.d 8(aptr)++,AR
fld.d -8(bptr)++,BR
d.pfadd.ss f0,f0,f0 // 0 0 0 0 FR ER -
d.pfadd.ss f0,f0,ER // 0 0 0 0 0 FR ER0
d.ra1p2.ss AI,BI,FR
nop
d.mrm1s2.ss BR,AR,EI
fst.d ER,-8(aptr)
d.mr2pt.ss WR,f0,FI // WR
fst.d FR,8(bptr)
d.ra1p2.ss BR,AR,SI // K1 -
andh 0x8000,count,r0 //check for negative
d.m12tpm.ss WI,DR,DR // L1 K1
bne endfix
d.r2pt.ss half,DR,f0 //half M1 L1
nop
d.m12ttpa.ss WI,SI,SR //
nop
d.i2st.ss f0,f0,f0 //
nop
// KR..KI..M1....M2....M3   A1....A2....A3....Write
d.rat1s2.ss AI,BI,f0 // - - N1 DI1 PR1 - -
nop
d.i2pt.ss f0,f0,f0 // PI1 DI1
fld.d 8(aptr)++,AR
d.r2ap1.ss SR,f0,PR // ERD PI1
fld.d -8(bptr)++,BR
d.ra1s2.ss SR,PR,DI // FRD ERD
pfld.d 8(wptr)++,WR
d.r2ap1.ss DI,f0,PI // FRD
nop
d.ra1s2.ss PI,DI,f0 //
nop
d.ra1p2.ss f0,f0,f0 //
nop
d.ra1s2.ss f0,f0,f0 //
bla decrem,count,fix_loop
d.pfadd.ss f0,f0,FI

// Each butterfly = 1 complx multiply, 3 complx add, 1 real multiply
//  8 multiply, 10 add/subtract
//  3 8-byte fetches (A, B, W)
//  2 8-byte stores (E, F)
// approx. 18 cycles per butterfly
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fix_loop:: // KR..KI..Ml....M2....MS Al....A2..--A3...-Write 
d.mr2pt.ss f0,FI,ER // 0 FI1 EI1 FR1 - - - ER1
nop
d.mrm1p2.ss AI,BI,FR //
nop
d.mrm1s2.ss BR,AR,EI //
fst.d ER,-8(aptr)
d.mr2pt.ss WR,f0,FI // WR
fst.d FR,8(bptr)
d.ra1p2.ss BR,AR,SI // K2 -
andh 0x8000,count,r0 //check for negative
d.m12tpm.ss WI,DR,DR // L2 K2
bne endfix
d.r2pt.ss half,DR,f0 //half M2 L2
nop
d.m12ttpa.ss WI,SI,SR //
nop
d.i2st.ss f0,f0,f0 // M2
nop
// KR..KI..M1....M2....M3   A1....A2....A3....Write
d.rat1s2.ss AI,BI,f0 // - - N2 DI2 PR2 - -
nop
d.i2pt.ss f0,f0,f0 // PI2 DI2
fld.d 8(aptr)++,AR
d.r2ap1.ss SR,f0,PR // ERD PI2
fld.d -8(bptr)++,BR
d.ra1s2.ss SR,PR,DI // FRD ERD
pfld.d 8(wptr)++,WR
d.r2ap1.ss DI,f0,PI // FRD
nop
d.ra1s2.ss PI,DI,f0 //
nop
d.ra1p2.ss f0,f0,f0 //
nop
d.ra1s2.ss f0,f0,f0 //
bla decrem,count,fix_loop
d.pfadd.ss f0,f0,FI //

endfix::
// restore regs
fiadd.ss f0,f0,f0 //exit DIM
fld.q 0(sp),f12
fiadd.ss f0,f0,f0 //last DIM pair
adds 16,sp,sp
bri r1
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PROGRAM FFTTEST 
C file = real.f
C
C 1-D FFT TEST PROGRAM
C
C 8/14/89
C Intel assumes no responsibility for use or misuse of this code.

      PARAMETER (IREV=1)
      character*8 really
      PARAMETER (REALLY='real')
c     PARAMETER (REALLY='complex')
      PARAMETER (TIMEIT=0, CACHETIME=0)
C REALLY='real' means real-only input, otherwise assume complex input
      DATA IT/200000/
PARAMETER (N=2048,M=11) 
PARAMETER (N=1024,M=10) 
PARAMETER (N=512,M= 9) 
PARAMETER (N=256,M= 8) 
PARAMETER (N=128,M= 7) 
PARAMETER (N=64,M= 6) 
PARAMETER (N=32,M= 5) 
PARAMETER (N=16, M=4) 
      PARAMETER (PI=3.14159265359)
      COMPLEX X2(N),X(N),X3(N),W(N/2)
      Real ASQR(N),ASQR2(N),XR(N+2),XR1(N+2),XR2(N+2),XR3(N+2)
      complex wtemp
      real rtemp


      PRINT *,' FFT test program ....'

      IF (IREV .eq. 0) THEN
        print *,'NOT counting time for bit-reversal.'
        print *,'DO NOT expect matching answers, without bitreversal.'
      ELSE
        print *,'Time for bit-reversal included.'
      ENDIF

      print *,'Time for cache writeback and fills...'
      IF (CACHETIME .eq. 0) THEN
        print *,' NOT included, if iterating.'
      ELSE
        print *,' included.'
      ENDIF
      print *,'================================'
      print *,'If iterating... Number of Iterations =',IT
      print *,'================================'
      print *,'Number of Points = ',N
      print *,'(',REALLY,' data) '
      print *,'================================'


C Init twiddle factor array w(k) with (cos,-sin) of 2pi*k/N
      rtemp = 2.0*pi/N
      wtemp = CMPLX(cos(rtemp),-sin(rtemp))
      w(1) = (1.0, 0.0)
      DO 200 k = 2,N/2
  200 w(k) = wtemp * w(k-1)

      print *,' W (twiddle) initialization completed......'
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
C INITIALIZE input data
      DO 100 I=1,N
C constant:
cc      Treal = 1.0
cc      Timag = 0.0
C squarewave:
cc      IF (I .lt. N/2) THEN
cc        Treal = 1.0
cc        Timag = 0.5
cc      ELSE
cc        Treal = 0.0
cc        Timag = 0.0
cc      ENDIF
C ramp function:
        Treal = I - 1.0
        Timag = Treal + 0.5
        IF (REALLY .ne. 'real') THEN
          X(I) = CMPLX(Treal,Timag)
          X2(I) = CMPLX(Treal,Timag)
          X3(I) = CMPLX(Treal,Timag)
        ELSE
          X(I) = CMPLX(Treal,0.0)
          X2(I) = CMPLX(Treal,0.0)
          XR(I) = Treal
          XR1(I) = Treal
          XR2(I) = Treal
          XR3(I) = Treal
        ENDIF
  100 CONTINUE
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
      CALL fft(X2,M,N)
C Subroutine fft is Decimation-In-Time, Fortran version.
      CALL dirr(XR,M,N,W,1)
C (Assuming dirr produces inplace result, items 0:N/2 complex results)
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
      IF (IREV .ne. 0) THEN
        IF (TIMEIT .eq. 0) THEN
          call vcompare(XR,X2,N/2+2)
          call cmags(XR,N/2+1,ASQR)
c cmags to take squared magnitude of complex values in X
          call cmags(X2,N,ASQR2)
C non-zero results:
          DO 700 I = 1,N/2+1
            IF ((ASQR(I) .GT. 1.0) .OR. (ASQR2(I) .GT. 1.0)) THEN
              WRITE (6,22) (I-1), ASQR(I), ASQR2(I)
   22         FORMAT (' I-1=',I4,' ASQR(I)= ',F14.2,' ASQR2(I)= ',F14.2//)
              J = J+1
              IF (J .GT. 32) GOTO 725
            ENDIF
  700     CONTINUE
  725     CALL TIME
        ENDIF
      ENDIF

      IF (TIMEIT .ne. 0) THEN
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
cc Timing loop follows:
        print *,' Start Ass.FFT'
        IF (CACHETIME .eq. 0) THEN
          DO 500 I = 1,IT,4
C Reuse same array, so cache fill and writeback time NOT included.
            CALL dirr(XR,M,N,W,IREV)
            CALL dirr(XR,M,N,W,IREV)
            CALL dirr(XR,M,N,W,IREV)
  500     CALL dirr(XR,M,N,W,IREV)
        ELSE
          DO 504 I = 1,IT,4
C Alternating between XR,XR1,XR2,XR3 should provide cache misses.
            CALL dirr(XR,M,N,W,IREV)
            CALL dirr(XR1,M,N,W,IREV)
            CALL dirr(XR2,M,N,W,IREV)
  504     CALL dirr(XR3,M,N,W,IREV)
        ENDIF
        print *,' END Ass. FFT'
CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC
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Subroutine vcompare(res,exp,n) 
c VCOMPARE compares 2 vectors, prints out 1st few miscompares
c
      integer n, errcnt
      real res(n), exp(n)

      write (6,12)
   12 format ('*** VCOMPARE: vector comparison beginning ***')

      data errcnt/0/
      do 30 i = 1,n
        if (AINT(res(i)) .ne. AINT(exp(i))) then
c {print out error, exit if alot already}
  120     print *,'*** Error in compares ***'
          write(6,121) i
  121     format(' Item number = ',I6)
          write(6,124) res(i), exp(i)
  124     format(' Res.=',F14.2,' Expected_=',F14.2)
          errcnt = errcnt + 1
          if (errcnt .gt. 19) then
            return
          end if
        end if
   30 continue

      if (errcnt .eq. 0) then
  190   print *,' *** vector compares SUCCESSFUL ***'
      end if

   99 return
      end
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C 
C file: fft.f
C FFT routine from Rabiner & Gold, (1975), who copied it
C from Cooley, Lewis, Welch
C 6/02/89
C Decimation in Time, radix-2, inplace, 1-dimen
C Inputs:
C A= complex array of input, up to 1024 pts, single-prec float.
C (maybe more than 1024, uncertain what limit is)
C M= log of number of pts
C  = (Number of stages of FFT)
C N = number of points. ie, N= 2**M = number of pts
C Outputs:
C A= complex fft of input A, in NON-bit-reversed order.
C
C w (twiddle factor) calculated by recursion. Supposedly takes 15% more
C operations than keeping entire twiddle array as constants pre-allocated.
      subroutine fft(a,m,n)

      integer m,n,i,j,k,ndiv2,powers2(0:10)
      integer iplus,offset,stage,index1,groups
      complex a(n),wtemp(2),w(11),temp

C Init twiddle factor array w() with (cos,-sin) of pi,pi/2,pi/4,...
      data w(1)  /(-1.0,0.0)/
      data w(2)  /(0.0,-1.0)/
      data w(3)  /(0.7071068,-0.7071068)/
      data w(4)  /(0.9238795,-0.3826834)/
      data w(5)  /(0.9807853,-0.1950903)/
      data w(6)  /(0.9951847,-0.0980171)/
      data w(7)  /(0.9987955,-0.0490677)/
      data w(8)  /(0.9996988,-0.0245412)/
      data w(9)  /(0.9999247,-0.0122715)/
      data w(10) /(0.9999812,-0.0061359)/
      data w(11) /(0.9999953,-0.003068)/

      data powers2 /1,2,4,8,16,32,64,128,256,512,1024/
C Powers2 to avoid calls to POW, DIV

C Setup for bit-reversal loop:
      ndiv2 = n / 2
      j = 1

C "DO 7" loop to in-place-bit-reverse-shuffle input
      DO 7 i = 1,n-1
        IF (i .lt. j) THEN
          temp = a(j)
          a(j) = a(i)
          a(i) = temp
        ENDIF
        k = ndiv2
C "While (j .gt. k)" /*decrease j by 2**something */
    6   IF (j .gt. k) THEN
          j = j-k
          k = k / 2
          GOTO 6
        ENDIF
C Add next lower power of 2 to j
    7 j = j+k

C Special case for stage 1: no complex multiplies, simple add
C (Performance enhancement)
      groups =
      offset =
      index1 =
C i-loop iterates N/2 times for 1st stage
C (and would do twice N/4 x for 2nd)
CVD$ NODEPCHK
      DO 8 i = 1,n,2
        iplus = i+1
        temp = a(iplus)
        a(iplus) = a(i) - temp
    8 a(i) = a(i) + temp

C Special case for stage 2: no complex multiplies, simple add
C (Performance enhancement)
      groups =
      offset =
      index1 =
C i-loop iterates N/4 times for 2nd stage
C 1st call to i-loop, in effect index1=1, wtemp(1)=(1,0)
CVD$ NODEPCHK
      DO 90 i = 1,n,4
        iplus = i+2
        temp = a(iplus)
        a(iplus) = a(i) - temp
   90 a(i) = a(i) + temp

      index1 = 2
CVD$ NODEPCHK
CVD$ NOVECTOR
      DO 92 i = 2,n,4
        iplus = i+2
        temp = CMPLX(AIMAG(a(iplus)),-REAL(a(iplus)))
        a(iplus) = a(i) - temp
   92 a(i) = a(i) + temp

C "DO 20" stage-loop executed once for each of the (m) stages of FFT
C (Except 1st and 2nd stage)
C offset gets 4,8,16,32,64,128,256...
      DO 20 stage = 3,m
        groups = powers2(stage)
        offset = groups/2
        wtemp(1) = (1.0, 0.0)
C One twiddle seed (W) calc per stage.
C We pre-allocated w(12)-array with those values, avoid cos/sin calls
        DO 20 index1 = 1,offset
C "DO 10" i-loop does each butterfly of each stage, with varying twiddles
C i-loop iterates N/2 times for 1st stage, N/4 x for 2nd, N/8 x for 3rd
C stage, N/16 x for 4th stage,... 1 time for last stage.
CVD$ NODEPCHK
CVD$ ALTCODE
          DO 10 i = index1,n,groups
            iplus = i + offset
            temp = a(iplus) * wtemp(1)
            a(iplus) = a(i) - temp
   10     a(i) = a(i) + temp
   20 wtemp(1) = wtemp(1) * w(stage)

      RETURN
      END

      Subroutine cmags(a,n,asqr)
C Complex magnitude squared.
C Inputs:
C A= complex array of input, single-prec float
C N = number of input points (and output points)
C Output:
C asqr = real squared magnitude (R*R + I*I), N elements, single-prec float

      integer n,i
      real asqr(n)
      complex a(n)

      DO 100 i = 1,n
        asqr(i) = (REAL(a(i))*REAL(a(i))) + (AIMAG(a(i))*AIMAG(a(i)))
  100 CONTINUE
      RETURN
      END
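
The "DO 7" loop at the top of the fft subroutine performs the in-place bit-reversal shuffle. A zero-based C sketch of the same counter technique is given here. The separate real and imaginary arrays and the function name are assumptions for the example; the Fortran operates on a single complex array with one-based indices.

```c
/* In-place bit-reversal permutation of n complex values (n a power of 2).
 * Zero-based translation of the "DO 7" loop; array layout and the
 * function name are illustrative assumptions. */
void bitrev_shuffle(float *re, float *im, int n)
{
    int j = 0;                        /* bit-reversed counterpart of i   */
    for (int i = 0; i < n - 1; i++) {
        if (i < j) {                  /* swap each pair exactly once     */
            float tr = re[j], ti = im[j];
            re[j] = re[i];  im[j] = im[i];
            re[i] = tr;     im[i] = ti;
        }
        int k = n / 2;
        while (k <= j) {              /* clear the leading 1 bits of j   */
            j -= k;
            k /= 2;
        }
        j += k;                       /* then set the next lower bit     */
    }
}
```

For n = 8 and input 0..7 the routine produces the order 0, 4, 2, 6, 1, 5, 3, 7.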
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## makefile for i860(tm) CPU FFTs (for Unix V/386 programming environment) 
## 8/7/89 

## 

GH=/usr/i860/bin 

GHL=/usr/i860/lib 

CC=$(GH)/c860
FC=$(GH)/f860

CFLAGS= -OLM -X393 -X405 -X188 -X370

FFLAGS= -OLM -X370 -X393 -X71 -X422
## -X71 uses Single-precision math routines

FLFLAGS= -Mx map -e start

LFLAGS= -Mx map -e _main
CLIB=$(GHL)/libc.a
MLIBPSR=$(GHL)/860mtlib.a
MLIB=$(GHL)/libm.a
FLIB=$(GHL)/libf.a

ASM=$(GH)/as860
FLINK=$(GH)/ld860 $(FLFLAGS)
RT=$(GHL)/silib.a

LIBS= $(FLIB) $(MLIBPSR) $(MLIB) $(CLIB) $(RT)
LIBCC= $(MLIB) $(CLIB) $(RT)
## NOTE: Order of linked files is CRUCIAL, other orders may give errors

.SUFFIXES:
.SUFFIXES: .f .c .s .ss .o .8

.IGNORE:
## ignore causes make to ignore error codes from compilers

## To test Fortran plus assembler-fft-stage version:
FILE= ffttest.o fft.o diff.o bitrev.o difstep.o start.o time.o

## To test all-Fortran version of fft:
##FILE= ffttest.o fft.o diff.o difstepf.o start.o time.o

## To test REAL-input version of fft:
RFILE= real.o fft.o dirr.o realfix.o difstep.o bitrev.o start.o time.o

.f.o:
	$(FC) $(FFLAGS) $*.f
	$(ASM) -x -o $*.o $*.s

.c.o:
	$(CC) $(CFLAGS) $*.c
	$(ASM) -x -o $*.o $*.s

.s.o:
	m4 $*.s > temp2.s
	$(ASM) -x -o $*.o temp2.s

ffttest.8: $(FILE)
	$(FLINK) -o ffttest.8 $(FILE) $(LIBS)

real.8: $(RFILE)
	$(FLINK) -o real.8 $(RFILE) $(LIBS)

clean:
	rm -f *.o *.8

.ss.o:
	m4 $*.ss > temp.s
	$(ASM) -x -o $*.o temp.s
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//start.ss 
// 8/18/89 
// Fortran runtime startoff routine 
// 
.text
.globl start
.globl finish

start::
orh h%_stack+262128+262144,r0,sp
or l%_stack+262128+262144,sp,sp

adds -16,sp,sp
st.l r1,12(sp)

call _main
nop

finish::
call exit
nop

.file "start.c"

.data
.align .quad
.lcomm _stack,262144+262144

.end
//----------------------------------------------------------------------
/* file: time.c. Purpose: establish a label to use for breakpoints */
long time_(x)
long *x;
{ x = x+4;
  return((long) x);
}

long timestop_(x)
long *x;
{ x = x+4;
  return((long) x);
}
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ideo Processor Family 7 


ADVANCE INFORMATION

82750PB
PIXEL PROCESSOR

■ 25 MHz Clock with Single Cycle Execution
■ Zero Branch Delay
■ Wide Instruction Word Processor
■ 512 x 48-Bit Instruction RAM
■ 512 x 16-Bit Data RAM
■ Two Internal 16-Bit Buses
■ ALU with Dual-Add-With-Saturation Mode
■ Variable Length Sequence Decoder
■ Pixel Interpolator
■ High Performance Memory Interface
  - 32-Bit Memory Data Bus
  - 50 MBytes per Second Maximum
  - 25 MBytes per Second with Standard VRAMs or DRAMs
■ 16 General-Purpose Registers
■ 4 Gbyte Linear Address Space
■ 132-Pin PQFP
■ Compatible with the 82750PA


The 82750PB is a 25 MHz wide instruction processor that generates and manipulates pixels. When paired with 
its companion chip, the 82750DB, and used to implement a DVI Technology video subsystem, the 82750PB 
provides real time (30 images/sec) pixel processing, real time video compression, interactive motion video 
playback and real time video effects. 


Real time pixel manipulations, including 30 images/sec video compression, are supported by the 25 MHz 
instruction rate. On-chip instruction RAM provides programmability for execution of a wide range of algorithms 
that support motion video decompression, text, and 2D and 3D graphics. Inner loops are optimized with the 
integration of sixteen 16-bit quad ported registers, on-chip DRAM, and two loop counters that provide zero 
delay two-way branching ‘‘free’’ in any instruction. Two, 16-bit internal buses enable two parallel register 
transfers on each 82750PB instruction, contributing to the real time performance of the video processing. 
Another feature that adds to the processing power of the 82750PB is the 16-bit ALU, which includes an 8-bit 
dual-add-with-saturate operation critical for pixel arithmetic. Other specialized features for pixel processing 
include a 2D pixel interpolator for image processing functions and a variable length sequence decoder for 
decoding compressed data. 
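
The excerpt names the 8-bit dual-add-with-saturate operation but does not define it. As a rough illustration of why saturation matters for pixel arithmetic, the C sketch below assumes two independent unsigned 8-bit lanes packed in a 16-bit word, each clamped to 255 instead of wrapping. The lane layout, the signedness and the function name are all assumptions; the complete data sheet would be the authority on the actual behavior.

```c
#include <stdint.h>

/* Hypothetical model of a dual 8-bit add with saturation: the 16-bit
 * operands are treated as two unsigned 8-bit lanes, and each lane sum
 * is clamped to 255 instead of wrapping. The 82750PB's actual lane
 * format and signedness are not given in this excerpt. */
uint16_t dual_add_sat_u8(uint16_t a, uint16_t b)
{
    unsigned lo = (a & 0xFFu) + (b & 0xFFu);   /* low-lane sum, up to 510  */
    unsigned hi = (a >> 8) + (b >> 8);         /* high-lane sum, up to 510 */
    if (lo > 0xFFu) lo = 0xFFu;                /* clamp instead of wrap    */
    if (hi > 0xFFu) hi = 0xFFu;
    return (uint16_t)((hi << 8) | lo);
}
```

With wrapping arithmetic a bright pixel plus a small increment would roll over to a dark value; clamping keeps it at full intensity, and processing two lanes per instruction doubles the throughput of an 8-bit pixel loop.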


The 82750PB is implemented using Intel’s low-power CHMOS IV Technology and is packaged in a 132-lead 
space-saving, plastic quad flat pack (PQFP) package. 


[Figure 240854-1: 82750PB Subsystem Diagram. Recoverable labels: Video Input, Digitizer, Video Output.]
For the complete data sheet on this device, contact Intel's Literature Distribution Dept., (800) 548-4725. 
November 1990 
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tel 82750DB 
DISPLAY PROCESSOR 


ADVANCE INFORMATION

■ Programmable Video Timing
  - 28 MHz Operating Frequency
  - Pixel/Line Address Range to 4096
  - Fully Programmable Sync, Equalization, and Serration Components
  - Fully Programmable Blanking and Active Display Start and Stop Times
  - Genlocking Capability
■ Flexible Display Characteristics
  - 8-, Pseudo 16-, 16-, and 32-Bit/Pixel Modes
  - [...] Input Frequency
  - Support Popular Display Resolutions: VGA, NTSC, PAL, and SECAM
  - On-Chip Triple DAC for Analog RGB/YUV Output
  - Mix Graphics and Video Images on a Pixel by Pixel Basis
  - Real Time Expansion of the Reduced Sample Density Video Color Components (U, V) to Full Resolution
  - Three Independently Addressable Color Palettes
  - Programmable 2X Horizontal Interpolation of Y Channel
  - 16 x 16 x 2-Bit Cursor Map with Independently Programmable 2X Expansion Factors in X and Y Dimensions
  - YUV to RGB Color Space Conversion
  - 2X Vertical Replication of Y, U, and V Data for Displaying Full Motion Video on VGA Monitor
  - Register and Function Compatible with the 82750DA


The 82750DB is a custom designed VLSI chip used for processing and displaying video graphic information. It
is register and function compatible with the 82750DA.


Reset inputs allow the 82750DB to be genlocked to an external sync source. By programming internal control
registers, this sync can be modified to accommodate a wide variety of scanning frequencies. A large selection
of bits/pixel, pixels/line, and pixel widths is programmable, allowing a wide latitude in trading off image
quality vs update rate and VRAM requirements.


The 82750DB can operate in a digitizing mode, wherein it generates timing and control signals to the 82750PB 
and VRAM, but does not output display information. Besides digitizer support signals and video synchroniza- 
tion, the 82750DB outputs digital and analog RGB or YUV information and an 8-bit digital word of alpha data. 
This alpha channel data may be used to obtain a fractional mix of 82750DB outputs with another video source. 


[Figure: 82750DB Subsystem Diagram. Recoverable labels: Video Input, Video Digitizer, Video Mixer/Display, ALPHA[7:0].]


For the complete data sheet on this device, contact Intel’s Literature Distribution Dept., (800) 548-4725. 
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Development Tools forthe 8 
80386 and 80486


INTEL386™/i486™ FAMILY DEVELOPMENT
SUPPORT


280808 -1 


COMPREHENSIVE DEVELOPMENT SUPPORT FOR THE 
INTEL386™ /i486™ FAMILIES OF MICROPROCESSORS 


The perfect complement to the Intel386™ and i486™ microprocessor family is the
optimum development solution. From a single source, Intel, comes a complete, synergistic
hardware and software development toolset, delivering full access to the power of the
Intel386 and i486 microprocessor family architectures.


Intel development tools are easy to use, yet powerful, with contemporary user interface 
techniques and productivity boosting features such as symbolic debugging. And you'll 
find Intel first to market with the tools needed to start development, and with lasting 
product quality and comprehensive support to keep development on-track. 


If what interests you is getting the best product to market in as little time as possible,
Intel is the choice.


* AboveBoard, i486, Intel386, 386 DX, 386 SX, 376, 387, ICE, and iPAT are trademarks of Intel Corporation. 
VAX, MicroVAX and VMS are registered trademarks of Digital Equipment Corporation. 


November 1990 
8-1 . Order Number: 280808-005 


FEATURES 


• Comprehensive support for the full 32 bit
Intel386 and i486 microprocessor
architectures, including protected mode, 4
gigabyte physical memory addressing, and
i486 microprocessor on-chip cache and
numerics

• Standard windowed interface that is common
across Intel debug tools and architectures

• Source line display and symbolics allow
debugging in the context of the original
program

• Intel high-level languages provide
architectural extensions for manipulating
hardware directly without assembly
language routines

• A common object code format (Intel
OMF386™) supports symbolic debug and
permits the intermixing of modules written
in various languages: Intel’s assembler, C,
PL/M, and FORTRAN


[Figure 1 labels: Create and Maintain Source Code; Compile with ASM-386, C-386, PL/M-386; Ada Development Environment; Math Libraries]

• A common OMF386 permits compilers and
assembler to seamlessly operate with
in-circuit and software debug tools

• ROM-able code is output directly from the
language tools, significantly reducing the
effort necessary to integrate software into
the final target system

• Extensive support for the Intel family of
math coprocessors

• Operation in DOS (IBM PC AT*, PS/2 Model
60 and 80, or compatible) and VAX/VMS**
hosted environments


[Figure 1 labels: Link Modules Together with BIND-386; Assign Absolute Addresses; Convert to Hex; Debug; Execute on 386-based PC; In-Circuit Tools]

280808-2

Figure 1: Intel Microprocessor Development Environment
ASM 386™/i486™ MACRO
ASSEMBLER


Intel’s ASM 386 is a “high-level” macro
assembler for the Intel386 Family. ASM 386
offers many features normally found only in 
high-level languages. The macro facility in 
ASM 386 saves development time by allowing 
common program sequences to be coded only 
once. The assembly language is strongly typed, 
performing extensive checks on the usage of 
variables and labels. 


Other Intel ASM 386 features include: 

• “High-level” assembler mnemonics to
simplify the language

• Structures and records for data
representation

• Support for Intel’s standard object code
format for source-level symbolic debug, and
for linking object modules from other
Intel386 and i486 microprocessor languages

• Full support for processor and math
coprocessor instruction sets

• A “MOD486” switch for support of the i486
microprocessor instructions

• 16 bit or 32 bit address overrides

• Supports development for Virtual 86, Real,
286 Protected, and 386 Protected modes


iC386™ /i486™ COMPILER 


Intel’s iC-386 compiler provides special
features for architectural support and code
efficiency, for ease of use, and for compatibility
with other Intel development tools.


The iC-386 compiler produces code for Intel386
and i486™ processors from C source files, and
conforms to the 1989 ANSI standard (ANS
X3.159-1989) for the C programming language.


Key Intel iC-386 features include: 

• Controls to tailor the compilation for each
step of your application development process

• In-line versions of many ANSI-standard
library functions

• Uses expanded memory (LIM Version 3.0
and higher)

• Object code (including supplied run-time
libraries) suitable for ROM

• Three different levels of optimization

• A choice of three segmentation memory
models (small, compact, and flat) to create
compact and efficient code

• Object code that takes advantage of the
on-chip cache of the i486 processor

• In-line processor-specific functions and
time-saving macros that provide access to the
special features of the Intel386 and i486
processors

• In-line floating-point instructions for the
387™ numerics coprocessor and i486
processor floating-point unit

• Time-saving macros and functions to help
assembly language routines interface with
Intel’s high-level programming languages

• The standard C run-time library plus
libraries for floating-point support and the
iRMX® III C interface library

• An easy interface to Intel’s non-C
programming languages, along with object
module compatibility between Intel C and
non-C compilers

• Support for source-level debugging using the
Intel DB-386 Software Debugger

• Programming with subsystems, allowing
mixed segmentation memory models

• Extensions to the 1989 ANSI C standard for
compatibility with previous versions of
Intel C


The iC-386 libraries contain over 200 functions
for use in iC-386 programs. The libraries and
header files make development of iC-386
applications easier by providing:

• Fast and efficient functions for common
programming tasks

• Interfaces to standard and custom execution
environments

• Built-in versions of some functions


PL/M386™ /i486™ COMPILER 


Intel’s PL/M-386 is a structured high-level
system implementation language for the
Intel386/i486 Families. PL/M-386 supports the
implementation of protected operating system
software by providing built-in procedures and
variables to access the Intel386/i486
architectures.


For efficient code generation, PL/M-386
features four levels of optimization, a virtual
symbol table, and four models of program size
and memory usage.




Other Intel PL/M-386 features include: 

• The ability to define a procedure as an
interrupt handler as well as facilities for
generating interrupts

• Direct support of byte, half-word, and word
input and output from microprocessor ports

• Upward compatibility with Intel PL/M-286
and PL/M-86 source code

• A “MOD486” compiler switch for i486
microprocessor instruction generation


PL/M-386 combines the benefits of a high-level 
language with the ability to access the Intel386 
architecture. For the development of systems 
software, PL/M-386 is a cost-effective
alternative to assembly language 
programming. 


FORTRAN 386™/i486™
COMPILER


Intel’s FORTRAN-386 compiler is a cross- 
compiler that supports the entire Intel386 
family of components and i486 (when 
operating in the 386 chip mode) 
microprocessors. 


FORTRAN-386 features high-level support for 
floating-point calculations, transcendentals, 
interrupt procedures, and run-time exception 
handling. Specifically, the FORTRAN-386 
language is a superset of the language 
described in the ANSI Fortran 77 standard. 
The additions to that standard include the
Department of Defense (DOD) extensions,
extensions that support programs written for
the ANSI Fortran 66 standard, and extensions
that support the 386 microprocessor and
80387/80387DX/80387SX math coprocessors.


To aid in the development and debugging 
process, the compiler generates warning and 
error messages and an optional listing file. The 
listing file can include symbol cross-reference 
tables and a listing of the generated 386 
microprocessor assembly-language 
instructions. Library routines are reentrant
and ROMable.


Other Intel FORTRAN-386 compiler features
include:

• Object code can be configured to reside in
either RAM or ROM

• The program code can be optimized for
execution speed or memory size

• Source-level debugging is supported via the
rich symbolics provided in the object module
format (Intel OMF386)

• Support for the proposed REALMATH IEEE
floating point standard


RLL 386™/i486™ RELOCATION,
LINKAGE, AND LIBRARY
TOOLS


The RLL 386™ relocation, linkage, and
library tools are a cohesive set of utilities
featuring comprehensive support of the full
Intel386™/i486™ architectures. RLL-386
covers a range of functions: linking
separate modules, building an object
library, linking in 387™ support, and
building a task to execute under protected
mode or the multi-tasking, memory protected
system software itself. Specifically, RLL-386
supports loadable, linkable, and bootloadable
Intel object module formats, and supports all
segmentation models, including FLAT. Map,
librarian, and conversion (for outputting hex
format code for PROM programming) utilities
are included.
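
As an illustration of what a hex conversion step produces, the sketch below builds Intel HEX records from a binary image. It is not the RLL-386 conversion utility: the function names `intel_hex_record` and `to_intel_hex` are hypothetical, and the sketch assumes the widely documented Intel HEX record layout (byte count, 16-bit address, record type, data, two's-complement checksum) with images small enough to need no extended address records.

```python
def intel_hex_record(address, data, record_type=0x00):
    """Build one Intel HEX record (hypothetical helper, not an Intel tool)."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if len(data) > 0xFF:
        raise ValueError("a record holds at most 255 data bytes")
    # Record body: byte count, address high, address low, record type, data.
    body = bytes([len(data), address >> 8, address & 0xFF, record_type]) + bytes(data)
    # Checksum is the two's complement of the low byte of the sum of the body.
    checksum = (-sum(body)) & 0xFF
    return ":" + (body + bytes([checksum])).hex().upper()


def to_intel_hex(image, base=0, chunk=16):
    """Split a binary image into data records and append the end-of-file record."""
    lines = [
        intel_hex_record(base + offset, image[offset:offset + chunk])
        for offset in range(0, len(image), chunk)
    ]
    lines.append(intel_hex_record(0, b"", record_type=0x01))  # end-of-file record
    return lines


if __name__ == "__main__":
    for line in to_intel_hex(bytes([0x02, 0x33, 0x7A]), base=0x0030):
        print(line)
```

Each output line can be checked independently by summing all of its bytes, checksum included; the low byte of the sum is zero for a valid record.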


EMUL387, NUM387 NUMERICS 
SUPPORT LIBRARIES 


Intel’s EMUL-387 and NUM-387 Numerics
Libraries fully support the 80387/80387DX/
80387SX math coprocessors and the i486
internal math coprocessor, whether or not an
actual math coprocessor is used in the final
system.


For 386 microprocessor based applications
without a math coprocessor, EMUL-387, a
numerics software emulator, will execute
instructions as though the coprocessor were
present. Its functionality is identical to that of
the math coprocessor. It is ideal for
prototyping and debugging floating-point
application software independent of hardware.
Further, this permits portability of application
code regardless of the presence of math
coprocessor hardware in target systems.


For applications with a math coprocessor, 
NUM-387 numerics support library provides 
Intel’s ASM 386, C-386, PL/M-386, and 
FORTRAN-386 language users with enhanced 
numeric data processing capability. With the 
library, it is easy for programs to do floating 
point arithmetic. Programmers can bind in 
library modules to do trigonometric, 
logarithmic and other numeric functions, and 
the user is guaranteed accurate, reliable 
results for all appropriate inputs. 




Intel’s NUM-387 support library is a collection 
of four functionally distinct libraries: 

• Common elementary function library
routines perform algebraic, logarithmic,
exponential, trigonometric, and hyperbolic
operations on real and complex numbers, as
well as real-to-integer conversions; the
routines extend the ranges of the coprocessor
instructions

• Initialization library routines set up the
numerics processing environment for 80386
microprocessor based systems with an
80387/80387DX/80387SX or true software
emulator

• Decimal conversion library routines convert
floating-point numbers from one
80387/80387DX/80387SX binary storage
format to another, or from ASCII decimal
strings to 80387/80387DX/80387SX binary
floating-point format and vice versa

• Exception handling library routines make
writing numerics exception handlers easier
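
To make the idea of decimal conversion concrete, the sketch below converts ASCII decimal strings to IEEE 754 single and double precision bit patterns and back. It uses Python's standard `struct` module and does not reproduce the NUM-387 routines; the function names are hypothetical, and the 80-bit extended format used by the coprocessor is not covered because `struct` has no code for it.

```python
import struct


def decimal_to_single_bits(text):
    """Parse a decimal string; return the 32-bit IEEE 754 single pattern."""
    return struct.unpack(">I", struct.pack(">f", float(text)))[0]


def decimal_to_double_bits(text):
    """Parse a decimal string; return the 64-bit IEEE 754 double pattern."""
    return struct.unpack(">Q", struct.pack(">d", float(text)))[0]


def single_bits_to_decimal(bits):
    """Convert a 32-bit single pattern back to a decimal string."""
    value = struct.unpack(">f", struct.pack(">I", bits))[0]
    return repr(value)


if __name__ == "__main__":
    print(hex(decimal_to_single_bits("1.0")))   # 0x3f800000
    print(hex(decimal_to_double_bits("1.0")))   # 0x3ff0000000000000
    print(single_bits_to_decimal(0xC0200000))   # -2.5
```

Converting between the two binary formats goes through the same path: unpack one pattern to a value, then pack it in the other format, accepting rounding when narrowing from double to single.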


All support library modules are in 80386 
microprocessor object module format (Intel 
OMF-386) so they can be linked with the object 
output of any Intel language. All routines are 
reentrant and ROMable. 


By using Intel’s NUM-387, the user not only 
saves software development time, but is 
guaranteed that the numeric software meets 
industry standard (ANSI/IEEE standard for 
binary floating point arithmetic, 754-1985) and 
is portable—software investment is 
maintained. 


ONCE-386 


If you have a surface mount Intel386 SX 
microprocessor design using 100 pin PQFP 
parts, Intel ICE emulators now have “On- 
Circuit Emulation” (ONCE™) capability. With 
your part surface mounted, the ICE-386 SX 
emulator cabling clamps over the part, tri- 
stating the component, and allowing the 
emulator to operate. This allows you to debug
manufactured boards without resoldering.


REM-386 


Designed to enhance your existing ICE-386 DX 
25 and ICE-386 DX 33 emulators, the REM-386 
DX Expansion board adds 2 MB of
memory.


INTEL386™/i486™ FAMILY
IN-CIRCUIT TOOLS


In-Circuit Emulators 


Intel386 Family in-circuit emulators embody 
exclusive technology that gives access to 
internal processor states that are accessible in 
no other way. Intel386 microprocessors fetch 
and execute instructions in parallel, with 
fetched instructions not necessarily executing 
in order of input. Because of this, an emulator
without this access to internal processor states
is prone to error in determining what actually
occurred inside the microprocessor. With 
Intel’s exclusive technology, Intel386 Family 
emulators are one hundred percent accurate. 
In addition, internal access comes without 
signal buffer interference of processor timing. 
Operation is non-intrusive (zero wait-state). 


Other features of Intel386 Family in-circuit
emulators include:

• Unparalleled support of the Intel386
architecture, notably the native protected
mode

• Emulation at clock speeds to 33 MHz, and
full featured trigger and trace capabilities

• Convertible using removable probes to
support any of the Intel386
microprocessors: the 80386DX, 80386SX, and
80376 microprocessors


With symbolic debugging, memory locations
can be examined or modified using symbolic
references to the original program, such as
procedure or variable names, line numbers,
or program labels. Source code associated with
a given line number can be displayed, as can
the type information of variables, such as byte,
word, record, or array. Microprocessor data
structures, such as registers, descriptor tables,
and page tables, can also be examined and
modified using symbolic names. The symbolic
debugging information for use with Intel
development tools is produced by Intel
OMF386 compatible languages.
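
The mechanism behind symbolic examine and modify can be sketched as a table that maps each name to an address, a type, and a size. The example below is a simplified model, not the emulator's implementation or the OMF386 format: `Symbol`, `SymbolTable`, and the sample symbol `counter` are hypothetical, and target memory is modeled as a little-endian `bytearray`.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Symbol:
    name: str
    address: int
    type_name: str  # for example "byte" or "word"
    size: int       # size in bytes


class SymbolTable:
    """Resolve program names to memory locations (illustrative model only)."""

    def __init__(self, symbols):
        self._by_name = {symbol.name: symbol for symbol in symbols}

    def read(self, memory, name):
        symbol = self._by_name[name]
        raw = memory[symbol.address:symbol.address + symbol.size]
        return int.from_bytes(raw, "little")

    def write(self, memory, name, value):
        symbol = self._by_name[name]
        memory[symbol.address:symbol.address + symbol.size] = value.to_bytes(
            symbol.size, "little"
        )


if __name__ == "__main__":
    memory = bytearray(16)
    table = SymbolTable([Symbol("counter", address=4, type_name="word", size=2)])
    table.write(memory, "counter", 0x1234)
    print(hex(table.read(memory, "counter")))  # 0x1234
```

Because the type travels with the symbol, a debugger built this way can choose the display format itself instead of asking the user for it.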


ICE™-486 IN-CIRCUIT
EMULATOR


The ICE-486 In-circuit Emulator is the world’s
leading tool for debugging software and
hardware designs based on the Intel i486
family of microprocessors. The ICE-486
emulator features real-time emulation at
speeds up to 33 MHz. The standard high-level
symbolic debug capability saves valuable
development time. The flexible breakpoint
capability and 8K deep trace buffer provide
power to identify and solve even the toughest
hardware and software bugs. The emulator
also provides 2 MB expansion memory to
debug large programs. It is designed to work
with the rich array of software development
tools optimized for creating 32-bit applications.


In-Circuit Debugger 


Intel’s ICD-486 represents a new generation of
in-circuit emulation technology. From the
inventor of the microprocessor comes a
development tool that delivers complete access
to the i486 architecture. ICD-486 is the first
development tool which allows users to debug
high speed, cached applications at the full
speed of the target processor. ICD-486
embodies exclusive technology, giving users
symbolic access to the internal processor states
that would not be accessible in any other way.
With Intel’s exclusive technology, users can be
assured that the ICD-486 provides complete
accuracy when debugging cached applications
in real-time.


Other Intel ICD-486 features include:

• Real-time emulation at the full speed of the
i486 microprocessor

• Full support for the i486 on-chip caching and
numerics

• Ability to set up to 16 software breakpoints
and four hardware breakpoints on execution
addresses, data writes, or data accesses

• Full symbolic information to display and
modify all registers of the i486
microprocessor
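
The breakpoint limits listed above can be modeled as a small table that refuses new entries once its capacity is reached. This is a sketch under stated assumptions, not the ICD-486 implementation: the class `BreakpointTable`, its method names, and the event names `execute`, `read`, and `write` are hypothetical, and a hardware breakpoint of kind `access` is assumed to match both reads and writes.

```python
class BreakpointTable:
    """Model of a debugger with 16 software and 4 hardware breakpoints."""

    MAX_SOFTWARE = 16
    MAX_HARDWARE = 4
    HARDWARE_KINDS = {"execute", "write", "access"}

    def __init__(self):
        self.software = set()  # execution addresses
        self.hardware = {}     # address -> kind

    def add_software(self, address):
        if address not in self.software and len(self.software) >= self.MAX_SOFTWARE:
            raise RuntimeError("all software breakpoints are in use")
        self.software.add(address)

    def add_hardware(self, address, kind):
        if kind not in self.HARDWARE_KINDS:
            raise ValueError("unknown hardware breakpoint kind: " + kind)
        if address not in self.hardware and len(self.hardware) >= self.MAX_HARDWARE:
            raise RuntimeError("all hardware breakpoints are in use")
        self.hardware[address] = kind

    def hits(self, address, event):
        """Return True if an event ('execute', 'read', 'write') triggers a break."""
        if event == "execute" and address in self.software:
            return True
        kind = self.hardware.get(address)
        if kind is None:
            return False
        if kind == "access":
            return event in ("read", "write")
        return kind == event


if __name__ == "__main__":
    table = BreakpointTable()
    table.add_software(0x1000)
    table.add_hardware(0x2000, "write")
    print(table.hits(0x1000, "execute"), table.hits(0x2000, "read"))
```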
SOFTWARE DEBUGGER


Intel’s DB386™ is an on-host software 
execution environment with source-level 
symbolic debug capabilities for object modules 
produced by Intel’s assembler and high-level 
language compilers. For the DOS hosted 
version, this software debug environment 
allows 386 microprocessor code to be executed 
and debugged directly on a 386 DX or 386 SX 
microprocessor based PC, without any 
additional target hardware required. With 
Intel’s standard windowed human interface, 
users can focus their efforts on finding bugs 
rather than spending time learning and 
manipulating the debug environment. 


For the VMS* hosted version, the debugger
works in conjunction with an extensive 386
microprocessor software instruction simulator
included with the product. This simulator
models the 386 microprocessor in “flat”
mode, the 387 math coprocessor, and the 8259A
interrupt controller and 8254 timer chips. It
supports mapped memory up to 4 gigabytes and
provides complex break, trace, and profiling
support.




Other Intel DB386™ features include:

• A run-time interface allows protected-mode
386 microprocessor programs to be executed
directly on a 386 DX or 386 SX
microprocessor based PC

• Drop-down menus make the tool easy to
learn for new or casual users. A command
line interface is also provided for more
complex problems

• Watch windows (which display user-specified
variables), trace points, and breakpoints
(including fixed, temporary, and conditional)
can be set and modified as needed, even
during a debug session

• The user can browse source and callstacks,
observe processor registers, and access watch
window variables by either the pull down
menu or by a single keystroke using the
function keys

• An easy-to-use disassembler and single-line
assembler speed the debug process

• The user need not know whether a variable
is an unsigned integer, a real, or a
structure; the debugger uses the wealth of
typing information available in Intel
languages to display program variables in
their respective type formats

• DB-386 supports the i486 microprocessor
when operated in the 386 microprocessor
mode


MON386 TARGET RESIDENT
SOFTWARE DEBUGGER


Intel’s MON-386 is a hosted or unhosted target
resident software debugger for the 386 DX and
386 SX-based systems. MON-386 provides
program execution control and symbolic
processor and memory interrogation and
modification. Hardware and software
breakpoints can be set at symbolic addresses
and program execution can be single-stepped
through assembly level or high-level language
instructions.


Other Intel MON-386 features include: 

• Debug procedures (user-definable sequences
of MON-386 commands) enable users to
define macro commands that would
otherwise take several lines of command
entries to perform the same function

• A disassembler/single line assembler allows
users to display memory and patch memory
with 80386/80387 mnemonics


MON-386, used in conjunction with Intel single
board computers iSBC® 386/22 and iSBC 386/116,
or other customer designed systems, can
debug software before a functional prototype of
the target system is available.


Intel’s MON-386 can be used for i486
microprocessor development when the
component is run in the 386 microprocessor
mode of operation.


iPAT-386™ PERFORMANCE
ANALYSIS TOOL


Intel’s iPAT-386™ performance analysis tool
provides analysis of real-time software
executing on a 386-based target system. With
iPAT-386, it is possible to speed-tune
applications, optimize use of operating
systems, determine response characteristics,
and identify code execution coverage.


By examining iPAT-386 histogram and tabular 
information about procedure usage for critical 
functions (with the option of including 
interaction with other procedures, hardware,
the operating system, or interrupt service 
routines) performance bottlenecks can be 
identified. With iPAT-386 code execution 
coverage information, the completeness of 
testing can be confirmed. 
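
The two kinds of result described above, a per-procedure usage histogram and a code execution coverage figure, can be sketched from a list of executed addresses. The example is a software model only and does not describe how iPAT-386 gathers or stores its data: the functions `profile` and `coverage`, the procedure names, and the address values are all hypothetical, and coverage is measured per address for simplicity.

```python
from bisect import bisect_right
from collections import Counter


def profile(trace, procedures):
    """Count executed addresses per procedure.

    trace: iterable of executed addresses.
    procedures: list of (start_address, name), sorted by start_address;
    each procedure is assumed to extend up to the next start address.
    """
    starts = [start for start, _ in procedures]
    counts = Counter()
    for address in trace:
        index = bisect_right(starts, address) - 1
        if index >= 0:  # addresses below the first procedure are ignored
            counts[procedures[index][1]] += 1
    return counts


def coverage(trace, first, last):
    """Fraction of addresses in [first, last] that were executed at least once."""
    executed = {address for address in trace if first <= address <= last}
    return len(executed) / (last - first + 1)


if __name__ == "__main__":
    procedures = [(0x100, "init"), (0x200, "main_loop")]
    trace = [0x100, 0x104, 0x200, 0x204, 0x200]
    for name, count in profile(trace, procedures).most_common():
        print(f"{name:10s} {'#' * count}")
```

A procedure that dominates the histogram is a candidate bottleneck, while a coverage value below 1.0 points to code that the test run never reached.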


Intel’s iPAT-386 provides real-time analysis up 
to 20 MHz, performance profiles of up to 125 
partitions, and code execution coverage 
analysis over 252K. The iPAT-386 target probe
is used with the same iPAT base module 
supporting 80286, 80186, and 8086 
development. The iPAT-386 system can be 
used independently or piggy-backed with 
Intel386 in-circuit emulator tools. 


WORLDWIDE SERVICE, 
SUPPORT, AND TRAINING 


To augment its development tools, Intel offers 
a full array of seminars, classes, and 
workshops, field application engineering 
expertise, hotline technical support and on-site 
service. 


Intel also offers a Software Support package
which includes technical software information,
telephone support, automatic distribution of
software and documentation updates, access to
the “ToolTalk” electronic bulletin board,
“itComments” publication, remote diagnostic
software, and a development tools
troubleshooting guide.
PRODUCT SUPPORT MATRIX


Component | Host
DX SX 5.1+


FORTRAN-386 Compiler
RLL-386 Relocation, Linkage, Library, Support Tools
MON-386 Target Level Software Debugger
iPAT-386 Performance Analysis Tool



386™/i486™ FAMILY DOS
HOSTED DEVELOPMENT KIT
ORDER CODES


DKIT386C          C Compiler Software Development Kit (see
                  following content list). Also supports i486
                  microprocessor.

DKIT386CS         C Compiler Software Development Kit w/ one
                  year Gold Software Support. Also supports
                  i486 microprocessor.

DKIT386CIDX       C Compiler Software Development Kit w/ ICE386
                  DX 33 MHz In-circuit Emulator and 2 MB
                  AboveBoard™

DKIT386CIDXS      Same as above w/ one year Hardware and Gold
                  Software Support

pDKIT386CISX      C Compiler Software Development Kit w/ ICE386
                  SX 20 MHz In-circuit Emulator and 2 MB
                  Above Board

pDKIT386CISXS     Same as above w/ one year Hardware and Gold
                  Software Support

pDKIT386CI376     C Compiler Software Development Kit w/ ICE376
                  16 MHz In-circuit Emulator and 2 MB
                  Above Board

pDKIT386CI376S    Same as above w/ one year Hardware and Gold
                  Software Support


The Intel Basic Software Development Kit for
the DOS hosted environment includes:

iC386 compiler

ASM386 assembler

RLL386 relocation linker and locator
(builder/binder)

NUM387 numerics library

EMUL387 math coprocessor emulator
library

DB386 software debugger

OMF386LOAD loader development
module format documentation


386™/i486™ FAMILY VAX AND
MICROVAX/VMS* HOSTED
DEVELOPMENT KIT ORDER
CODES


MVVSC386KIT       MicroVAX/VMS C386 compiler, RLL386 relocation
                  linker and locator, ASM386 assembler, DB386
                  software debugger

MVVSP386KIT       MicroVAX/VMS PL/M386 compiler, RLL386,
                  ASM386, DB386

MVVSF386KIT       MicroVAX/VMS FORTRAN386 compiler, RLL386,
                  ASM386, DB386

VVSC386KIT        VAX/VMS C386, RLL386, ASM386, DB386

VVSP386KIT        VAX/VMS PL/M386, RLL386, ASM386, DB386

VVSF386KIT        VAX/VMS FORTRAN386, RLL386, ASM386, DB386


ADDITIONAL 386™/i486™
FAMILY DEVELOPMENT TOOL
ORDER CODES


ICD48625D         25 MHz In-circuit Debugger for the i486
                  microprocessor

ICD486CON33D      ICD48625D with a prepaid upgrade to 33 MHz

iPATCORE          iPAT Performance Analysis Tool base unit

iPAT386DOS        iPAT 80386 probe kit including PC-DOS 3.x
                  software, requires iPATCORE
COMPREHENSIVE DEVELOPMENT SUPPORT FOR THE
INTEL 376™ EMBEDDED PROCESSORS


The perfect complement to the Intel 376™ embedded processor is the optimum
development solution. From a single source, Intel, comes a complete, synergistic
hardware and software development toolset, delivering full access to the power of the
Intel 376 architecture.


Intel development tools are easy to use, yet powerful, with a user interface designed
for ease of use and productivity boosting features such as symbolic debugging. And you’ll
find Intel first to market with the tools needed to start development, and with lasting
product quality and comprehensive support to keep development on-track.


If what interests you is getting the best product to market in as little time as possible,
Intel is the choice.


FEATURES 


• Full speed emulation up to 20 MHz

• Source line display and symbolics allow
debugging in the context of the original
program

• Intel high-level languages provide
architectural extensions for
manipulating hardware directly without
assembly language routines

• A common object code format (Intel
OMF386) supports symbolic debug and
permits the intermixing of modules
written in various languages: Intel’s
assembler, C, PL/M, and FORTRAN

• A common OMF386 permits compilers
and assembler to operate seamlessly
with in-circuit and software debug tools

• ROM-able code is output directly from
the language tools, significantly reducing
the effort necessary to integrate software
into the final target system

• Extensive support for the Intel 80387SX
math coprocessor

• Operation in DOS (IBM PC AT*, PS/2
Model 60 and 80, or compatible) and
VAX/VMS* hosted environments


November 1990 
Order Number: 280903-001 


[Figure 1 block labels: Source Code; Create and Maintain Libraries with Lib-386; Ada Development Environment; EMUL-387 Math Coprocessor Libraries; Link Modules Together with BIND-386; Assign Absolute Addresses; Convert to Hex; Execute on 386-based PC or In-Circuit Tools]

Figure 1: Intel Microprocessor Development Environment


ASM 386™ MACRO ASSEMBLER 


Intel’s ASM 386 is a “high-level” macro
assembler for developing 376 based embedded
systems. ASM 386 offers many features
normally found only in high-level languages. 
The macro facility in ASM 386 saves 
development time by allowing common 
program sequences to be coded only once. The 
assembly language is strongly typed, 
performing extensive checks on the usage of 
variables and labels. 


Other Intel ASM 386 features include: 

• “High-level” assembler mnemonics to
simplify the language

• Structures and records for data
representation

• Support for Intel’s standard object code
format for source-level symbolic debug, and
for linking object modules from other
Intel386/376 languages

• Full support for processor and math
coprocessor instruction sets

• 16 bit or 32 bit address overrides

• Supports development for Virtual 86, Real,
286 Protected, and 386 Protected modes


iC-386 COMPILER


Intel’s iC-386 compiler provides special
features for Intel 376 architectural support
and code efficiency, for ease of use, and for
compatibility with other Intel development
tools.


The iC-386 compiler produces code for the Intel
376 processor from C source files, and conforms
to the ANSI standard (ANS X3.159-1989) for
the C programming language.


Key Intel iC-386 features include: 


• Controls to tailor the compilation for each
step of your application development process

• In-line versions of many ANSI-standard
library functions

• Uses expanded memory (LIM Version 3.0
and higher)

• Object code (including supplied run-time
libraries) suitable for ROM

• Three different levels of optimization

• A choice of three segmentation memory
models (small, compact, and flat) to create
compact and efficient code

• In-line processor-specific functions and
time-saving macros that provide access to the
special features of the Intel 376 embedded
processor

• In-line floating-point instructions for the
387™ SX numerics coprocessor

• Time-saving macros and functions to help
assembly language routines interface with
Intel’s high-level programming languages

• The standard C run-time library plus
libraries for floating-point support

• An easy interface to Intel’s non-C
programming languages, along with object
module compatibility between Intel C and
non-C compilers

• Support for source-level debugging using the
Intel DB-386 Software Debugger

• Programming with subsystems, allowing
mixed segmentation memory models

• Extensions to the ANSI C standard for
compatibility with previous versions of
Intel C


The iC-386 libraries contain over 200
functions. The libraries and header files make
development of iC-386 applications easier by
providing:

• Fast and efficient functions for common
programming tasks

• Interfaces to standard and custom execution
environments

• Built-in versions of some functions


PL/M-386 COMPILER 


Intel’s PL/M-386 is a structured high-level 
system implementation language for the Intel 
376 embedded processor. PL/M-386 supports 
the implementation of protected operating 
system software by providing built-in 
procedures and variables to access the Intel 
376 architecture. 


For efficient code generation, PL/M-386 
features four levels of optimization, a virtual 
symbol table, and four models of program size 
and memory usage. 


Other Intel PL/M-386 features include: 

• The ability to define a procedure as an
interrupt handler as well as facilities for
generating interrupts

• Direct support of byte, half-word, and word
input and output from microprocessor ports

• Upward compatibility with Intel PL/M-286
and PL/M-86 source code


PL/M-386 combines the benefits of a high-level 
language with the ability to access the Intel386 
architecture. For the development of systems 
software, PL/M-386 is a cost-effective 
alternative to assembly language 
programming. 


FORTRAN-386 COMPILER 


Intel’s FORTRAN-386 compiler is a
cross-compiler that supports the Intel 376
embedded processor.


FORTRAN-386 features high-level support for 
floating-point calculations, transcendentals, 
interrupt procedures, and run-time exception 
handling. Specifically, the FORTRAN-386 
language is a superset of the language 
described in the ANSI Fortran 77 standard. 
The additions to that standard include the 
Department of Defense (DOD) extensions, 
extensions that support programs written for 
the ANSI Fortran 66 standard, and extensions 
that support the 376 processor and 80387SX 
math coprocessors. 


To aid in the development and debugging 
process, the compiler generates warning and 
error messages and an optional listing file. The 
listing file can include symbol cross-reference 
tables and a listing of the generated 386 
microprocessor assembly-language 
instructions. Library routines are reentrant 
and ROMable. 


Other Intel FORTRAN-386 compiler features
include:

• Object code can be configured to reside in
either RAM or ROM

• The program code can be optimized for
execution speed or memory size

• Source-level debugging is supported via the
rich symbolics provided in the object module
format (Intel OMF386)

• Support for the proposed REALMATH IEEE
floating point standard


RLL-386 RELOCATION, 
LINKAGE, AND LIBRARY 
TOOLS 


The RLL-386 relocation, linkage, and library
tools are a cohesive set of utilities featuring
comprehensive support of the full Intel 376
architecture. RLL-386 covers a range of
functions: linking separate modules,
building an object library, linking in 387 SX
support, and building a task to execute under
protected mode or the multi-tasking, memory
protected system software itself. Specifically,
RLL-386 supports loadable, linkable, and
bootloadable Intel object module formats, and
supports all segmentation models, including
FLAT. Map, librarian, and conversion (for
outputting hex format code for PROM
programming) utilities are included.


EMUL-387, NUM-387 NUMERICS 
SUPPORT LIBRARIES 


Intel’s EMUL-387 and NUM-387 Numerics
Libraries fully support the 80387SX math
coprocessor whether an actual math
coprocessor is used in the final system or not.


For 376 microprocessor based applications
without a math coprocessor, EMUL-387, a
numerics software emulator, will execute
instructions as though the coprocessor were
present. Its functionality is identical to that of
the math coprocessor. It is ideal for
prototyping and debugging floating-point
application software independent of hardware.
Further, this permits portability of application
code regardless of the presence of math
coprocessor hardware in target systems.


For applications with a math coprocessor, 
NUM-387 numerics support library provides 
Intel’s ASM 386, C-386, PL/M-386, and 
FORTRAN-386 language users with enhanced 
numeric data processing capability. With the 
library, it is easy for programs to do floating 
point arithmetic. Programmers can bind in 
library modules to do trigonometric, 
logarithmic and other numeric functions, and 
the user is guaranteed accurate, reliable 
results for all appropriate inputs. 


Intel’s NUM-387 support library is a collection 
of four functionally distinct libraries: 

• Common elementary function library
routines perform algebraic, logarithmic,
exponential, trigonometric, and hyperbolic
operations on real and complex numbers, as
well as real-to-integer conversions; the
routines extend the ranges of the coprocessor
instructions

• Initialization library routines set up the
numerics processing environment for 80376
microprocessor based systems with an
80387SX or true software emulator

• Decimal conversion library routines convert
floating-point numbers from one 80387SX binary
storage format to another, or from ASCII
decimal strings to 80387SX binary
floating-point format and vice versa

• Exception handling library routines make
writing numerics exception handlers easier


All support library modules are in 80386 
microprocessor object module format (Intel
OMF386) so they can be linked with the object 
output of any Intel 80386 language. All 
routines are reentrant and ROMable. 


By using Intel’s NUM-387, the user not only 
saves software development time, but is 
guaranteed that the numeric software meets 
industry standard (ANSI/IEEE standard for 
binary floating point arithmetic, 754-1985) and 
is portable—software investment is 
maintained. 
INTEL 376™ IN-CIRCUIT
EMULATORS


Intel 376 In-circuit Emulators embody
exclusive technology that gives access to
internal processor states that are accessible in
no other way. Intel 376 processors fetch and
execute instructions in parallel, with fetched
instructions not necessarily executing in order
of input. Because of this, an emulator without
this access to internal processor states is prone
to error in determining what actually occurred
inside the microprocessor. With Intel’s
exclusive technology, the Intel 376 emulator is
one hundred percent accurate. In addition,
internal access comes without signal buffer
interference with processor timing. Operation
is non-intrusive (zero wait-state).


Opening the Door to Protected Mode 


The Intel 376 In-circuit Emulator opens the 
door to the full potential of the architecture 
with unparalleled support of protected mode. 
The emulator can display and modify task 
state segments and global, local, and interrupt 
descriptor tables (with symbolic access to all 
descriptor components like privilege level and 
segment type). Emulation memory of 128 
Kbytes or the optional 2 Mbytes of relocatable 
expansion memory can be used instead of 
target memory for code debugging. 


With symbolic debugging, memory locations 
can be examined or modified using symbolic 
references to the original program, such as
procedure or variable names, line numbers,
or program labels. Source code associated with
a given line number can be displayed, as can 
the type information of variables, such as byte, 
word, record, or array. Processor data 
structures, such as registers, descriptor tables, 
and page tables, can also be examined and 
modified using symbolic names. The symbolic 
debugging information for use with Intel 
development tools is produced by Intel 
OMF386 compatible languages. 


Flexible and Versatile Event Recognition 


Flexibility and versatility in event recognition
make short work of uncovering the most
complex bugs. Bus event recognition circuitry
may be used to trigger on specific or masked 
data input, output, read, written, or fetched at 
a physical address or range of addresses. Or on- 
chip debug registers may be used to trigger on 
virtual, linear, or symbolic addresses being 
executed, accessed, or written. 


Versatility shows in other triggering options— 
upon a task switch, an external signal from 
another emulator or a logic analyzer, multiple 
occurrences of an event, a full trace buffer, 
halt or shutdown cycles, or interrupt 
acknowledge. And up to four sequential event 
triggers can be combined with a high-level 
construct. 


The Intel 376 In-circuit Emulator continuously 
captures all bus activity and, as an option, 
execution information, into a trace buffer of 
4K frames with PRE, POST, and CENTERED 
collection modes. The contents of the trace 
buffer can be displayed during full speed 
emulation in either execution cycle or 
machine-level instruction formats. 
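
The three collection modes can be pictured as a ring buffer that keeps capturing for a different number of frames after the trigger. The Python sketch below is illustrative only: it assumes PRE keeps the frames leading up to the trigger, POST keeps the frames from the trigger onward, and CENTERED places the trigger near the middle. These semantics and the `capture` function are assumptions, not the emulator's documented behavior.

```python
from collections import deque

TRACE_DEPTH = 4096  # 4K frames, as stated for the trace buffer


def capture(frames, is_trigger, mode, depth=TRACE_DEPTH):
    """Return the frames left in the trace buffer once collection stops."""
    # Number of frames still collected after the trigger frame (assumed).
    after = {"PRE": 0, "CENTERED": depth // 2, "POST": depth - 1}[mode]
    buffer = deque(maxlen=depth)  # oldest frames fall off automatically
    remaining = None              # None until the trigger has been seen
    for frame in frames:
        buffer.append(frame)
        if remaining is None:
            if is_trigger(frame):
                remaining = after
        else:
            remaining -= 1
        if remaining == 0:
            break
    return list(buffer)


if __name__ == "__main__":
    trace = capture(range(100), lambda f: f == 50, "CENTERED", depth=8)
    print(trace)
```

With an 8-frame buffer and a trigger on frame 50, PRE ends on the trigger frame, POST starts on it, and CENTERED holds frames on both sides.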


Accessing the Power 


The power of the Intel 376 In-circuit Emulator 
is reflected in the sophisticated user interface. 
Refined for ease-of-use, the command line 
interface contains many features to boost 
productivity and customize functionality. 


On-line help, a syntax menu, command line 
editing, command history, and error message 
query promote ease of learning and use. I/O
redirection and the ability to escape to the host
operating system provide versatility for the
power user. Customized procedures with
variables and literal definitions can be created
to assist in debugging or for manufacturing
test or field service applications.


SOFTWARE DEBUGGER 


Intel’s DB386 is a useful tool for early software 
algorithm debug. It is an on-host software 
execution environment with source-level 
symbolic debug capabilities for object modules 
produced by Intel’s assembler and high-level 
language compilers. For the DOS hosted 
version, this software debug environment 
allows 386 microprocessor code to be executed
and debugged directly on a 386 DX or 386
SX™ microprocessor based PC, without any
additional target hardware required. With
Intel’s standard windowed human interface, 
users can focus their efforts on finding bugs 
rather than spending time learning and 
manipulating the debug environment. 


INTEL 376™ FAMILY DEVELOPMENT SUPPORT 


Other Intel DB386 features include:

• A run-time interface allows protected-mode
376 microprocessor programs to be executed
directly on a 386 DX or 386 SX
microprocessor based PC

• Drop-down menus make the tool easy to
learn for new or casual users. A command
line interface is also provided for more
complex problems

• Watch windows (which display user-specified
variables), trace points, and breakpoints
(including fixed, temporary, and conditional)
can be set and modified as needed, even
during a debug session

• The user can browse source and callstacks,
observe processor registers, and access watch
window variables by either the pull down
menu or by a single keystroke using the
function keys

• An easy-to-use disassembler and single-line
assembler speeds the debug process

• The user need not know whether a variable
is an unsigned integer, a real, or a
structure: the debugger uses the wealth of
typing information available in Intel
languages to display program variables in
their respective type formats


ICE™-376 SPECIFICATIONS AND REQUIREMENTS


HOST SYSTEM REQUIREMENTS


The user supplied host system can be either an IBM® PC AT® or Personal System/2® Model 60.
Host system requirements to run the emulator include the following:


• DOS version 4.0, or Hewlett-Packard HP9000 UNIX
• 640 Kbytes of RAM in conventional memory
• An Above™ board with 1 megabyte of RAM
configured in expanded memory mode,
EMM.SYS software version 3.2
• A 20 MB hard disk
• A serial port or the National Instruments
GPIB-PCII™, GPIB-PCIIA™, or MC-GPIB™ board
• A math coprocessor if either the optional
time tag board is used or if a math
coprocessor resides on the target system


ELECTRICAL
CHARACTERISTICS

100-120V or 220-240V selectable
50-60 Hz
2 amps (AC max) @ 120V
1 amp (AC max) @ 240V


ENVIRONMENTAL
CHARACTERISTICS

Operating Temperature: +10°C to +40°C
(50°F to 104°F)
Operating Humidity: Maximum of 85%
relative humidity, non-condensing


The Emulator’s Physical Characteristics


Base Unit
Processor Module
Optional Isolation Board
Power Supply
User Cable
100-Pin Target-Adapter Cable
88-Pin Target-Adapter Cable
Serial Cable
Optional Clips Pod
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The Processor Module and Bus Isolation Board Dimensions (88 Pin PGA) 
ELECTRICAL SPECIFICATIONS

The synchronization input lines are standard TTL inputs. The
synchronization output lines are driven by TTL open collector
outputs that have 4.7K-ohm pull-up resistors.

The synchronization input and output signals
on the optional clips pod are standard TTL
inputs and outputs.


AC Specifications With the Bus Isolation Board Installed


Parameter                               Minimum             Maximum             Notes

CLK2 period                             50 nS
CLK2 high time                          t2a Min + 2 nS
CLK2 low time                           t3b Min + 2 nS
A1-A23 valid delay                      t6 Min + 3.5 nS     t6 Max + 24.6 nS    CL = 120 pF
A1-A23 float delay                      t14 Min + 5.5 nS    t14 Max + 37.6 nS
BLE#, BHE#, LOCK# valid delay           t8 Min + 3.5 nS     t8 Max + 24.6 nS
BLE#, BHE#, LOCK# float delay           t14 Min + 5.5 nS    t14 Max + 37.6 nS
W/R#, M/IO#, D/C#, ADS# valid delay     t10 Min + 3.5 nS    t10 Max + 24.6 nS
W/R#, M/IO#, D/C#, ADS# float delay     t14 Min + 5.5 nS    t14 Max + 37.6 nS
D0-D15 write data valid delay           t12 Min + 4.5 nS    t12 Max + 20.6 nS
D0-D15 write data float delay           7.5 nS              456 nS
HLDA valid delay                        t14 Min + 3 nS      t14 Max + 21.2 nS
NA# hold time                           t16 Min + 10.6 nS
READY# hold time                        t20 Min + 10.6 nS
D0-D15 read setup time                  t21 Min + 8.5 nS
D0-D15 read hold time                   t22 Min + 7.6 nS
HOLD hold time                          t24 Min + 10.6 nS
RESET setup time                        t25 Min + 2.1 nS
RESET hold time                         t26 Min + 2.1 nS
NMI, INTR hold time                     t28 Min + 10.6 nS
PEREQ, ERROR#, BUSY# hold time          t30 Min + 10.6 nS


The synchronization input lines must be valid
for at least four CLK2 cycles, as they are
sampled only on every other cycle.
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Emulator Capacitance Specifications 
With Target-Adapter Cable Installed 


Description                                  Typical (Note 1)


Input Capacitance
  CLK2
  READY#, ERROR#
  HOLD, BUSY#, PEREQ, NA#, INTR, NMI
  RESET
Output or I/O Capacitance
  D15-D0
  A15-A1, BLE#
  A23-A16, BHE#, D/C#
  HLDA, W/R#
  ADS#, M/IO#, LOCK#                         35 pF


Note 1: Not tested. These specifications include the 80376 component and 
all additional emulator loading. 


Emulator DC Specifications 
Without the Bus Isolation Board Installed 


Item      Description                                   Max


PM-Icc    Processor Module Supply Current               376-Icc + 940 mA

          Input High Leakage Current
          A23-A1, BLE#, BHE#, D/C#, HLDA                0.02 mA
          D15-D0                                        0.06 mA
          ADS#, M/IO#, LOCK#, READY#, ERROR#            0.01 mA
          W/R#                                          0.03 mA
          CLK2                                          0.04 mA
          RESET                                         0.06 mA

          Input Low Leakage Current
          A23-A1, BLE#, BHE#, D/C#                      0.6 mA
          D15-D0                                        0.06 mA
          ADS#, M/IO#, LOCK#, READY#, ERROR#            0.01 mA
          W/R#                                          0.51 mA
          CLK2                                          0.62 mA
          RESET                                         0.6 mA
          HLDA                                          0.02 mA


Note 1: This specification is the DC input loading of the emulator circuitry only and does not 
include any 80376 leakage current. 
Note 2: This specification replaces the 80376 specification for this signal. 
Emulator DC Specifications With the Bus Isolation Board Installed


Item Description                                        Max.


Bus Isolation Board Supply Current

Output Low Voltage (IOL = 48 mA)
  A23-A1, BLE#, BHE#, D/C#, ADS#
  D15-D0, M/IO#, LOCK#, W/R#
  HLDA (IOL = 24 mA)

Output High Voltage (IOH = 3 mA)
  A23-A1, BLE#, BHE#, D/C#, ADS#
  D15-D0, M/IO#, LOCK#, W/R#
  HLDA (IOH = 24 mA)

Input High Current
  CLK2, RESET
  READY#

Input Low Current
  CLK2, RESET
  READY#

Output Leakage Current

PROCESSOR MODULE
INTERFACE CONSIDERATIONS


With the processor module directly attached to
the target system without using the bus
isolation board, the target system must meet
the following requirements.

• The user bus controller must only drive the
data bus during a valid read cycle of the
emulator processor or while the emulator
processor is in a hold state (the emulator
processor uses the data bus to communicate
with the emulator hardware).

• Before driving the address bus, the user
system must gain control by asserting HOLD
and receiving HLDA.

• The user reset signal is disabled during the
interrogation mode. It is enabled in
emulation, but is delayed by 2 or 4 CLK2
cycles.

• The user system must be able to drive one
additional TTL load on all signals that go to
the emulation processor.


A23-A1, BLE#, BHE#, D/C#, ADS# 
D15-D0, M/IO#, LOCK #, W/R# 


When the target system does not satisfy the 
first two restrictions, the bus isolation board is 
used to isolate the emulation processor from 
the target system. With the isolation board 
installed, the processor CLK2 is restricted to 
running at 20 MHz. 


The processor module derives its DC power 
from the target system through the 80376 
socket. It requires 1400 mA, including the
80376 current. The isolation board requires an
additional 350 mA.
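
The timing and power figures quoted in this section can be combined into a quick budget check. The Python sketch below is only a worked calculation using numbers stated in the text (50 nS minimum CLK2 period with the isolation board, four CLK2 cycles for synchronization inputs, 1400 mA and 350 mA supply currents); the function and constant names are hypothetical.

```python
CLK2_MIN_PERIOD_NS = 50.0    # AC specification with the isolation board
SYNC_MIN_CLK2_CYCLES = 4     # synchronization inputs: at least four cycles
PROCESSOR_MODULE_MA = 1400   # includes the 80376 current
ISOLATION_BOARD_MA = 350     # additional draw of the isolation board


def max_clk2_mhz(period_ns: float) -> float:
    """Highest CLK2 frequency allowed by a minimum period in nanoseconds."""
    return 1000.0 / period_ns


def min_sync_pulse_ns(period_ns: float) -> float:
    """Shortest synchronization input pulse that is certain to be sampled."""
    return SYNC_MIN_CLK2_CYCLES * period_ns


def target_supply_ma(with_isolation_board: bool) -> int:
    """Current the target must supply through the 80376 socket."""
    extra = ISOLATION_BOARD_MA if with_isolation_board else 0
    return PROCESSOR_MODULE_MA + extra


if __name__ == "__main__":
    print(max_clk2_mhz(CLK2_MIN_PERIOD_NS), "MHz CLK2")
    print(min_sync_pulse_ns(CLK2_MIN_PERIOD_NS), "nS sync pulse")
    print(target_supply_ma(True), "mA with isolation board")
```

A 50 nS period corresponds to the 20 MHz CLK2 limit quoted for the isolation board, and the target supply must deliver 1750 mA when the board is fitted.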


The processor must be socketed, for example 
using Textool 2-0100-07243-000 or AMP 
821949-4 sockets. 


The printed circuit board design should locate 
the processor socket at the physical ends of the 
printed circuit board traces that connect the 
processor to the other logic of the target 
system. This reduces transmission line noise. 
Additionally, if the target system is enclosed in 
a box, pin one of the processor socket should be 
oriented away from the target system’s box
opening to make connecting the target-adapter 
cable easier. 
ORDERING INFORMATION


SERVICE, SUPPORT, AND 
TRAINING 


To augment its development tools, Intel offers 
a full array of seminars, classes, and 
workshops, field application engineering 
expertise, hotline technical support, and on- 
site service. 


Intel also offers a Gold Software Support
package which includes:

• Technical software information phone
support
• Automatic distribution of software and
documentation updates
• Access to the “ToolTalk” electronic
bulletin board


Intel’s Hardware Support package includes:

• Technical hardware information phone
support
• Warranty on parts/labor/material
• On-site hardware support
• One Customer Training Course of choice,
plus discounts on additional customer
training and SE consulting


SOFTWARE DEVELOPMENT
TOOL


Order Code      Description

D86ASM386NL     DOS Macro Assembler supports 80386/80376
D86C386NL       DOS C Compiler supports 80386/80376
D86PLM386NL     DOS PL/M Compiler supports 80386/80376
D86FOR386NL     DOS FORTRAN Compiler supports 80386/80376
D86RLL386NL     DOS S/W DEV Package Builder/Binder/Mapper/
                Librarian. Supports 80386/80376
DASM386PLUS     DOS ASM Developers Kit, Inc. ASM, NUM, RLL and
                EMUL
DB386           DOS S/W Debugger for 80386
D86NUM387NL     DOS 80387 Numerics Libraries
EMUL387SU       387 Numerics Coprocessor S/W Emulator object code.
                EMUL-387 to form derivative works. Requires
                incorporation fee
EMUL387RO       Requires prior purchase of EMUL-387RF
EMUL387RF       EMUL-387 one time incorp fee


IN-CIRCUIT EMULATOR


Order Code          Description

pICE376D            ICE376 In-circuit Emulator for 80376 component.
                    Operates to 16 MHz. Includes control unit, power
                    supply, 376 Processor Module with PQFP adaptor,
                    Stand-Alone Self-Test board, bus Isolation Board,
                    and DOS 3.x host software and interface cable.

ICE37620D           ICE376 In-circuit Emulator for 80376 component.
                    Operates to 20 MHz. Includes control unit, power
                    supply, 376 Processor Module with PQFP adaptor,
                    Stand-Alone Self-Test board, bus Isolation Board,
                    and DOS 3.x host software and interface cable.

REM386SX376         2 Mbytes relocatable expansion memory

pICE386TO376D       Conversion kit to adapt ICE386 25 MHz emulator
                    to support the 80376 component. Operates to
                    16 MHz. Includes ICE376 emulator Processor Module
                    and DOS 3.x host software.

pICE386SXTO376D     Conversion kit to adapt ICE386SX 16 or 20 MHz
                    emulator to support the 80376 component. Operates
                    to 16 MHz. Includes ICE376 emulator Processor
                    Module and DOS 3.x host software.

p88PGAADAPT         Adaptor for ICE376 emulator to support 88 pin
                    PGA component packaging.

pICE38XXCPO         Clips Pod Option for ICE376, ICE386SX 16 or
                    20 MHz, ICE386 25 MHz, and ICE386DX 33 MHz
                    emulators.

pICE3XXTTB          Time Tag Board Option for ICE376, ICE386SX 16
                    or 20 MHz, ICE386 25 MHz, and ICE386DX 33 MHz
                    emulators.

DTOAB               2 MB Intel Above Board.
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i486™ MICROPROCESSOR IN-CIRCUIT DEBUGGER 


280872-1 


Intel’s ICD-486/25, the in-circuit debugger for the 25 MHz i486™ microprocessor,
represents a new generation of in-circuit emulation technology. From the inventor of the
microprocessor comes a development tool that delivers complete access to the i486
architecture. ICD-486/25 is the first development tool which allows users to debug high-
speed, cached applications at the full speed of the target processor. ICD-486/25 embodies
exclusive technology, giving users symbolic access to the internal processor states that
would not be accessible in any other way. With Intel’s exclusive technology, users can be
assured that the ICD-486/25 provides complete accuracy when debugging cached
applications in real-time.


FEATURES 


• Real-time emulation at the full speed of
the i486 microprocessor

• Full development and debug support for
the i486 microprocessor on-chip caching
and numerics

• Programming support for the i486
microprocessor real mode and native
protected modes

• Non-intrusive operation, allowing the
target system to be debugged without
modification

• Ability to set up to sixteen software
breakpoints and four hardware
breakpoints on execution addresses, data
writes, or data accesses

• Sync in and out lines for connecting an
ICD-486/25 to a high-speed logic
analyzer to provide trace information
and bus breakpoints

• Provides full symbolic information to
display and modify all registers of the
i486 microprocessor


November 1990 


8-23 Order Number: 280872-002 
FEATURES


FULL-SPEED DEBUG AND
DEVELOPMENT


The ICD-486/25 In-circuit Debugger provides
sophisticated real-time hardware and software
debug capabilities for i486 microprocessor
based designs. The user can run at the full
speed of the target processor, ensuring that
elusive timing bugs will be found. And, because
the ICD-486/25 is non-intrusive, your target
system being developed can be the same as
your final target system.


DEBUG CACHED 
APPLICATIONS 


Until now, it has been extremely difficult to
accurately debug high-speed, cached
microprocessor applications. However, by
incorporating Intel’s exclusive technology, the
ICD-486/25 allows users to debug applications
which use the on-chip caching features of the
i486 microprocessor. ICD-486/25 provides
complete debugging accuracy whether the
cache is on or off.


IDEAL FOR ALL STAGES OF
DEVELOPMENT


The ICD-486/25 can be used by both hardware
and software developers, at any stage of design.
Early in the development process the ICD-486/
25 allows prototype development and software
debugging when using the optional REM
board. Later in the design cycle, the ICD-486/
25 can be used to integrate hardware and
software modules.


SPEEDING DEVELOPMENT
WITH SYMBOLICS


With symbolic debugging, memory locations 
can be examined or modified using symbolic 
references to the original program, such as a 
procedure or variable name, line number, or 
program label. Microprocessor data structures, 
such as registers, descriptor tables, and page 
tables, can also be examined and modified 
using symbolic names rather than 
cumbersome linear or physical addresses. 
Optimal symbolic debugging can be achieved 
when using the ICD-486/25 with Intel 
languages. 


THE COMPLETE STORY 


For advanced hardware debugging, the ICD-
486/25 has been designed to work with high-
speed logic analyzers. The standard ICD-486/
25 ships with a Logic Analyzer Interface (LAI)
board providing access to all chip signals which
may be used to trigger a logic analyzer. With a
user-supplied interface, the ICD-486/25 and
logic analyzer can work in combination to
monitor and recognize bus activity.


SOFTWARE COMPLETES THE
SYSTEM


Intel provides a comprehensive software
development environment to complement the
ICD-486/25, delivering the most complete
32-bit microprocessor development
environment available from a single vendor.
Intel’s i486 software development tools for the
386™ and i486 microprocessor families
include 32-bit ANSI C, FORTRAN 77, and
PL/M compilers, as well as 32-bit assembly
language, linkage, IEEE math, run-time
libraries and system software builders with
full access to all aspects of the i486
microprocessor. In addition, all translators are
object code compatible. Architectural
extensions in the high-level languages allow
hardware features such as interrupts, input/
output or flags to be controlled directly,
avoiding the maintenance of assembly
routines.


Intel’s software environment includes the
sophisticated source-level DOS DB-386
software debugger and execution environment,
allowing i486 software applications to be tested
and debugged directly on a standard 386
microprocessor-based PC.


To provide full access to the power of the i486
architecture, the software portfolio
incorporates a unique, sophisticated, and very
powerful system builder, simplifying the
generation of protected mode systems. To
further reduce the task of integrating software
into the final target configuration, Intel i486
microprocessor development tools produce
code which can be directly downloaded to target
system ROM or converted into standard hex
code.
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| FEATURES | 


THE RIGHT TOOL FOR THE JOB 


The ICD-486/25, representing a new 
generation of in-circuit emulation technology, 
is the right tool to use when your product 
development schedules are tight and your 
product quality requirements are high. Intel’s 
exclusive technology allows you to debug 
cached applications at the full speed of the i486 
microprocessor, and the symbolic debug 
information can vastly improve your 
productivity. 


THE TOOL FOR THE FUTURE 


The ICD-486/25 was designed to be easily and
rapidly convertible to support the newest
speeds of the i486 microprocessor. You can be
assured that your investment in the ICD-486/ 
25 today will put you squarely on the upgrade 
path to higher speed components when they 
are made available. 


WORLDWIDE, WORLD CLASS 
SERVICES 


Augmenting Intel i486 microprocessor 
development tools is a full array of seminars, 
classes, and workshops; on-site consulting 
services; field application engineering 
expertise; telephone hotline support; and 
software and hardware maintenance contracts. 




HOST SYSTEM REQUIREMENTS 


The user supplied host system can be an IBM®
PC/AT® or Personal System/2® Model 60, or
Model 80 or fully compatible system. Host
system requirements to run the in-circuit
debugger include the following:

• DOS version 3.2 or later

• 640K bytes of RAM in conventional memory

• An Above™ board with 1 megabyte of RAM
configured in expanded memory mode,
EMM.SYS version 3.2, or later, or

• One megabyte of RAM configured as
expanded memory using QEMM.SYS or
386MAX

• A hard disk with 2 Mbytes of free space

• A serial port


Figure 3: ICD with Logic Analyzer Interface
(LAI) board installed


ELECTRICAL 
CHARACTERISTICS 


100-120V or 220-240V selectable 
50-60 Hz 

2 amps (AC max) @ 120V 

1 amp (AC max) @ 240V 


ENVIRONMENTAL 
CHARACTERISTICS 


Operating Temperature: +10°C to +40°C
(50°F to 104°F)
Operating Humidity: Maximum of 85%
relative humidity,
non-condensing


ELECTRICAL SPECIFICATIONS 


The synchronization input line must be valid 
for at least two CLK cycles. The 
synchronization input and output signals are 
standard TTL input and outputs. 


Figure 2: ICB with Optional Isolation Board
(OIB) installed
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ICD-486/25 SPECIFICATIONS AND REQUIREMENTS 


ICD-486/25 INTERFACE 
CONSIDERATIONS 


With the ICB directly attached to the target
system without using the optional isolation
board, the target system must meet the
following requirements:

• The bus controller must only enable data
transceivers onto the bus during valid read
cycles of the 486 CPU.

• READY# cannot be used with BREQ to
terminate outstanding bus requests (i.e.,
when using the ICD-486/25, BREQ will be
asserted when there is not a corresponding
assertion of ADS#).

• Before another bus master drives the local
processor address bus, the other bus master
must gain access to the address bus through
the use of HOLD, HLDA, AHOLD or BOFF#.

• The user system must be able to drive one
additional CMOS load (approximately 25pF)
on all signals that go to the emulation
processor.


If the target system does not satisfy these
restrictions, the optional isolation board
should be used to isolate the emulation
processor from the target system. To guarantee
proper operation with the optional isolation
board, the clock period should be increased by
the round trip buffer delay (10ns) unless the
target system design already has enough
timing margin.


Parameter                                             Minimum          Maximum

A2-A31 valid delay
A2-A31 valid delay
BLAST# valid delay
D0-D31 write data valid delay
A4-A31, D0-D31 input set-up time
BE0-3#, M/IO#, W/R#, ADS#, HLDA valid delay           t6 Min + 2.5ns   t6 Max + 8ns
BE0-3#, M/IO#, W/R#, ADS# valid delay
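
The effect of stretching the clock period by the round trip buffer delay can be worked through numerically. The Python sketch below is an illustrative calculation only: it takes the 10 ns delay from the text and assumes a 25 MHz nominal clock (the rated speed of the ICD-486/25); the function name is hypothetical.

```python
ROUND_TRIP_BUFFER_DELAY_NS = 10.0  # from the isolation board guidance


def derated_clock_mhz(nominal_mhz: float,
                      buffer_delay_ns: float = ROUND_TRIP_BUFFER_DELAY_NS) -> float:
    """Clock frequency after adding the buffer delay to the clock period."""
    nominal_period_ns = 1000.0 / nominal_mhz
    return 1000.0 / (nominal_period_ns + buffer_delay_ns)


if __name__ == "__main__":
    # 25 MHz has a 40 ns period; adding 10 ns gives 50 ns.
    print(derated_clock_mhz(25.0), "MHz")
```

Under these assumptions a 25 MHz target would run at 20 MHz with the isolation board fitted, unless the design already has at least 10 ns of timing margin.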


The processor module derives its DC power 
from the target system through the i486 CPU 
socket. It requires 1300mA, including the i486 
microprocessor current. The optional isolation 
board requires an additional 500mA. 


The processor must be socketed. The printed 
circuit board design should locate the processor 
socket at the physical ends of the printed 
circuit board traces that connect the processor 
to the other logic of the target system. This 
reduces transmission line noise. Additionally, 
if the target system is enclosed in a box, pin 
one of the processor socket should be oriented 
to make connecting the hinge cable easier. The 
ICD-486/25 hinge cable adds an additional 
10pF of capacitive loading and approximately 
0.5ns of propagation delay to each i486 CPU 
signal. 


Pins specified as N.C. in the i486 CPU pin 
description must be left unconnected. 
Connection of any of these pins to power, 
ground or any other signal may cause the 
processor or the ICD-486/25 debugger to 
malfunction. 


t6 Min+ 1.5ns | t6 Max + 5ns 


t22 Min + 5ns 


Note 1: Use these specifications for any bus cycle that begins on the same clock that HLDA is de-asserted. 
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ICD-486/25 SPECIFICATIONS AND REQUIREMENTS 


DC SPECIFICATIONS WITH OPTIONAL ISOLATION BOARD INSTALLED


Description

Output High Voltage
  A2-A31, D0-D31                            (IOH = 15 mA)
  BE0-3#, M/IO#                             (IOH = 3 mA)
  W/R#, ADS#, BLAST#                        (IOH = 3 mA)
  HLDA                                      (IOH = 3.2 mA)

Output Low Voltage
  A2-A31, D0-D31                            (IOL = 64 mA)
  BE0-3#, M/IO#                             (IOL = 64 mA)
  W/R#, ADS#, BLAST#                        (IOL = 64 mA)
  HLDA                                      (IOL = 24 mA)

Input Leakage Current
  A2-A31, D0-D31

Input High Current
  CLK, RESET, RDY#, BRDY#, BOFF#, AHOLD

Input Low Current
  CLK, RESET, RDY#, BRDY#, BOFF#, AHOLD


Note 1: These specifications are for the OIB only and do not include any processor module loading.


ORDERING INFORMATION


Order Code        Description

ICD48625D         In-circuit debugger for the i486 microprocessor.
                  Operates to 25 MHz. Includes hardware debug
                  module, power supply, isolation board, stand alone
                  self-test chassis, flexible hinge cable, socket
                  accessory assortment, user documentation, DOS host
                  software and interface cable.

ICD486CON33D      Identical to the ICD48625D, but includes a prepaid
                  upgrade to 33 MHz i486 microprocessor support
                  when available.

ICD486CON33DS     Identical to the ICD486CON33D, but includes an
                  additional 12 months of hardware and software
                  maintenance and support.

ICD486LAI         Additional Logic Analyzer Interface board.

ICD48625DS        Identical to ICD48625D, but includes an additional
                  12 months of hardware and software maintenance
                  and support.

486HNGCBLA        Additional flexible hinge cable assembly.

ICD486ACC         Additional ICD-486 socket accessory assortment and
                  board separator.
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ACCURATE AND SOPHISTICATED EMULATION FOR THE 
INTEL386™ FAMILY OF MICROPROCESSORS 


Intel386™ In-circuit Emulators are the cornerstone of the optimum development
solution for the Intel386 Family of microprocessors. From the inventor of the
microprocessor comes a development tool that delivers absolute access to the
sophistication of the architecture in a way that only Intel can.


Productivity boosting features such as symbolic debugging make Intel386 emulators easy
to use and powerful. Intel product quality and world class technical support and service
minimize the “downtime” incurred in resolving problems. And your investment in
development tools is protected via interchangeable probes for the 80386 DX, 80386 SX,
and 80376 microprocessors.


Maximize your productivity with Intel development tools. Reduced time to market and 
increased market acceptance for your microprocessor-based product are the benefits 
when Intel is the choice. 


*HP9000 is a trademark of Hewlett Packard. 
ICE, iPAT, Above Board, Intel386, 386 DX, 386 SX, and 376 are trademarks of Intel Corporation. 
IBM, PC AT, PS/2 are registered trademarks of International Business Machines Corporation. 
GPIB-PCII, GPIB-PCIIA, and MC-GPIB are trademarks of National Instruments Corporation. 


November 1990 
8-29 Order Number: 280850-004


INTEL ICE™-386 FAMILY IN-CIRCUIT EMULATOR FEATURES


• Unparalleled support of all of the Intel386
operating modes opens the door to the full
potential of the Intel386 architecture

• Non-intrusive (zero wait-state) emulation to
processor speeds of 33 MHz

• Versatile event recognition makes short
work of uncovering complex bugs

• Dynamic trace display of bus and execution
information during emulation

• A comprehensive software development
system creates the most complete
development environment available from a
single vendor

• A companion performance analysis tool
provides analysis of software for optimized
performance and reliability

• Available on Hewlett-Packard HP9000
UNIX*
FEATURES


100% ACCURATE EMULATION


Intel386 Family In-circuit Emulators embody
technology that accesses internal processor
states that are otherwise invisible. Intel386
microprocessors fetch and execute instructions
in parallel; fetched instructions are not
necessarily executed in order of input. Because
of this, an emulator without this capability is
prone to error in determining what actually
occurs inside the microprocessor. With Intel’s
technology, an Intel386 In-circuit Emulator
displays execution history with one hundred
percent accuracy and in real-time.


OPENING THE DOOR TO
PROTECTED MODE


The Intel386 family of In-circuit Emulators 
opens the door to the full potential of the 
architecture with unparalleled support of 
protected mode. Not only does the emulator 
display and modify task state segments and 
global, local, and interrupt descriptor tables 
(with symbolic access to all descriptor 
components like privilege level and segment 
type), but emulator functions are sensitive to 
the operating mode of the processor, greatly 
improving ease of use. 


The Intel386 family of In-circuit Emulators 
supports all aspects of protected mode 
addressing, including paged virtual memory. 
Processor tables are used to automatically 
translate virtual addresses to linear and 
physical addresses. Physical addresses can be 
translated to symbolic references to indicate 
the module, procedure, or data segment 
accessed. And when debugging a memory 
management system, components of the page 
table and directory can be displayed and 
modified. 
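
The translation chain described above (virtual to linear to physical) can be sketched for the two-level 4 Kbyte paging scheme of the Intel386 architecture. This Python sketch is not the emulator's implementation; `read_dword`, `translate`, and the sample memory contents are hypothetical, and only the present bit of each entry is checked.

```python
PRESENT = 0x1            # bit 0 of a page directory or page table entry
FRAME_MASK = 0xFFFFF000  # bits 31..12 hold a 4 Kbyte-aligned base address


def virtual_to_linear(segment_base: int, offset: int) -> int:
    """Add the descriptor's segment base to the offset (32-bit wraparound)."""
    return (segment_base + offset) & 0xFFFFFFFF


def split_linear(linear: int):
    """Split a linear address into directory index, table index, page offset."""
    return (linear >> 22) & 0x3FF, (linear >> 12) & 0x3FF, linear & 0xFFF


def translate(linear: int, cr3: int, read_dword) -> int:
    """Walk the page directory and page table to get a physical address."""
    directory, table, offset = split_linear(linear)
    pde = read_dword((cr3 & FRAME_MASK) + 4 * directory)
    if not pde & PRESENT:
        raise LookupError("page directory entry not present")
    pte = read_dword((pde & FRAME_MASK) + 4 * table)
    if not pte & PRESENT:
        raise LookupError("page table entry not present")
    return (pte & FRAME_MASK) | offset


if __name__ == "__main__":
    # Hypothetical memory image: directory at 0x1000, one table at 0x2000.
    memory = {0x1004: 0x00002001, 0x200C: 0x00ABC001}
    linear = virtual_to_linear(0x00400000, 0x3025)
    print(hex(translate(linear, 0x1000, lambda a: memory.get(a, 0))))
```

Displaying the intermediate directory and table entries, as the emulator does, amounts to exposing `pde` and `pte` from this walk.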


FLEXIBLE AND VERSATILE 
EVENT RECOGNITION 


Flexibility and versatility in event recognition
make short work of uncovering the most
complex bugs. Bus event recognition circuitry
may be used to trigger on specific or masked
data input, output, read, written, or fetched at
a physical address or range of addresses. Or on- 
chip debug registers may be used to trigger on 
virtual, linear, or symbolic addresses being 
executed, accessed, or written. 
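
Matching "specific or masked data" within an address range reduces to a range check plus a bitwise mask comparison. The Python sketch below illustrates that idea only; the `BusEvent` class and its fields are hypothetical and do not reflect the emulator's command syntax.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusEvent:
    """A bus trigger: an address range plus an optional masked data value."""
    addr_lo: int
    addr_hi: int
    value: int = 0
    mask: int = 0  # a zero mask means the data value is "don't care"

    def matches(self, address: int, data: int) -> bool:
        in_range = self.addr_lo <= address <= self.addr_hi
        # Only the bits selected by the mask take part in the comparison.
        return in_range and (data & self.mask) == (self.value & self.mask)


if __name__ == "__main__":
    event = BusEvent(0x1000, 0x1FFF, value=0x00A5, mask=0x00FF)
    print(event.matches(0x1234, 0xFFA5))
```

Here the upper data byte is ignored, so any word whose low byte is A5 hex triggers when it appears in the 1000 to 1FFF hex address range.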


Versatility shows in other triggering options— 
upon a task switch, an external signal from 
another emulator or a logic analyzer, multiple 
occurrences of an event, a full trace buffer, 
halt or shutdown cycles, or interrupt 
acknowledge. And up to four sequential event
triggers can be combined with a high-level
construct.
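
Combining sequential event triggers can be modeled as a small state machine that advances one stage each time the next expected event is seen. The Python sketch below is an illustration under that assumption; the `SequentialTrigger` class and the event strings are hypothetical, and the limit of four stages is taken from the text.

```python
class SequentialTrigger:
    """Fires once its event predicates have matched in the given order."""

    MAX_STAGES = 4  # "up to four sequential event triggers"

    def __init__(self, *predicates):
        if not 1 <= len(predicates) <= self.MAX_STAGES:
            raise ValueError("between one and four stages are supported")
        self._predicates = predicates
        self._stage = 0

    @property
    def fired(self) -> bool:
        return self._stage == len(self._predicates)

    def observe(self, event) -> bool:
        """Feed one event; return True once the whole sequence has occurred."""
        if not self.fired and self._predicates[self._stage](event):
            self._stage += 1
        return self.fired


if __name__ == "__main__":
    trigger = SequentialTrigger(
        lambda e: e == "task_switch",
        lambda e: e == "write:0x1000",
    )
    for event in ["write:0x1000", "task_switch", "halt", "write:0x1000"]:
        print(event, trigger.observe(event))
```

The first write is ignored because the task switch has not yet been seen; only the write that follows the task switch completes the sequence.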


The Intel386 family of In-circuit Emulators 
continuously captures all bus activity and, as 
an option, execution information, into a trace 
buffer of 4 K frames with PRE, POST, and 
CENTERED collection modes. The contents of 
the trace buffer can be displayed during full 
speed emulation in either execution cycle or
machine-level instruction formats. Symbolic 
information can optionally be included in the 
trace display. A third trace display, the current 
chain of procedure calls, can be displayed when 
emulating high-level language programs. 
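
One way to picture the PRE, POST, and CENTERED collection modes is as a ring buffer that keeps recording for a mode-dependent number of frames after the trigger. The sketch below rests on that assumption; the exact frame counts used by the emulator hardware are not stated here, and the `capture` function is purely hypothetical.

```python
from collections import deque

TRACE_DEPTH = 4096  # 4 K frames, as quoted above


def capture(frames, is_trigger, mode, depth=TRACE_DEPTH):
    """Return the buffer contents once collection stops.

    PRE keeps frames leading up to the trigger, POST keeps frames
    following it, and CENTERED places the trigger near the middle.
    """
    frames_after_trigger = {
        "PRE": 0,
        "CENTERED": depth // 2,
        "POST": depth - 1,
    }[mode]
    buffer = deque(maxlen=depth)  # oldest frames fall off automatically
    remaining = None              # None until the trigger is seen
    for frame in frames:
        buffer.append(frame)
        if remaining is None:
            if is_trigger(frame):
                remaining = frames_after_trigger
        else:
            remaining -= 1
        if remaining == 0:
            break
    return list(buffer)


# Small demonstration with an 8-frame buffer and a trigger at frame 50.
for mode in ("PRE", "CENTERED", "POST"):
    print(mode, capture(range(100), lambda f: f == 50, mode, depth=8))
```

The choice of mode therefore decides whether the trace shows what led up to an event, what followed it, or both.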


SPEEDING DEVELOPMENT
WITH SYMBOLICS


Intel386 processor data structures, such as
registers, descriptor tables, and page tables, 
can be examined and modified using symbolic 
names. And with the symbolic debugging 
information that is a feature of Intel 
languages, memory locations can be accessed 
using symbolic references to the source 
program (such as procedure and variable
names, line numbers, or program labels) 
rather than via cumbersome virtual, linear, or 
physical addresses. The type information of 
variables (such as byte, word, record, or array) 
can also be displayed. 


ACCESSING THE POWER 


The power of the Intel386 In-circuit Emulator 
is reflected in the sophisticated user interface. 
Refined for ease-of-use, the command line 
interface contains many features to boost 
productivity and customize functionality. 


On-line help, a syntax menu, command line 
editing, command history, and error message 
query promote ease of learning and use. I/O 
redirection and the ability to escape to the host 
operating system provide versatility for the 
power user. Customized procedures with 
variables and literal definitions can be created 
to assist in debugging or for manufacturing 
test or field service applications. 
FEATURES


ADDITIONAL FEATURES 


The Intel386 In-circuit Emulator can be
combined with a variety of devices. I/O lines
synchronize emulation starts and triggers with
external tools such as a logic analyzer or
another emulator. An optional Time Tag
Board synchronizes multiple Intel386
emulators and records timestamp information
in the trace buffer with 20 nanosecond
resolution. An optional Clips Pod allows 8
user-defined data lines to be captured and displayed
in the trace. The bus isolation board buffers
the emulation processor from faults in an
untested target. And with the Stand-Alone
Self-Test board the emulator can be used to
debug software before the target system is
functional, as well as execute confidence tests.


THE INVESTMENT PICTURE


As designs move from one Intel386 Family 
processor to another, the reinvestment cost is 
limited to probes that adapt the emulator base 
to the specific processor. Besides cost savings,
migration from one processor to another is
accomplished with minimum disruption in the 
engineering environment, as the same 
command language applies to the entire 
emulator family. 


SOFTWARE COMPLETES THE 
SYSTEM 


Intel wraps a comprehensive software 
development system around the emulator to 
deliver the most complete development 
environment available from a single vendor. 
Like the emulator, Intel’s software 
development system supports every aspect of 
the Intel386 architecture.


Overlooked at times is the fact that a 
significant part of developing a system is 
making sure the code works. Intel languages 
and software debugger integrate seamlessly 
with the Intel386 emulator and provide the 
symbolics so important for efficient debugging. 
By using Intel software tools with the Intel386
emulator, the full power of the Intel development
solution can be utilized.


The software development system offers a 
broad choice of languages with object code 
compatibility so performance can be 
maximized by using different languages for 
specialized, performance-critical modules.
Architectural extensions in the high-level
languages allow hardware features such as
interrupts, input/output, or flags to be 
controlled directly, avoiding the tediousness of 
coding assembly language routines. 


Intel’s software portfolio includes a unique,
sophisticated, and very powerful system
builder, simplifying the generation of
protected mode systems. To further reduce the
effort necessary to integrate software into the
final target configuration, Intel tools produce
ROM-able code directly from the development
system.


OPTIMIZING PERFORMANCE
AND RELIABILITY


A companion performance analysis tool,
iPAT™-386, provides analysis of real-time
software executing on 80386-based target
systems. With iPAT-386, it is possible to
speed-tune applications, optimize use of operating
systems, determine response characteristics,
and identify code execution coverage. And
iPAT-386 can be used in conjunction with an
Intel386 in-circuit emulator to control test
conditions.


WORLD CLASS, WORLDWIDE
SERVICES


Augmenting the Intel386 Family development
tools is a full array of seminars, classes and
workshops; on-site consulting services; field
application engineering expertise; telephone
hotline support; and software and hardware
maintenance contracts.


ICE™-386 DX 33 MHz SPECIFICATIONS AND
REQUIREMENTS


HOST SYSTEM REQUIREMENTS


The user supplied host system can be either an IBM® PC AT® or Personal System/2® Model 60.
Host system requirements to run the emulator include the following:


• DOS version 4.0 or Hewlett Packard HP9000
  UNIX

• 640 Kbytes of RAM in conventional memory

• An Above™ board with 1 megabyte of RAM
  configured in expanded memory mode,
  EMM.SYS software version 3.2

• A 20 MB hard disk

• A serial port or the National Instruments
  GPIB-PCII™, GPIB-PCIIA™, or MC-GPIB™
  board

• A math coprocessor if either the optional
  time tag board is used or if a math
  coprocessor resides on the target system


ELECTRICAL
CHARACTERISTICS


100-120V or 220-240V selectable
50-60 Hz
2 amps (AC max) @ 120V
1 amp (AC max) @ 240V


ENVIRONMENTAL
CHARACTERISTICS


Operating temperature: +10°C to +40°C
                       (50°F to 104°F)
Operating humidity:    Maximum of 85%
                       relative humidity,
                       non-condensing


The Emulator’s Physical Characteristics


Base Unit
Processor Module
Optional Isolation Board
Power Supply
User Cable
Target-Adapter Cable
Serial Cable
Optional Clips Pod


280850-2


The Processor Module and Bus Isolation Board Dimensions 




280850-3


ELECTRICAL SPECIFICATIONS


The synchronization input lines must be valid
for at least four CLK2 cycles as they are only
sampled on every other cycle. These input lines
are standard TTL inputs. The synchronization
output lines are driven by TTL open collector
outputs that have 4.7K-ohm pull-up resistors.
The synchronization input and output signals
on the optional clips pod are standard TTL
input and outputs.


AC Specifications With the Bus Isolation Board Installed 


Parameter                              | Minimum           | Maximum            | Notes

CLK2 period                            | 40 nS             |                    |
CLK2 high time                         | t2a Min + 2 nS    |                    | @ 2V
CLK2 low time                          | t3b Min + 2 nS    |                    | @ 0.8V
A2-A31 valid delay                     | t6 Min + 3.5 nS   | t6 Max + 24.6 nS   | CL = 120 pF
A2-A31 float delay                     | t14 Min + 5.5 nS  | t14 Max + 32.6 nS  |
BE0#-BE3#, LOCK# valid delay           | t8 Min + 3.5 nS   | t8 Max + 24.6 nS   | CL = 75 pF
BE0#-BE3#, LOCK# float delay           | t14 Min + 5.5 nS  | t14 Max + 32.6 nS  |
W/R#, M/IO#, D/C#, ADS# valid delay    | t10 Min + 3.5 nS  | t10 Min + 24.6 nS  | CL = 75 pF
W/R#, M/IO#, D/C#, ADS# float delay    | t14 Min + 5.5 nS  | t14 Max + 32.6 nS  |
D0-D31 write data valid delay          | t12 Min + 4.5 nS  | t12 Max + 20.6 nS  | CL = 120 pF
D0-D31 write data float delay          | 7.5 nS            | 41.6 nS            |
HLDA valid delay                       | t14 Min = 3 nS    | t14 Max + 21.2 nS  |
NA# hold time                          | t16 Min + 10.6 nS |                    |
BS16# hold time                        | t18 Min + 10.6 nS |                    |
READY# hold time                       | t20 Min + 10.6 nS |                    |
D0-D31 read setup time                 | t21 Min + 8.5 nS  |                    |
D0-D31 read hold time                  | t22 Min + 7.6 nS  |                    |
HOLD hold time                         | t24 Min + 10.6 nS |                    |
RESET setup time                       | t25 Min + 2.1 nS  |                    |
RESET hold time                        | t26 Min + 2.1 nS  |                    |
NMI, INTR hold time                    | t28 Min + 10.6 nS |                    |
PEREQ, ERROR#, BUSY# hold time         | t30 Min + 10.6 nS |                    |
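
Each entry in the table above is the component timing plus a fixed emulator offset. The short Python sketch below shows how a designer might apply a few of those offsets to component figures taken from the 80386 data sheet. The dictionary holds only a sample of rows, the function name is hypothetical, and the component timings passed in the example are placeholders chosen for illustration rather than data sheet values.

```python
from typing import NamedTuple, Optional


class Offset(NamedTuple):
    symbol: str                  # component timing symbol, e.g. "t6"
    min_add_ns: float            # added to the component minimum
    max_add_ns: Optional[float]  # added to the component maximum, if any


# A sample of rows from the bus isolation board AC table.
ISOLATION_BOARD_OFFSETS = {
    "A2-A31 valid delay": Offset("t6", 3.5, 24.6),
    "D0-D31 write data valid delay": Offset("t12", 4.5, 20.6),
    "READY# hold time": Offset("t20", 10.6, None),
    "D0-D31 read setup time": Offset("t21", 8.5, None),
}


def with_isolation_board(parameter, component_min_ns, component_max_ns=None):
    """Return (min, max) in nS as seen by the target with the board fitted."""
    offset = ISOLATION_BOARD_OFFSETS[parameter]
    new_min = round(component_min_ns + offset.min_add_ns, 1)
    new_max = None
    if offset.max_add_ns is not None and component_max_ns is not None:
        new_max = round(component_max_ns + offset.max_add_ns, 1)
    return new_min, new_max


# Placeholder component timings, for illustration only.
print(with_isolation_board("A2-A31 valid delay", 4.0, 15.0))
print(with_isolation_board("READY# hold time", 4.0))
```

Target logic should then be checked against the derated figures rather than the bare component timings.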


Emulator Capacitance Specifications 


Symbol   Description                    Typical   TAC Installed


Input Capacitance
CLK2
READY#, NMI, BS16#
HOLD, BUSY#, PEREQ,
NA#, INTR, ERROR#
RESET
Output or I/O Capacitance
D0-D31
A2-A31, BE0#-BE3#
D/C#
W/R#
ADS#, M/IO#, LOCK#,
HLDA


Note 1: Not tested. These specifications include the 80386 component and all additional
emulator loading.
Note 2: The target-adapter cable adds a propagation delay of 0.5 nS.


Emulator DC Specifications
Without the Bus Isolation Board Installed


Item    | Description                         | Max            | Notes

PM-Icc  | Processor Module Supply Current     | 386-Icc + 15A  |

        | Input High Leakage Current          |                |
        |   A2-A31, BE0#-BE3#, D0-D31         | 20 µA          |
        |   HLDA, NMI, BS16#                  | 10 µA          |
        |   ADS#, M/IO#, LOCK#, READY#        | 10 µA          |
        |   W/R#, D/C#                        | 80 µA          |
        |   CLK2                              | 15 µA          |
        |   RESET                             | 5 µA           |

        | Input Low Leakage Current           |                |
        |   A2-A31, BE0#-BE3#, D0-D31         | 600 µA         |
        |   HLDA, NMI, BS16#                  | 10 µA          |
        |   ADS#, M/IO#, LOCK#, READY#        | 10 µA          |
        |   W/R#                              | 110 µA         |
        |   D/C#                              | 610 µA         |
        |   CLK2                              | 15 µA          |
        |   RESET                             | 5 µA           |


Note 1: This specification is the DC input loading of the emulator circuitry only and does
not include any 80386 leakage current.
Note 2: This specification replaces the 80386 specification for this signal.


Emulator DC Specifications With the Bus Isolation Board Installed 


Item     | Description                              | Max.

BIB-Icc  | Bus Isolation Board Supply Current       | PM-Icc + 475 mA
VOL      | Output Low Voltage (IOL = 48 mA)         |
         |   A2-A31, BE0#-BE3#, D/C#, ADS#          | 0.5 V
         |   D0-D31, M/IO#, LOCK#, W/R#             | 0.5 V
         |   HLDA (IOL = 24 mA)                     | 0.44 V
VOH      | Output High Voltage (IOH = 3 mA)         |
         |   A2-A31, BE0#-BE3#, D/C#, ADS#          |
         |   D0-D31, M/IO#, LOCK#, W/R#             |
         |   HLDA (IOH = 24 mA)                     |
IIH      | Input High Current                       |
         |   CLK2, RESET                            | 1.0 µA
         |   READY#                                 | 25 µA
IIL      | Input Low Current                        |
         |   CLK2, RESET                            | 1.0 µA
         |   READY#                                 | 250 µA
ILO      | Output Leakage Current                   |
         |   A2-A31, BE0#-BE3#, D/C#, ADS#          | ±20 µA
         |   D0-D31, M/IO#, LOCK#, W/R#             | ±20 µA
PROCESSOR MODULE
INTERFACE CONSIDERATIONS


With the processor module directly attached to
the target system without using the bus
isolation board, the target system must meet
the following requirements:

• The user bus controller must only drive the
  data bus during a valid read cycle of the
  emulator processor or while the emulator
  processor is in a hold state (the emulator
  processor uses the data bus to communicate
  with the emulator hardware).

• Before driving the address bus, the user
  system must gain control by asserting HOLD
  and receiving HLDA.

• The user reset signal is disabled during the
  interrogation mode. It is enabled in
  emulation, but is delayed by 2 or 4 CLK2
  cycles.

• The user system must be able to drive one
  additional TTL load on all signals that go to
  the emulation processor.


When the target system does not satisfy the
first two restrictions, the bus isolation board is
used to isolate the emulation processor from
the target system. With the isolation board
installed, the processor CLK2 is restricted to
running at 25 MHz.


The processor module derives its DC power
from the target system through the 80386
socket. It requires 1500mA, including the
80386 current. The isolation board requires an
additional 475mA.
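
Because this current is drawn through the processor socket, the target supply needs corresponding headroom. The following Python sketch simply adds up the two figures quoted above; the function names and the example supply rating are hypothetical.

```python
PROCESSOR_MODULE_MA = 1500  # includes the 80386 current, per the text above
ISOLATION_BOARD_MA = 475    # additional draw when the board is fitted


def socket_current_ma(use_isolation_board: bool) -> int:
    """Current the target must deliver through the 80386 socket, in mA."""
    extra = ISOLATION_BOARD_MA if use_isolation_board else 0
    return PROCESSOR_MODULE_MA + extra


def supply_is_sufficient(available_ma: int, use_isolation_board: bool) -> bool:
    """True when the supply rating covers the emulator's socket current."""
    return available_ma >= socket_current_ma(use_isolation_board)


print(socket_current_ma(use_isolation_board=True))
print(supply_is_sufficient(1800, use_isolation_board=True))
```

In practice a designer would also leave margin for the rest of the target logic, which this sketch does not model.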


The processor must be socketed. The printed 
circuit board design should locate the processor 
socket at the physical ends of the printed 
circuit board traces that connect the processor 
to the other logic of the target system. This 
reduces transmission line noise. Additionally, 
if the target system is enclosed in a box, pin 
one of the processor socket should be oriented 
to make connecting the processor module or 
target-adapter cable (TAC) easier. 


The emulator uses the 386 microprocessor’s
pins C7, E13, and F13. The 80386 High
Performance 32-Bit Microprocessor With
Integrated Memory Management data sheet
specifies these pins as “N/C” (no connect). If
the target system uses any of these pins, you
must do one of the following:

• Use the bus isolation board.

• Use the target-adapter cable (TAC).

• Build an adapter to disconnect pins C7, E13,
  and F13 (i.e., a socket with these pins
  removed).
ICE™-386 DX 25 MHz SPECIFICATIONS AND
REQUIREMENTS


HOST SYSTEM REQUIREMENTS


The user supplied host system can be either an IBM PC AT or Personal System/2 Model 60. Host
system requirements to run the emulator include the following:


• DOS version 4.0, or Hewlett-Packard HP9000
  UNIX

• 640 Kbytes of RAM in conventional memory

• An Above board with 1 megabyte of RAM
  configured in expanded memory mode,
  EMM.SYS software version 3.2

• A 20 MB hard disk

• A serial port or the National Instruments
  GPIB-PCII, GPIB-PCIIA, or MC-GPIB board

• A math coprocessor if either the optional
  time tag board is used or if a math
  coprocessor resides on the target system


ELECTRICAL
CHARACTERISTICS


100-120V or 220-240V selectable
50-60 Hz
2 amps (AC max) @ 120V
1 amp (AC max) @ 240V


ENVIRONMENTAL
CHARACTERISTICS


Operating temperature: +10°C to +40°C
                       (50°F to 104°F)
Operating humidity:    Maximum of 85%
                       relative humidity,
                       non-condensing


The Processor Module and Bus Isolation Board Dimensions 


280850-5


The Emulator’s Physical Characteristics


Base Unit
Processor Module
Optional Isolation Board
Power Supply
User Cable
Target-Adapter Cable
Serial Cable
Optional Clips Pod


ELECTRICAL SPECIFICATIONS


The synchronization input lines must be valid
for at least four CLK2 cycles as they are only
sampled on every other cycle. These input lines
are standard TTL inputs. The synchronization
output lines are driven by TTL open collector
outputs that have 4.7K-ohm pull-up resistors.
The synchronization input and output signals
on the optional clips pod are standard TTL
input and outputs.


AC Specifications With the Bus Isolation Board Installed


Parameter                              | Minimum           | Maximum            | Notes

CLK2 period                            | 50 nS             |                    |
CLK2 high time                         | t2a Min + 2 nS    |                    | @ 2V
CLK2 low time                          | t3b Min + 2 nS    |                    | @ 0.8V
A2-A31 valid delay                     | t6 Min + 3.5 nS   | t6 Max + 24.6 nS   | CL = 120 pF
A2-A31 float delay                     | t14 Min + 5.5 nS  | t14 Max + 37.6 nS  |
BE0#-BE3#, LOCK# valid delay           | t8 Min + 3.5 nS   | t8 Max + 24.6 nS   | CL = 75 pF
BE0#-BE3#, LOCK# float delay           | t14 Min + 5.5 nS  | t14 Max + 32.6 nS  |
W/R#, M/IO#, D/C#, ADS# valid delay    | t10 Min + 3.5 nS  | t10 Min + 24.6 nS  | CL = 75 pF
W/R#, M/IO#, D/C#, ADS# float delay    | t14 Min + 5.5 nS  | t14 Max + 32.6 nS  |
D0-D31 write data valid delay          | t12 Min + 4.5 nS  | t12 Max + 20.6 nS  | CL = 120 pF
D0-D31 write data float delay          | 7.5 nS            | 41.6 nS            |
HLDA valid delay                       | t14 Min = 3 nS    | t14 Max + 21.2 nS  |
NA# hold time                          | t16 Min + 10.6 nS |                    |
BS16# hold time                        | t18 Min + 10.6 nS |                    |
READY# hold time                       | t20 Min + 10.6 nS |                    |
D0-D31 read setup time                 | t21 Min + 8.5 nS  |                    |
D0-D31 read hold time                  | t22 Min + 7.6 nS  |                    |
HOLD hold time                         | t24 Min + 10.6 nS |                    |
RESET setup time                       | t25 Min + 2.1 nS  |                    |
RESET hold time                        | t26 Min + 2.1 nS  |                    |
NMI, INTR hold time                    | t28 Min + 10.6 nS |                    |
PEREQ, ERROR#, BUSY# hold time         | t30 Min + 10.6 nS |                    |




Emulator Capacitance Specifications 


Symbol   Description                    Typical   TAC Installed


Input Capacitance
CLK2
READY#, NMI, BS16#
HOLD, BUSY#, PEREQ, NA#, INTR, ERROR#
RESET
Output or I/O Capacitance
D0-D31
A2-A31, BE0#-BE3#
D/C#
W/R#
ADS#, M/IO#, LOCK#,
HLDA


Note 1: Not tested. These specifications include the 80386 component and all additional emulator loading. 
Note 2: The target-adapter cable adds a propagation delay of 0.5 nS. 


Emulator DC Specifications Without the Bus Isolation Board Installed 


Item    | Description                         | Max              | Notes

PM-Icc  | Processor Module Supply Current     | 386SX-Icc + 15A  |

        | Input High Leakage Current          |                  |
        |   A2-A31, BE0#-BE3#, D0-D31         | 20 µA            |
        |   HLDA, NMI, BS16#                  | 10 µA            |
        |   ADS#, M/IO#, LOCK#, READY#        | 10 µA            |
        |   W/R#, D/C#                        | 30 µA            |
        |   CLK2                              | 15 µA            |
        |   RESET                             | 5 µA             |

        | Input Low Leakage Current           |                  |
        |   A2-A31, BE0#-BE3#, D0-D31         | 600 µA           |
        |   HLDA, NMI, BS16#                  | 10 µA            |
        |   ADS#, M/IO#, LOCK#, READY#        | 10 µA            |
        |   W/R#                              | 110 µA           |
        |   D/C#                              | 610 µA           |
        |   CLK2                              | 15 µA            |
        |   RESET                             | 5 µA             |


Note 1: This specification is the DC input loading of the emulator circuitry only and does not 
include any 80386 leakage current. 
Note 2: This specification replaces the 80386 specification for this signal. 


Emulator DC Specifications With the Bus Isolation Board Installed 


Item     | Description                              | Min. | Max.

BIB-Icc  | Bus Isolation Board Supply Current       |      |

         | Output Low Voltage (IOL = 48 mA)         |      |
         |   A2-A31, BE0#-BE3#, D/C#, ADS#          |      |
         |   D0-D31, M/IO#, LOCK#, W/R#             |      |
         |   HLDA (IOL = 24 mA)                     |      |

         | Output High Voltage (IOH = 3 mA)         |      |
         |   A2-A31, BE0#-BE3#, D/C#, ADS#          |      |
         |   D0-D31, M/IO#, LOCK#, W/R#             |      |
         |   HLDA (IOH = 24 mA)                     |      |

         | Input High Current                       |      |
         |   CLK2, RESET                            |      |
         |   READY#                                 |      |

         | Input Low Current                        |      |
         |   CLK2, RESET                            |      |
         |   READY#                                 |      |

         | Output Leakage Current                   |      |
         |   A2-A31, BE0#-BE3#, D/C#, ADS#          |      |
         |   D0-D31, M/IO#, LOCK#, W/R#             |      | ±20 µA
PROCESSOR MODULE
INTERFACE CONSIDERATIONS


With the processor module directly attached to
the target system without using the bus
isolation board, the target system must meet
the following requirements:

• The user bus controller must only drive the
  data bus during a valid read cycle of the
  emulator processor or while the emulator
  processor is in a hold state (the emulator
  processor uses the data bus to communicate
  with the emulator hardware).

• Before driving the address bus, the user
  system must gain control by asserting HOLD
  and receiving HLDA.

• The user reset signal is disabled during the
  interrogation mode. It is enabled in
  emulation, but is delayed by 2 or 4 CLK2
  cycles.

• The user system must be able to drive one
  additional TTL load on all signals that go to
  the emulation processor.


When the target system does not satisfy the
first two restrictions, the bus isolation board is
used to isolate the emulation processor from
the target system. With the isolation board
installed, the processor CLK2 is restricted to
running at 25 MHz.


The processor module derives its DC power
from the target system through the 80386
socket. It requires 1500mA, including the
80386 current. The isolation board requires an
additional 475mA.


The processor must be socketed. The printed
circuit board design should locate the processor
socket at the physical ends of the printed
circuit board traces that connect the processor
to the other logic of the target system. This
reduces transmission line noise. Additionally,
if the target system is enclosed in a box, pin
one of the processor socket should be oriented
to make connecting the processor module or
target-adapter cable (TAC) easier.


The emulator uses the 386 microprocessor’s
pins C7, E13, and F13. The 80386 High
Performance 32-Bit Microprocessor With
Integrated Memory Management data sheet
specifies these pins as “N/C” (no connect). If
the target system uses any of these pins, you
must do one of the following:

• Use the bus isolation board.

• Use the target-adapter cable (TAC).

• Build an adapter to disconnect pins C7, E13,
  and F13 (i.e., a socket with these pins
  removed).


ICE™-386 SX 20 MHz SPECIFICATIONS AND
REQUIREMENTS


HOST SYSTEM REQUIREMENTS


The user supplied host system can be either an IBM PC AT or Personal System/2 Model 60. Host
system requirements to run the emulator include the following:


• DOS version 4.0, or Hewlett-Packard HP9000
  UNIX

• 640 Kbytes of RAM in conventional memory

• An Above board with 1 megabyte of RAM
  configured in expanded memory mode,
  EMM.SYS software version 3.2

• A 20 MB hard disk

• A serial port or the National Instruments
  GPIB-PCII, GPIB-PCIIA, or MC-GPIB board

• A math coprocessor if either the optional
  time tag board is used or if a math
  coprocessor resides on the target system


ELECTRICAL
CHARACTERISTICS


100-120V or 220-240V selectable
50-60 Hz
2 amps (AC max) @ 120V
1 amp (AC max) @ 240V


ENVIRONMENTAL
CHARACTERISTICS


Operating temperature: +10°C to +40°C
                       (50°F to 104°F)
Operating humidity:    Maximum of 85%
                       relative humidity,
                       non-condensing


The Emulator’s Physical Characteristics 


Processor Module 
Optional Isolation Board 
Power Supply 
User Cable
Target-Adapter Cable
Serial Cable
Optional Clips Pod 


ELECTRICAL SPECIFICATIONS 


The synchronization input lines must be valid
for at least four CLK2 cycles as they are only
sampled on every other cycle. These input lines
are standard TTL inputs. The synchronization
output lines are driven by TTL open collector
outputs that have 4.7K-ohm pull-up resistors.
The synchronization input and output signals
on the optional clips pod are standard TTL
input and outputs.
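
The four-cycle rule translates directly into a minimum pulse width for any external tool driving a synchronization input. The Python sketch below works through that arithmetic; the function names are hypothetical, and the 50 nS period in the example is taken from the CLK2 period listed in the AC specifications for operation with the bus isolation board.

```python
MIN_VALID_CLK2_CYCLES = 4  # from the text above


def clk2_period_ns(clk2_mhz: float) -> float:
    """CLK2 period in nanoseconds for a given CLK2 frequency in MHz."""
    return 1000.0 / clk2_mhz


def min_sync_pulse_ns(period_ns: float) -> float:
    """Shortest time a synchronization input must be held valid."""
    return MIN_VALID_CLK2_CYCLES * period_ns


print(clk2_period_ns(20))     # a 20 MHz CLK2 has a 50 nS period
print(min_sync_pulse_ns(50))  # four such periods
```

A logic analyzer or second emulator driving the line should hold it at least this long, since a shorter pulse may fall between two sample points.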


AC Specifications With the Bus Isolation Board Installed


Parameter                              | Minimum           | Maximum            | Notes

CLK2 period                            | 50 nS             |                    |
CLK2 high time                         | t2a Min + 2 nS    |                    | @ 2V
CLK2 low time                          | t3b Min + 2 nS    |                    | @ 0.8V
A1-A23 valid delay                     | t6 Min + 3.5 nS   | t6 Max + 24.6 nS   | CL = 120 pF
A1-A23 float delay                     | t14 Min + 5.5 nS  | t14 Max + 37.6 nS  |
BLE#, BHE#, LOCK# valid delay          | t8 Min + 3.5 nS   | t8 Max + 24.6 nS   | CL = 75 pF
BLE#, BHE#, LOCK# float delay          | t14 Min + 5.5 nS  | t14 Max + 37.6 nS  |
W/R#, M/IO#, D/C#, ADS# valid delay    | t10 Min + 3.5 nS  | t10 Min + 24.6 nS  | CL = 75 pF
W/R#, M/IO#, D/C#, ADS# float delay    | t14 Min + 5.5 nS  | t14 Max + 37.6 nS  |
D0-D15 write data valid delay          | t12 Min + 4.5 nS  | t12 Max + 20.6 nS  | CL = 120 pF
D0-D15 write data float delay          | 7.5 nS            | 45.6 nS            |
HLDA valid delay                       | t14 Min = 3 nS    | t14 Max + 21.2 nS  |
NA# hold time                          | t16 Min + 10.6 nS |                    |
READY# hold time                       | t20 Min + 10.6 nS |                    |
D0-D15 read setup time                 | t21 Min + 8.5 nS  |                    |
D0-D15 read hold time                  | t22 Min + 7.6 nS  |                    |
HOLD hold time                         | t24 Min + 10.6 nS |                    |
RESET setup time                       | t25 Min + 2.1 nS  |                    |
RESET hold time                        | t26 Min + 2.1 nS  |                    |
NMI, INTR hold time                    | t28 Min + 10.6 nS |                    |
PEREQ, ERROR#, BUSY# hold time         | t30 Min + 10.6 nS |                    |


Emulator Capacitance Specifications 
With the Target-Adapter Cable Installed 


Description                          Typical (Note 1)


Input Capacitance
CLK2
READY#, ERROR#
HOLD, BUSY#, PEREQ,
NA#, INTR, NMI
RESET
Output or I/O Capacitance
D15-D0
A15-A1, BLE#
A23-A16, BHE#, D/C#
HLDA, W/R#
ADS#, M/IO#, LOCK#


Note 1: Not tested. These specifications include the 80386SX compo- 
nent and all additional emulator loading. 


Emulator DC Specifications 
Without the Bus Isolation Board Installed 


Item    | Description                              | Max                 | Notes

PM-Icc  | Processor Module Supply Current          | 386SX-Icc + 940 mA  |

        | Input High Leakage Current               |                     |
        |   A23-A1, BLE#, BHE#, D/C#, HLDA         | 0.02 mA             |
        |   D15-D0                                 | 0.06 mA             |
        |   ADS#, M/IO#, LOCK#, READY#, ERROR#     | 0.01 mA             |
        |   W/R#                                   | 0.03 mA             |
        |   CLK2                                   | 0.04 mA             |
        |   RESET                                  | 0.06 mA             |

        | Input Low Leakage Current                |                     |
        |   A23-A1, BLE#, BHE#, D/C#               | 0.6 mA              |
        |   D15-D0                                 | 0.06 mA             |
        |   ADS#, M/IO#, LOCK#, READY#, ERROR#     | 0.01 mA             |
        |   W/R#                                   | 0.51 mA             |
        |   CLK2                                   | 0.62 mA             |
        |   RESET                                  | 0.6 mA              |
        |   HLDA                                   | 0.02 mA             |


Note 1: This specification is the DC input loading of the emulator circuitry only and does not include 
any 80386SX leakage current. 
Note 2: This specification replaces the 80386SX specification for this signal. 


Emulator DC Specifications With the Bus Isolation Board Installed


Item     | Description                              | Min.  | Max.

BIB-Icc  | Bus Isolation Board Supply Current       |       |

         | Output Low Voltage (IOL = 48 mA)         |       |
         |   A23-A1, BLE#, BHE#, D/C#, ADS#         |       |
         |   D15-D0, M/IO#, LOCK#, W/R#             |       |
         |   HLDA (IOL = 24 mA)                     |       |

         | Output High Voltage (IOH = 3 mA)         |       |
         |   A23-A1, BLE#, BHE#, D/C#, ADS#         | 2.4 V |
         |   D15-D0, M/IO#, LOCK#, W/R#             |       |
         |   HLDA (IOH = 24 mA)                     |       |

         | Input High Current                       |       |
         |   CLK2, RESET                            |       | 1.0 µA
         |   READY#                                 |       | 2.5 µA

         | Input Low Current                        |       |
         |   CLK2, RESET                            |       | 1.0 µA
         |   READY#                                 |       | 250 µA

         | Output Leakage Current                   |       |
         |   A23-A1, BLE#, BHE#, D/C#, ADS#         |       | ±20 µA
         |   D15-D0, M/IO#, LOCK#, W/R#             |       | ±20 µA


PROCESSOR MODULE
INTERFACE CONSIDERATIONS


With the processor module directly attached to
the target system without using the bus
isolation board, the target system must meet
the following requirements:

• The user bus controller must only drive the
  data bus during a valid read cycle of the
  emulator processor or while the emulator
  processor is in a hold state (the emulator
  processor uses the data bus to communicate
  with the emulator hardware).

• Before driving the address bus, the user
  system must gain control by asserting HOLD
  and receiving HLDA.

• The user reset signal is disabled during the
  interrogation mode. It is enabled in
  emulation, but is delayed by 2 or 4 CLK2
  cycles.

• The user system must be able to drive one
  additional TTL load on all signals that go to
  the emulation processor.


When the target system does not satisfy the
first two restrictions, the bus isolation board is
used to isolate the emulation processor from
the target system. With the isolation board
installed, the processor CLK2 is restricted to
running at 20 MHz.


The processor module derives its DC power
from the target system through the 80386SX
socket. It requires 1400mA, including the
80386SX current. The isolation board requires
an additional 350mA.


The processor must be socketed, for example
using Textool 2-0100-07243-000 or AMP
821949-4 sockets.


The printed circuit board design should locate
the processor socket at the physical ends of the
printed circuit board traces that connect the
processor to the other logic of the target
system. This reduces transmission line noise.
Additionally, if the target system is enclosed in
a box, pin one of the processor socket should be
oriented away from the target system’s box
opening to make connecting the target-adapter
cable easier.


ICE™-376 SPECIFICATIONS AND REQUIREMENTS


HOST SYSTEM REQUIREMENTS


The user supplied host system can be either an IBM PC AT or Personal System/2 Model 60. Host
system requirements to run the emulator include the following:


• DOS version 4.0, or Hewlett-Packard HP9000
  UNIX

• 640 Kbytes of RAM in conventional memory

• An Above board with 1 megabyte of RAM
  configured in expanded memory mode,
  EMM.SYS software version 3.2

• A 20 MB hard disk

• A serial port or the National Instruments
  GPIB-PCII, GPIB-PCIIA, or MC-GPIB board

• A math coprocessor if either the optional
  time tag board is used or if a math
  coprocessor resides on the target system


ELECTRICAL
CHARACTERISTICS


100-120V or 220-240V selectable
50-60 Hz
2 amps (AC max) @ 120V
1 amp (AC max) @ 240V


ENVIRONMENTAL
CHARACTERISTICS


Operating temperature: +10°C to +40°C
                       (50°F to 104°F)
Operating humidity:    Maximum of 85%
                       relative humidity,
                       non-condensing


The Emulator’s Physical Characteristics 


Item                             Width   Height   Length


Base Unit
Processor Module
Optional Isolation Board
Power Supply
User Cable
100-Pin Target-Adapter Cable
88-Pin Target-Adapter Cable
Serial Cable
Optional Clips Pod

The Processor Module and Bus Isolation Board Dimensions (100 Pin PQFP) 
ELECTRICAL SPECIFICATIONS


The synchronization input lines must be valid
for at least four CLK2 cycles as they are only
sampled on every other cycle. These input lines
are standard TTL inputs. The synchronization
output lines are driven by TTL open collector
outputs that have 4.7K-ohm pull-up resistors.
The synchronization input and output signals
on the optional clips pod are standard TTL
input and outputs.


AC Specifications With the Bus Isolation Board Installed 


Parameter                              | Minimum           | Maximum

CLK2 period                            | 50 nS             | t1 Max
CLK2 high time                         | t2a Min + 2 nS    |
CLK2 low time                          | t3b Min + 2 nS    |
A1-A23 valid delay                     | t6 Min + 3.5 nS   | t6 Max + 24.6 nS
A1-A23 float delay                     | t14 Min + 5.5 nS  | t14 Max + 37.6 nS
BLE#, BHE#, LOCK# valid delay          | t8 Min + 3.5 nS   | t8 Max + 24.6 nS
BLE#, BHE#, LOCK# float delay          | t14 Min + 5.5 nS  | t14 Max + 37.6 nS
W/R#, M/IO#, D/C#, ADS# valid delay    | t10 Min + 3.5 nS  | t10 Min + 24.6 nS
W/R#, M/IO#, D/C#, ADS# float delay    | t14 Min + 5.5 nS  | t14 Max + 37.6 nS
D0-D15 write data valid delay          | t12 Min + 4.5 nS  | t12 Max + 20.6 nS
D0-D15 write data float delay          | 7.5 nS            | 45.6 nS
HLDA valid delay                       | t14 Min = 3 nS    | t14 Max + 21.2 nS
NA# hold time                          | t16 Min + 10.6 nS |
READY# hold time                       | t20 Min + 10.6 nS |
D0-D15 read setup time                 | t21 Min + 8.5 nS  |
D0-D15 read hold time                  | t22 Min + 7.6 nS  |
HOLD hold time                         | t24 Min + 10.6 nS |
RESET setup time                       | t25 Min + 2.1 nS  |
RESET hold time                        | t26 Min + 2.1 nS  |
NMI, INTR hold time                    | t28 Min + 10.6 nS |
PEREQ, ERROR#, BUSY# hold time         | t30 Min + 10.6 nS |


Emulator Capacitance Specifications 
With Target-Adapter Cable Installed 


Description                          Typical (Note 1)


Input Capacitance
CLK2
READY#, ERROR#
HOLD, BUSY#, PEREQ, NA#,
INTR, NMI
RESET
Output or I/O Capacitance
D15-D0
A15-A1, BLE#
A23-A16, BHE#, D/C#
HLDA, W/R#
ADS#, M/IO#, LOCK#


Note 1: Not tested. These specifications include the 80376 component and all 
additional emulator loading. 


Emulator DC Specifications 
Without the Bus Isolation Board Installed 


Item    | Description                              | Max               | Notes

PM-Icc  | Processor Module Supply Current          | 376-Icc + 940 mA  |
IIH     | Input High Leakage Current               |                   |
        |   A23-A1, BLE#, BHE#, D/C#, HLDA         | 0.02 mA           |
        |   D15-D0                                 | 0.06 mA           | 1
        |   ADS#, M/IO#, LOCK#, READY#, ERROR#     | 0.01 mA           | 1
        |   W/R#                                   | 0.03 mA           | 1
        |   CLK2                                   | 0.04 mA           | 1
        |   RESET                                  | 0.06 mA           | 2
IIL     | Input Low Leakage Current                |                   |
        |   A23-A1, BLE#, BHE#, D/C#               | 0.6 mA            | 1
        |   D15-D0                                 | 0.06 mA           | 1
        |   ADS#, M/IO#, LOCK#, READY#, ERROR#     | 0.01 mA           | 1
        |   W/R#                                   | 0.51 mA           | 1
        |   CLK2                                   | 0.62 mA           | 1
        |   RESET                                  | 0.6 mA            | 2
        |   HLDA                                   | 0.02 mA           | 1


Note 1: This specification is the DC input loading of the emulator circuitry only and does not 
include any 80376 leakage current. 
Note 2: This specification replaces the 80376 specification for this signal. 
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Emulator DC Specifications With the Bus Isolation Board Installed 


Output Low Voltage (Ip, = 48 mA) 


D15-D0, M/IO#, LOCK #, W/R# 
HLDA (Ip, = 24 mA) 
Output High Voltage Toy =3 mA) 


D15-D0, M/IO#, LOCK #, W/R# 
HLDA (Ip9H= 24 mA) 
Input High Current 
CLK2, RESET 
READY # 
Input Low Current 
CLK2, RESET 
| READY# 
| Output Leakage Current 


D15-D0, M/10#, LOCK #, W/R# 


PROCESSOR MODULE 
INTERFACE CONSIDERATIONS 


With the processor module directly attached to 
the target system without using the bus 
isolation board, the target system must meet 
the following requirements: 

e The user bus controller must only drive the 
data bus during a valid read cycle of the 
emulator processor or while the emulator 
processor is in a hold state (the emulator 
processor uses the data bus to communicate | 
with the emulator hardware). 

Before driving the address bus, the user 
system must gain control by asserting HOLD 
and receiving HLDA. 

The user reset signal is disabled during the 
interrogation mode. It is enabled in 
emulation, but is delayed by 2 or 4 CLK2 
cycles. 

The user system must be able to drive one 
additional TTL load on all signals that go to 
the emulation processor. 


When the target system does not satisfy the 
first two restrictions, the bus isolation board is 
used to isolate the emulation processor from 
the target system. With the isolation board 
installed, the processor CLK2 is restricted to 
running at 20 MHz. 


Item      Description                               Min     Max

BIB-Icc   Bus Isolation Board Supply Current

          A23-A1, BLE#, BHE#, D/C#, ADS#

          A23-A1, BLE#, BHE#, D/C#, ADS#            2.4 V

          A23-A1, BLE#, BHE#, D/C#, ADS#

The processor module derives its DC power 
from the target system through the 80376 
socket. It requires 1400mA, including the 
80376 current. The isolation board requires an 
additional 350mA. 
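
As a rough illustration of the arithmetic above, the following sketch totals the current that the target system must be able to deliver through the 80376 socket. Only the 1400 mA and 350 mA figures come from the text; the function name is hypothetical and chosen for this example.

```python
# Figures quoted in the text above.
PROCESSOR_MODULE_MA = 1400  # processor module, including the 80376 current
ISOLATION_BOARD_MA = 350    # additional draw when the isolation board is fitted


def socket_current_ma(isolation_board: bool) -> int:
    """Return the total current (mA) drawn through the 80376 socket."""
    total = PROCESSOR_MODULE_MA
    if isolation_board:
        total += ISOLATION_BOARD_MA
    return total


if __name__ == "__main__":
    for fitted in (False, True):
        print(f"isolation board fitted={fitted}: {socket_current_ma(fitted)} mA")
```

A target power supply would need at least this much headroom at the socket before the emulator is attached.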


The processor must be socketed, for example 
using Textool 2-0100-07243-000 or AMP 
821949-4 sockets. 


The printed circuit board design should locate 
the processor socket at the physical ends of the 
printed circuit board traces that connect the 
processor to the other logic of the target 
system. This reduces transmission line noise.
Additionally, if the target system is enclosed in
a box, pin one of the processor socket should be
oriented away from the target system’s box
opening to make connecting the target-adapter
cable easier.


ORDERING INFORMATION


IN-CIRCUIT EMULATORS
ORDER CODES

pICE376D        ICE376 In-circuit Emulator for
                80376 component. Operates to
                16 MHz. Includes control unit,
                power supply, 376 Processor
                Module with PQFP adaptor,
                Stand-Alone Self-Test board,
                bus Isolation Board, and DOS
                3.x host software and interface
                cable.

ICE37616H       HP9000 hosted In-circuit
                Emulator for 80376
                component. Operates to
                16 MHz.

ICE386SX20D     ICE386SX 20 MHz In-circuit
                Emulator for 80386 SX
                component. Includes control
                unit, power supply, 386 SX
                Processor Module with PQFP
                adaptor, Stand-Alone Self-Test
                board, bus Isolation Board, and
                DOS 3.x host software and
                interface cable.

ICE386SX20H     HP9000 hosted In-circuit
                Emulator for 80386 SX
                component. Operates to
                20 MHz.

pICE38625D      ICE386 25 MHz In-circuit
                Emulator for 80386 DX
                component. Includes control
                unit, power supply, 386 DX
                Processor Module with 132 pin
                PGA adaptor, Stand-Alone
                Self-Test board, bus Isolation
                Board, and DOS 3.x host
                software and interface cable.

ICE386DX33D     ICE386DX 33 MHz In-circuit
                Emulator for 80386 DX
                component. Includes control
                unit, power supply, 386 DX
                Processor Module with 132 pin
                PGA adaptor, Stand-Alone
                Self-Test board, bus Isolation
                Board, and DOS 3.x host
                software and interface cable.

ICE386DX25H     HP9000 hosted In-circuit
                Emulator for 80386 DX
                component. Operates to
                25 MHz.

ICE386DX33H     HP9000 hosted In-circuit
                Emulator for the 386 DX
                component. This custom unit
                operates to 33 MHz.


IN-CIRCUIT EMULATOR
CONVERSION KITS ORDER
CODES

pTOICE386SX20D    Conversion kit to adapt
                  emulator base to support
                  the 80386 SX component.
                  Operates to 20 MHz.
                  Includes ICE386SX
                  20 MHz Processor Module
                  and DOS 3.x host
                  software.

pICE376TO386D     Conversion kit to adapt
                  ICE376 emulator to
                  support the 80386 DX
                  component at 25 MHz.
                  Includes ICE386 25 MHz
                  Processor Module and
                  DOS 3.x host software.

pICE386SXTO376D   Conversion kit to adapt
                  ICE386SX 16 or 20 MHz
                  emulator to support the
                  80376 component.
                  Operates to 16 MHz.
                  Includes ICE376 emulator
                  Processor Module and
                  DOS 3.x host software.

pICE386SXTO386D   Conversion kit to adapt
                  ICE386SX 16 or 20 MHz
                  emulator to support the
                  80386 DX component at
                  25 MHz. Includes ICE386
                  25 MHz Processor Module
                  and DOS 3.x host
                  software.

pICE386TO376D     Conversion kit to adapt
                  ICE386 25 MHz emulator
                  to support the 80376
                  component. Operates to
                  16 MHz. Includes ICE376
                  emulator Processor
                  Module and DOS 3.x host
                  software.

TOICE386DX33D     Conversion kit to adapt
                  emulator base to support
                  the 80386 DX 33 MHz
                  component. Operates to
                  33 MHz. Includes
                  ICE386DX 33 MHz
                  emulator Processor
                  Module and DOS 3.x host
                  software.


IN-CIRCUIT EMULATOR
OPTION ORDER CODES

p88PGAADAPT     Adaptor for ICE376
                emulator to support
                88 pin PGA component
                packaging.

pICE3XXCPO      Clips Pod Option for
                ICE376, ICE386SX 16 or
                20 MHz, ICE386 25 MHz,
                and ICE386DX 33 MHz
                emulators.

pICE3XXTTB      Time Tag Board Option
                for ICE376, ICE386SX
                16 or 20 MHz, ICE386
                25 MHz, and ICE386DX
                33 MHz emulators.

DTOAB           2 MB Above Board.


280894-1 


ACCURATE AND SOPHISTICATED EMULATION FOR THE
INTEL i486™ FAMILY OF MICROPROCESSORS


The Intel ICE™-486 In-Circuit Emulator is the world’s leading tool for debugging
software and hardware designs based on the Intel i486™ family of microprocessors.
From the inventor of the microprocessor comes a development tool that allows you
complete access and control over the sophisticated capabilities of the i486 microprocessor.


The ICE-486 features real-time emulation of the i486 microprocessor at speeds up to 33
MHz. Its standard high-level, symbolic debug capability saves valuable development
time. The flexible breakpoint capability and 8K deep trace buffer provide the power to
identify and solve the toughest hardware and software bugs.


Intel product quality and world-class technical support and service give you the
time-to-market advantage in designing your i486 microprocessor based product. And your
investment in development tools is protected via interchangeable probes for the 386™
family and i486 microprocessors.


November 1990


FEATURES


OVERVIEW


• Intel technology to access and modify all
  internal processor registers including the
  i486 processor on-chip cache control registers
  and floating point registers

• Non-intrusive, 100% accurate emulation and
  execution history to processor speeds of 33
  MHz

• Symbolic support saves time in referencing
  program objects while debugging

• Complete support for all processor
  addressing modes including real, protected,
  and virtual 8086 modes with support for i486
  processor paging modes

• 2MB expansion memory to debug large
  programs

• Maximum flexibility in break-point
  specification cuts time to identify complex
  bugs

• Deep trace buffer with the ability to collect
  and display 8K frames of bus and/or
  execution trace information

• Comprehensive software development tools
  optimized for creating 32-bit applications
  which utilize all the features of the i486
  microprocessor


BENEFITS OF 100% ACCURATE
EXECUTION HISTORY


The i486 microprocessor can simultaneously
fetch and execute instructions. However, due
to code branching, fetched instructions are not
necessarily executed. Additionally, the i486
can execute instructions from the on-chip
cache with no associated external bus activity.
The ICE-486 emulator uses Intel technology to
access the internal processor conditions that
are not available to emulators which simply
monitor the external buses for detection of
internal events. Emulators which do not have
access to the internal processor conditions
cannot guarantee accurate display of what
actually was executed by the microprocessor.
With an Intel ICE-486, you can be certain that
the emulator is displaying execution history
with 100% accuracy, even when executing
code from the on-chip cache memory.


OPENING THE DOOR TO
PROTECTED MODE


Intel i486 emulators support protected mode
operation of the i486 microprocessor. The
emulator can display and modify task state
segments and global, local, and interrupt
descriptor tables (with symbolic access to all
descriptor components such as privilege level
and segment type). Emulator functions are
sensitive to the operating mode of the
processor, saving user setup time in debugging
complex protected mode applications.


Intel i486 emulators support all aspects of
protected mode addressing, including paged
virtual memory. You can automatically
translate virtual addresses to linear and
physical addresses. Physical addresses can be
translated to symbolic references to indicate
the module, procedure, or data segment
accessed. When debugging a memory
management system, components of the page
tables and directory can be displayed and
modified.
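
The address translation described above can be sketched as follows. This is only an illustration of the standard two-level, 4 KB paging scheme of the 386 and i486 processors, not the emulator's own implementation or command syntax; the function names, the `read_dword` callback, and the sample memory contents are all hypothetical.

```python
PAGE_PRESENT = 0x1          # bit 0 of a directory or table entry
FRAME_MASK = 0xFFFFF000     # upper 20 bits hold the 4 KB-aligned frame address


def virtual_to_linear(segment_base: int, offset: int) -> int:
    """Segment translation: linear address = segment base + offset (32-bit wrap)."""
    return (segment_base + offset) & 0xFFFFFFFF


def linear_to_physical(linear: int, page_dir_base: int, read_dword) -> int:
    """Walk the page directory and page table to find the physical address.

    `read_dword(addr)` is a caller-supplied function returning the 32-bit
    value stored at physical address `addr`.
    """
    dir_index = (linear >> 22) & 0x3FF    # bits 31..22
    table_index = (linear >> 12) & 0x3FF  # bits 21..12
    page_offset = linear & 0xFFF          # bits 11..0

    pde = read_dword(page_dir_base + 4 * dir_index)
    if not pde & PAGE_PRESENT:
        raise LookupError("page table not present")

    pte = read_dword((pde & FRAME_MASK) + 4 * table_index)
    if not pte & PAGE_PRESENT:
        raise LookupError("page not present")

    return (pte & FRAME_MASK) | page_offset


# Hypothetical memory image: directory at 0x1000, one page table at 0x2000.
memory = {
    0x1000 + 4 * 1: 0x2000 | PAGE_PRESENT,  # directory entry 1 -> table at 0x2000
    0x2000 + 4 * 3: 0x5000 | PAGE_PRESENT,  # table entry 3 -> frame at 0x5000
}

if __name__ == "__main__":
    linear = virtual_to_linear(0x00400000, 0x3025)
    physical = linear_to_physical(linear, 0x1000, memory.__getitem__)
    print(hex(linear), hex(physical))
```

Walking the directory and table by hand in this way is the lookup that the emulator's automatic translation spares the user.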


FLEXIBLE EVENT
RECOGNITION


The emulator can be configured to break on a
wide variety of events. Flexible event
recognition saves time isolating and fixing the
most complex bugs. The emulator can trigger
breakpoints on a variety of bus events such as
a specific or masked data input, output, read,
write, or a fetch at a physical address or range
of addresses. On-chip debug registers can be set
to break on virtual, linear, or symbolic address
execution, access, or writes using cache
memory or RAM.

There are several other triggering options: You
can break on a task switch, an external signal
from another emulator or logic analyzer,
multiple occurrences of an event, full trace
buffer, halt or shutdown cycles, or interrupt
acknowledge cycle. And up to four sequential
event triggers can be combined with a
high-level construct.
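
To make the idea of a masked data breakpoint over an address range concrete, here is a minimal sketch. It models the matching rule only; the class names, field names, and cycle labels are hypothetical and do not reflect the emulator's command language or internal design.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusEvent:
    cycle: str     # e.g. "read", "write", "fetch", "input", "output"
    address: int   # physical address of the bus cycle
    data: int      # data value transferred


@dataclass(frozen=True)
class BreakSpec:
    cycle: str
    addr_low: int
    addr_high: int
    data_value: int = 0
    data_mask: int = 0  # a mask of 0 means the data value is "don't care"

    def matches(self, event: BusEvent) -> bool:
        """True when cycle type, address range, and masked data all match."""
        return (
            event.cycle == self.cycle
            and self.addr_low <= event.address <= self.addr_high
            and (event.data & self.data_mask) == (self.data_value & self.data_mask)
        )


if __name__ == "__main__":
    # Break on any write in 0x1000-0x1FFF whose upper data nibble is 8.
    spec = BreakSpec("write", 0x1000, 0x1FFF, data_value=0x80, data_mask=0xF0)
    print(spec.matches(BusEvent("write", 0x1234, 0x8C)))
    print(spec.matches(BusEvent("write", 0x1234, 0x7C)))
```

Only the bits selected by the mask take part in the comparison, which is what allows one specification to cover a family of data values.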


8K TRACE BUFFER FOR
COLLECTING EXECUTION AND
BUS ACTIVITY


Intel i486 emulators can continuously capture 
all or selective bus activity, and/or optionally 
capture execution information. The trace 
buffer can store up to 8,192 frames with PRE,
POST, and CENTERED collection modes.
With the emulator halted, the trace buffer
contents can be displayed in bus cycle or 
execution instruction formats. Dynamic trace 
capability allows the contents of the trace 
buffer to be displayed as bus cycles during full- 
speed emulation, a benefit when the processor 
cannot be halted while debugging time-critical 
systems. Symbolic information can be included 
in the trace display. When debugging high- 
level language programs, the callstack can also 
be displayed to show the current chain of 
procedure calls. 
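
The text names the PRE, POST, and CENTERED collection modes without defining them. The sketch below assumes the conventional reading: PRE retains the frames leading up to the trigger, POST retains the frames that follow it, and CENTERED retains roughly half on each side. The function name and this interpretation are assumptions, not taken from Intel documentation.

```python
def retained_frames(frames, trigger_index, depth=8192, mode="CENTERED"):
    """Return the frames a depth-limited trace buffer would still hold.

    `frames` is the full sequence of captured frames and `trigger_index`
    is the position of the frame that satisfied the trigger.
    """
    # Number of frames kept from the trigger frame onward (assumed semantics).
    frames_from_trigger = {
        "PRE": 1,                 # window ends with the trigger frame
        "CENTERED": depth // 2,   # trigger sits near the middle
        "POST": depth,            # window starts at the trigger frame
    }[mode]
    end = min(len(frames), trigger_index + frames_from_trigger)
    start = max(0, end - depth)
    return frames[start:end]


if __name__ == "__main__":
    history = list(range(100))  # stand-in for captured bus frames
    for mode in ("PRE", "CENTERED", "POST"):
        print(mode, retained_frames(history, 50, depth=8, mode=mode))
```

A small `depth` is used in the demonstration so the three windows are easy to compare; the emulator's buffer holds 8,192 frames.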


SYMBOLIC DEBUGGING SAVES 
DEVELOPMENT TIME 


With the symbolic debugging capability of 
Intel languages and the Intel ICE-486, all data
structures such as register, descriptor table, 
and page table contents can be examined and 
modified using symbolic names. Memory 
locations can be examined and modified using 
symbolic references to the source program 
(procedure and variable names, line numbers, 
or program labels). This eliminates the time- 
consuming use of virtual, linear, or physical 
address referencing used in other emulator 
systems. Variable type information (such as 
byte, word, record, or array) is also provided to 
make the debugging process faster and easier. 


ACCESSING THE POWER 


The Intel ICE-486 in-circuit emulator features 
a sophisticated command structure that allows 
you to easily access all the capabilities of the 
emulator. 


On-line help, a syntax guide, command line 
editing, command history, and detailed error 
messages promote ease of learning and use.
I/O redirection and the ability to escape to the
host operating system increase versatility for 
the user with complex debugging needs. 
Creation of customized debug procedures with 
variables and literal definitions simplifies and 
automates debugging tasks used in design, test 
and evaluation, manufacturing test, or field 
service. 


SYSTEM CONNECTIVITY AND 
CONFIGURATION 


The Intel i486 emulator can be combined with 
a variety of lab instruments to extend the 
capability of the tool. I/O sync lines allow
emulator event control and synchronization
with external tools such as logic analyzers, 
scopes, or another emulator. 


INCLUDED OPTIONAL USE 
EQUIPMENT 


The Relocatable Expansion Memory (REM) 
board included with the ICE-486 system allows 
you to map 2 MB of memory for developing 
large applications before prototype target 
system memory is completely functional. Also, 
an optional isolation board is provided with the 
emulator. It buffers signals to the emulator, 
protecting the emulator from potential damage 
caused by an untested prototype target system. 
The REM board can be used in conjunction 
with the isolation board to overlay EPROMs. 
This technique avoids the slow process of 
programming new EPROMs each time a new 
version of EPROM software is compiled. 


A stand-alone/self-test board is also provided 
with the emulator. The stand-alone/self-test 
board, in conjunction with the REM board, 
allows execution and debugging of code to 
begin before target system availability. It also 
allows execution of the emulator confidence 
tests so you always know with certainty that 
the emulator is functioning properly. 


EMULATOR OPTIONS


An optional time tag board synchronizes 
multiple Intel i486 emulators and adds 20- 
nanosecond resolution time stamp information 
to the trace buffer. 


An optional clips pod allows eight data lines to 
be captured in the trace buffer and displayed
in the CYCLES format. 


SOFTWARE COMPLETES THE 
SYSTEM 


Intel provides comprehensive software 
development tools which work together with 
the ICE-486 emulator for the most complete 
development environment available from a
single vendor. C, PL/M, and Fortran compilers
are available in addition to a Macroassembler.
A builder and binder, available for configuring
and linking software modules, greatly simplify
configuration of code modules for protected
mode systems. The DB-386/486 source-level
software debugger with its powerful windowed 
interface completes the picture. 


To further reduce the effort necessary to 
integrate software into the final target 
configuration, Intel tools produce ROM code 
directly, saving you the time and headaches 
frequently encountered using converter 
utilities from other vendors.


WORLD-CLASS, WORLDWIDE
SERVICES


Intel i486 development tools are supplemented
by a full array of support services. Seminars,
classes and workshops, on-site consulting, field
application engineering expertise, telephone
hotline support, and software and hardware
maintenance are just a few of the services
which are available from Intel to ensure your
design requirements are met on time and with
minimal problems. Only Intel gives you one
source to call for complete development tool
support for your i486 design.

ICE™-486 33 MHz SPECIFICATIONS AND
REQUIREMENTS


HOST SYSTEM REQUIREMENTS


The user supplied host system can be either an IBM® PC/AT®, Personal System/2® Model 60, or
Model 80. Host system requirements to run the emulator include the following:

• DOS version 3.3
• 640K bytes of RAM in conventional memory
• An Above™ board with 1 megabyte of RAM
  configured in expanded memory mode,
  EMM.SYS software version 3.2, or
• One megabyte of RAM configured as
  expanded memory using 386MAX
• A 20MB hard disk
• A serial port or the National Instruments
  GPIB-PCII™, GPIB-PCIIA™, or MC-GPIB™ board
• A math coprocessor is required in the host
  system if either the optional time tag board
  is used or if the on-chip floating point unit is
  utilized by the target system


ELECTRICAL CHARACTERISTICS

100-120V or 220-240V selectable
50-60 Hz
2 amps (AC max) @ 120V
1 amp (AC max) @ 240V


ENVIRONMENTAL CHARACTERISTICS

Operating temperature: +10°C to +40°C (50°F to 104°F)
Operating humidity: Maximum of 85% relative humidity


The Emulator’s Physical Characteristics


Unit (dimensions in inches)

Base Unit
Processor Module
Optional Isolation Board
Power Supply
User Cable
Hinge Cable
Relocatable Expansion Memory Board
Serial Cable
Optional Clips Pod


THE PROCESSOR MODULE AND OPTIONAL BOARD DIMENSIONS 


280894-5


ELECTRICAL SPECIFICATIONS 


The synchronization input lines must be valid for at least two CLK cycles. These input lines are 
standard TTL inputs. The synchronization output lines are driven by TTL open collector outputs 
that have 4.7K-ohm pull-up resistors. The synchronization input and output signals on the 
optional clips pod are standard TTL input and outputs. The emulator delays the RESET signal to 
the i486 by a maximum of 8ns and the A20M# and FLUSH# signals by a maximum of 5ns.


Emulator AC Specifications with the Isolation Board Installed


Parameter 


A2-A31 ene delay 


A2-A31 valid delay

BE0-3#, M/IO#, W/R#, ADS# valid delay

BLAST# valid delay

D0-D31 write data valid delay

RDY# setup time                        t16 Min + 5ns

A4-A31, D0-D31 input setup time        t22 Min + 5ns
Note 1: Use these specifications for any bus cycle that begins on the same clock that HLDA is de-asserted. 


Emulator Capacitance Specifications


Hinge Cable Installed


Input Capacitance:
  CLK
  A20M#, AHOLD, FLUSH#, HOLD, IGNNE#, INTR
  BS8#, BS16#, EADS#, KEN#
  BRDY#, NMI, RDY#
  RESET, BOFF#
Output or I/O Capacitance:
  D0 - D31, BLAST#, D/C#, LOCK#, PLOCK#
  A2 - A31, ADS#
  HLDA, M/IO#
  PWT
  BE0# - BE3#, PCD
  BREQ#, PCHK#
  FERR#
  DP0 - DP3
  W/R#


Note 1: Not tested. These specifications include the i486 component and all additional emulator loading.
Note 2: The hinge cable adds a propagation delay of 0.5 ns.


Emulator DC Specifications without the Isolation Board Installed


Description                                    Value             Notes

Processor Module Supply Current                486 Icc + 1.5 A
Input High Leakage Current
  D0-31                                        15 µA             1
  A2-31, BE#0-3, PWT                           5 µA              1
  W/R#, D/C#, M/IO#, LOCK#, PLOCK#             5 µA              1
  BLAST#, HLDA                                 5 µA              1
  BS16#, BS8#, EADS#, KEN#, NMI                5 µA              1
  BOFF#, RDY#, BRDY#                           25 µA             1
  PCD                                          30 µA             1
  CLK                                          15 µA             1
  RESET                                        30 µA             2
  A20M#, FLUSH#                                5 µA              2
Input Low Leakage Current
  D0-31                                        15 µA             1
  A2-31, BE#0-3, PWT                           5 µA              1
  W/R#, D/C#, M/IO#, LOCK#, PLOCK#             5 µA              1
  BLAST#, HLDA                                 5 µA              1
  BS16#, BS8#, EADS#, KEN#, NMI                5 µA              1
  BOFF#, RDY#, BRDY#                           250 µA            1
  PCD                                          255 µA            1
  CLK                                          15 µA             1
  RESET                                        255 µA            2
  A20M#, FLUSH#                                5 µA              2


Note 1: This specification is the DC loading of the emulator circuitry only and does not include any i486 leakage current. 
Note 2: This specification replaces the i486 specification for this signal. 


Emulator DC Specifications with the Isolation Board Installed


Output High Voltage
  A2-A31, D0-D31 (IOH = 15 mA)
  BE0-3#, M/IO# (IOH = 3 mA)
  W/R#, ADS#, BLAST# (IOH = 3 mA)
  HLDA (IOH = 3.2 mA)
Output Low Voltage
  A2-A31, D0-D31 (IOL = 64 mA)
  BE0-3#, M/IO# (IOL = 64 mA)
  W/R#, ADS#, BLAST# (IOL = 64 mA)
  HLDA (IOL = 24 mA)
Input Leakage Current
  A2-A31, D0-D31
Input High Current
  CLK, RESET, BRDY#, BOFF#, AHOLD
  RDY#
Input Low Current
  CLK, RESET, BRDY#, BOFF#, AHOLD
  RDY#


Note 1: This specification is for the Isolation Board only and does not include any processor module loading. 
Note 2: These specifications replace the i486 specifications for this signal.


PROCESSOR MODULE 
INTERFACE CONSIDERATIONS 


With the processor module directly attached to 
the target system without using the optional 
isolation board, the target system must meet 
the following requirements:

• The bus controller must only enable data
  transceivers onto the bus during valid read
  cycles of the 486 CPU or while another bus
  master has gained access to the bus through
  the use of HOLD/HLDA or BOFF#.

• Before another bus master drives the local
  processor address bus, the other bus master
  must gain access to the address bus through
  the use of HOLD/HLDA, AHOLD or BOFF#.

• The user system must be able to drive one
  additional CMOS load (approximately 25pF)
  on all signals that go to the emulation
  processor.


If the target system does not satisfy the
restrictions, the optional isolation board
should be used to isolate the emulation
processor from the target system. To guarantee
proper operation with the optional isolation
board, the clock period should be increased by
the round trip buffer delay (10 ns) unless the
target system design already has enough
timing margin.
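
As a worked illustration of the guideline above, the sketch below adds the 10 ns round trip buffer delay to the nominal clock period and converts the result back to a frequency. It assumes the guideline applies to the target's CLK period with no existing timing margin; the function name is hypothetical.

```python
ROUND_TRIP_BUFFER_DELAY_NS = 10.0  # figure quoted in the text above


def derated_clock_mhz(nominal_mhz: float,
                      added_delay_ns: float = ROUND_TRIP_BUFFER_DELAY_NS) -> float:
    """Return the clock frequency after lengthening the period by the delay."""
    period_ns = 1000.0 / nominal_mhz          # 1 MHz corresponds to 1000 ns
    return 1000.0 / (period_ns + added_delay_ns)


if __name__ == "__main__":
    for mhz in (25.0, 33.0):
        print(f"{mhz:.0f} MHz nominal -> {derated_clock_mhz(mhz):.2f} MHz with isolation board")
```

For a 25 MHz target the 40 ns period becomes 50 ns, that is 20 MHz; a design that already has 10 ns of slack would not need the reduction.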


The processor module derives its DC power
from the target system through the 486 CPU 
socket. It requires 2200mA, including the i486 
current. The optional isolation board requires 
an additional 500mA. The REM board requires 
an additional 2100mA. 


The processor must be socketed. The printed 
circuit board design should locate the processor 
socket at the physical ends of the printed 
circuit board traces that connect the processor 
to the other logic of the target system. This
reduces transmission line noise. If the target
system is enclosed in a box, orient pin one of 
the processor socket to simplify connecting the 
ICB. This makes connecting the hinge cable 
easier. The ICE-486 emulator hinge cable adds 
an additional 15pF of capacitive loading and 
approximately 0.5ns of propagation delay to 
each 486 CPU signal. 


Pins specified as N.C. in the 486 CPU pin
description must be left unconnected. 
Connection of any of these pins to power, 
ground or any other signal may cause the 
processor or the ICE-486 emulator to 
malfunction. 


ORDERING INFORMATION


ICE™-486 IN-CIRCUIT EMULATOR ORDER CODES

ICE48633D     ICE-486 In-circuit emulator for 80486 component. Operates to 33 MHz. Includes
              control unit, power supply, 80486 Processor Module, Stand-Alone/Self-Test
              Board, Optional Bus Isolation Board, Relocatable Expansion Memory Board,
              host software and cables.


ICE™-486 IN-CIRCUIT EMULATOR CONVERSION KIT ORDER CODES

BASECONV886   Conversion kit to upgrade the 386 family emulator base to support the 80486
              processor module.

TOICE48633D   Conversion kit to adapt the above upgraded base to support the 80486
              component. Includes ICE486 33 MHz Processor Module, Stand-Alone/Self-Test
              Board, Optional Bus Isolation Board, Relocatable Expansion Memory Board,
              and host software.


ICE™-486 IN-CIRCUIT EMULATOR OPTION ORDER CODES

DTOAB         2 MB Intel Above™ Board.
ICE3XXCPO     Clips Pod Option for ICE-486.
ICE3XXTTB     Time Tag Board Option for ICE-486.


Intel Corp.

5015 Bradford Dr., #2 
Huntsville 35805 

Tel: (205) 830-4010 
FAX: (205) 837-2640


ARIZONA 


†Intel Corp.
Suite 500

Phoenix 85008 

Tel: (602) 231-0386 
FAX: (602) 244-0446 


Intel Corp.
Tucson 85741

Tel: (602) 544-0227 
FAX: (602) 544-0232 


CALIFORNIA 


†Intel Corp.

21515 Vanowen Street
Tel: (818) 704-8500
FAX: (818) 340-1144 


†Intel Corp.
300 N. Continental Blvd.
Suite 100 

El Segundo 90245 

Tel: (213) 640-6040 
FAX: (213) 640-7133 


Intel Corp. 

1 Sierra Gate Plaza 
Suite 280C 
Roseville 95678 

Tel: (916) 782-8086
FAX: (916) 782-8153 


†Intel Corp.

9665 Chesapeake Dr. 
Suite 325 

San Diego 92123
Tel: (619) 292-8086 
FAX: (619) 292-0628 


†Intel Corp.*

400 N. Tustin Avenue 
Suite 450 

Santa Ana 92705 

Tel: (714) 835-9642 
TWX: 910-595-1114 
FAX: (714) 541-9157 


†Intel Corp.*

San Tomas 4

2700 San Tomas Expressway
2nd Floor
Santa Clara 95051 

Tel: (408) 986-8086 

TWX: 910-338-0255 

FAX: (408) 727-2620 


COLORADO 


Intel Corp.

4445 Northpark Drive 
Suite 100 

Colorado Springs 80907 
Tel: (719) 594-6622 
FAX: (303) 594-0720 


†Intel Corp.*

600 S. Cherry St. 
Suite 700 

Denver 80222


CONNECTICUT


†Intel Corp.
Tel: (203) 748-3130

FAX: (203) 794-0339 


†Sales and Service Office


DOMESTIC SALES


FLORIDA 


†Intel Corp.

800 Fairway Drive 
Suite 160 

Deerfield Beach 33441 
Tel: (305) 421-0506 
FAX: (305) 421-2444 


†Intel Corp.

5850 T.G. Lee Blvd. 

Suite 340 

Orlando 32822 

Tel: (407) 240-8000 

FAX: (407) 240-8097 


Intel Corp. 

11300 4th Street North 
Suite 170 

St. Petersburg 33716 
Tel: (813) 577-2413 
FAX: (813) 578-1607 


GEORGIA 


†Intel Corp.

20 Technology Parkway 
Suite 150 

Norcross 30092 

Tel: (404) 449-0541 
FAX: (404) 605-9762 


ILLINOIS 


†Intel Corp.*

Woodfield Corp. Center III
300 N. Martingale Road 
Suite 400 

Schaumburg 60173 

Tel: (708) 605-8031 

FAX: (708) 706-9762 


INDIANA 


†Intel Corp.

8910 Purdue Road 
Suite 350 
Indianapolis 46268 
Tel: (317) 875-0623 
FAX: (317) 875-8938 


IOWA 
Intel Corp.


1930 St. Andrews Drive N.E. 


2nd Floor 
Cedar Rapids 52402 
Tel: (319) 393-5510 


KANSAS 


†Intel Corp.

10985 Cody St. 
Suite 140 

Overland Park 66210 
Tel: (913) 345-2727 
FAX: (913) 345-2076 


MARYLAND 


†Intel Corp.*

10010 Junction Dr. 

Suite 200 

Annapolis Junction 20701 

Tel: (301) 206-2860

FAX: (301) 206-3677 
(301) 206-3678 


MASSACHUSETTS 


†Intel Corp.*

Westford Corp. Center 
3 Carlisle Road 

2nd Floor 

Westford 01886 

Tel: (508) 692-0960 
TWX: 710-343-6333 
FAX: (508) 692-7867


MICHIGAN 


†Intel Corp.

7071 Orchard Lake Road 
Suite 100 

West Bloomfield 48322 
Tel: (313) 851-8096

FAX: (313) 851-8770 


MINNESOTA 


†Intel Corp.

3500 W. 80th St. 
Suite 360 
Bloomington 55431 
Tel: (612) 835-6722 
TWX: 910-576-2867 
FAX: (612) 831-6497 


MISSOURI 


†Intel Corp.

4203 Earth City Expressway 
Suite 131 

Earth City 63045 

Tel: (314) 291-1990 

FAX: (314) 291-4341 


NEW JERSEY 


†Intel Corp.*

Lincroft Office Center 
125 Half Mile Road 
Red Bank 07701 

Tel: (908) 747-2233 
FAX: (908) 747-0983 


Intel Corp. 

280 Corporate Center 
75 Livingston Avenue 
First Floor 

Roseland 07068 

Tel: (201) 740-0111 
FAX: (201) 740-0626 


NEW YORK 


Intel Corp.* 

850 Crosskeys Office Park 
Fairport 14450 

Tel: (716) 425-2750 

TWX: 510-253-7391 

FAX: (716) 223-2561 


†Intel Corp.*

2950 Express Dr., South 
Suite 130 

Islandia 11722 

Tel: (516) 231-3300 
TWX: 510-227-6236 
FAX: (516) 348-7939 


†Intel Corp.

300 Westage Business Center 
Suite 230 

Fishkill 12524 

Tel: (914) 897-3860 

FAX: (914) 897-3125 


Intel Corp. 

Seventeen State Street 
14th Floor 

New York 10004 

Tel: (212) 248-8086 
FAX: (212) 248-0888 


NORTH CAROLINA 


†Intel Corp.

5800 Executive Center Dr. 
Suite 105 

Charlotte 28212 

Tel: (704) 568-8966
FAX: (704) 535-2236


†Intel Corp.

5540 Centerview Dr.
Suite 215 

Raleigh 27606 

Tel: (919) 851-9537 
FAX: (919) 851-8974 


OFFICES 


OHIO 


†Intel Corp.*
3401 Park Center Drive 
Suite 220 

Dayton 45414 

Tel: (513) 890-5350 
TWX: 810-450-2528 
FAX: (513) 890-8658 


†Intel Corp.*

25700 Science Park Dr. 
Suite 100 

Beachwood 44122 

Tel: (216) 464-2736 
TWX: 810-427-9298 
FAX: (804) 282-0673 


OKLAHOMA 


Intel Corp. 

6801 N. Broadway 
Suite 115 

Oklahoma City 73162 
Tel: (405) 848-8086 
FAX: (405) 840-9819 


OREGON 


†Intel Corp.

15254 N.W. Greenbrier Pkwy.
Building B

Beaverton 97006 

Tel: (503) 645-8051 

TWX: 910-467-8741 

FAX: (503) 645-8181 


PENNSYLVANIA 


†Intel Corp.*

925 Harvest Drive 
Suite 200 

Blue Bell 19422

Tel: (215) 641-1000 
FAX: (215) 641-0785 


†Intel Corp.*

400 Penn Center Blvd.
Suite 610 

Pittsburgh 15235 

Tel: (412) 823-4970 
FAX: (412) 829-7578 


PUERTO RICO 


†Intel Corp.

South Industrial Park 
P.O. Box 910 

Las Piedras 00671 
Tel: (809) 733-8616 


TEXAS 


Intel Corp.

8911 N. Capital of Texas Hwy. 
Suite 4230 

Austin 78759 

Tel: (512) 794-8086 

FAX: (512) 338-9335 


†Intel Corp.*

12000 Ford Road 
Suite 400 

Dallas 75234 

Tel: (214) 241-8087 
FAX: (214) 484-1180 


†Intel Corp.*

7322 S.W. Freeway 

Suite 1490 

Houston 77074 

Tel: (713) 988-8086 

TWX: 910-881-2490 

FAX: (713) 988-3660 


UTAH 


†Intel Corp.

428 East 6400 South 
Suite 104 

Murray 84107 

Tel: (801) 263-8051 
FAX: (801) 268-1457 


VIRGINIA 


†Intel Corp.

9030 Stony Point Pkwy. 
Suite 360 

Richmond 23235 

Tel: (804) 330-9393 
FAX: (804) 330-3019 


WASHINGTON 


†Intel Corp.

155 108th Avenue N.E. 
Suite 386 

Bellevue 98004 

Tel: (206) 453-8086 
TWX: 910-443-3002 
FAX: (206) 451-9556 


Intel Corp. 

408 N. Mullan Road 
Suite 102 

Spokane 99206 

Tel: (509) 928-8086 
FAX: (509) 928-9457 


WISCONSIN 


Intel Corp. 

330 S. Executive Dr. 
Suite 102 

Brookfield 53005 
Tel: (414) 784-8087 
FAX: (414) 796-2115 


CANADA 


BRITISH COLUMBIA 


Intel Semiconductor of 
Canada, Ltd. 

4585 Canada Way 
Suite 202 

Burnaby V5G 4L6 

Tel: (604) 298-0387 
FAX: (604) 298-8234 


ONTARIO 


†Intel Semiconductor of
Canada, Ltd. 

2650 Queensview Drive 
Suite 250 

Ottawa K2B 8H6 

Tel: (613) 829-9714 
FAX: (613) 820-5936 


†Intel Semiconductor of
Canada, Ltd.

190 Attwell Drive

Suite 500 

Rexdale M9W 6H8 

Tel: (416) 675-2105 
FAX: (416) 675-2438 


QUEBEC 


†Intel Semiconductor of
Canada, Ltd. 

1 Rue Holiday 

Suite 115 

Tour East 

Pt. Claire H9R 5N3

Tel: (514) 694-9130 
FAX: 514-694-0064 




ALABAMA 


Arrow Electronics, Inc. 
1015 Henderson Road 
Huntsville 35805
Tel: (205) 837-6955 
FAX: 205-751-1581 


Hamilton/Avnet Computer 
4930 Corporate Drive
Huntsville 35805 


Hamilton/Avnet Electronics 
4940 Research Drive 
Huntsville 35805 

Tel: (205) 837-7210 

FAX: 205-721-0356 


MTI Systems Sales 
4950 Corporate Drive 
Suite 120 

Huntsville 35806
Tel: (205) 830-9526 
FAX: (205) 830-9557 


Pioneer/Technologies Group, Inc. 
4825 University Square 
Huntsville 35805 

Tel: (205) 837-9300 

FAX: 205-837-9358


ALASKA 


Hamilton/Avnet Computer 
1400 W. Benson Blvd., Suite 400
Anchorage 99503 


ARIZONA 


†Arrow Electronics, Inc.
4134 E. Wood Street
Phoenix 85040 

Tel: (602) 437-0750 
TWX: 910-951-1550 


Hamilton/Avnet Computer 
30 South McKemy Avenue 
Chandler 85226 


Hamilton/Avnet Computer 
90 South McKemy Road 
Chandler 85226 


†Hamilton/Avnet Electronics
505 S. Madison Drive 
Tempe 85281 

Tel: (602) 231-5140 

TWX: 910-950-0077 


Hamilton/Avnet Electronics 
30 South McKemy 
Chandler 85226 

Tel: (602) 961-6669 

FAX: 602-961-4073 


Wyle Distribution Group 
4141 E. Raymond 
Phoenix 85040 

Tel: (602) 249-2232 
TWX: 910-371-2871 


CALIFORNIA 


Arrow Commercial System Group 
1502 Crocker Avenue 

Hayward 94544 

Tel: (415) 489-5371 

FAX: (415) 489-9393 


Arrow Commercial System Group 
14242 Chambers Road 

Tustin 92680 

Tel: (714) 544-0200

FAX: (714) 731-8438 


†Arrow Electronics, Inc.
19748 Dearborn Street 
Chatsworth 91311 

Tel: (213) 701-7500 
TWX: 910-493-2086 


†Arrow Electronics, Inc.
9511 Ridgehaven Court
San Diego 92123

Tel: (619) 565-4800
FAX: 619-279-8062 


†Arrow Electronics, Inc.
521 Weddell Drive

Sunnyvale 94086

Tel: (408) 745-6600 
TWX: 910-339-9371 


†Certified Technical Distributor


DOMESTIC DISTRIBUTORS 


†Arrow Electronics, Inc.
2961 Dow Avenue 
Tustin 92680 

Tel: (714) 838-5422 
TWX: 910-595-2860 


Hamilton/Avnet Computer 
3170 Pullman Street 
Costa Mesa 92626 


Hamilton/Avnet Computer 
1361B West 190th Street 
Gardena 90248 


Hamilton/Avnet Computer 
4103 Northgate Blvd.
Sacramento 95834 


Hamilton/Avnet Computer 
4545 Viewridge Avenue 
San Diego 92123


Hamilton/Avnet Computer 
1175 Bordeaux Drive 
Sunnyvale 94089 


Hamilton/Avnet Electronics 
21150 Califa Street 
Woodland Hills 91367 


†Hamilton/Avnet Electronics
3170 Pullman Street 

Costa Mesa 92626 

Tel: (714) 641-4150 

TWX: 910-595-2638 


†Hamilton/Avnet Electronics
1175 Bordeaux Drive 
Sunnyvale 94086 

Tel: (408) 743-3300 

TWX: 910-339-9332 


†Hamilton/Avnet Electronics
4545 Ridgeview Avenue 
San Diego 92123 

Tel: (619) 571-7500 

TWX: 910-595-2638 


†Hamilton/Avnet Electronics
21150 Califa St. 

Woodland Hills 91367

Tel: (818) 594-0404 

FAX: 818-594-8233 


†Hamilton/Avnet Electronics
10950 W. Washington Blvd. 
Culver City 90230

Tel: (213) 558-2458 

TWX: 910-340-6364 


†Hamilton/Avnet Electronics
1361B West 190th Street 
Gardena 90248 

Tel: (213) 217-6700 

TWX: 910-340-6364 


†Hamilton/Avnet Electronics
4103 Northgate Blvd. 
Sacramento 95834 

Tel: (916) 920-3150 


Pioneer/Technologies Group, Inc. 


134 Rio Robles 
San Jose 95134 
Tel: (408) 954-9100 


FAX: 408-954-9113


Wyle Distribution Group 
124 Maryland Street 

El Segundo 90254 

Tel: (213) 322-8100 


Wyle Distribution Group 
7431 Chapman Ave. 
Garden Grove 92641 
Tel: (714) 891-1717 
FAX: 714-891-1621


†Wyle Distribution Group
2951 Sunrise Blvd., Suite 175 
Rancho Cordova 95742 

Tel: (916) 638-5282 


†Wyle Distribution Group
9525 Chesapeake Drive 
San Diego 92123 

Tel: (619) 565-9171 
TWX: 910-335-1590 


†Wyle Distribution Group
3000 Bowers Avenue 
Santa Clara 95051 

Tel: (408) 727-2500 
TWX: 408-988-2747 


†Wyle Distribution Group
17872 Cowan Avenue 
Irvine 92714 


Tel: (714) 863-9953


TWX: 910-371-7127 


†Wyle Distribution Group
26677 W. Agoura Rd. 
Calabasas 91302 

Tel: (818) 880-9000 
TWX: 372-0232 


COLORADO 


Arrow Electronics, Inc. 
7060 South Tucson Way 
Englewood 80112 

Tel: (303) 790-4444 


Hamilton/Avnet Computer 
9605 Maroon Circle, Ste. 200 
Englewood 80112


†Hamilton/Avnet Electronics
9605 Maroon Circle 

Suite 200 

Englewood 80112 

Tel: (303) 799-0663 

TWX: 910-935-0787 


†Wyle Distribution Group
451 E. 124th Avenue 
Thornton 80241 
Tel: (303) 457-9953 
TWX: 910-936-0770 


CONNECTICUT 


†Arrow Electronics, Inc.
12 Beaumont Road 
Wallingford 06492 

Tel: (203) 265-7741 
TWX: 710-476-0162 


Hamilton/Avnet Computer 


Commerce Industrial Park


Commerce Drive 
Danbury 06810 


†Hamilton/Avnet Electronics
Commerce Industrial Park 
Commerce Drive 

Danbury 06810 

Tel: (203) 797-2800 

TWX: 710-456-9974 


†Pioneer/Standard Electronics


112 Main Street 
Norwalk 06851 

Tel: (203) 853-1515 
FAX: 203-838-9901 


FLORIDA 


†Arrow Electronics, Inc.
400 Fairway Drive 
Suite 102 

Deerfield Beach 33441 
Tel: (305) 429-8200 
FAX: 305-428-3991 


†Arrow Electronics, Inc.
37 Skyline Drive 

Suite 3101

Lake Mary 32746

Tel: (407) 323-0252 
FAX: 407-323-3189 


Hamilton/Avnet Computer 
6801 N.W. 15th Way 
Ft. Lauderdale 33309 


Hamilton/Avnet Computer 
3247 Spring Forest Road 
St. Petersburg 33702 


†Hamilton/Avnet Electronics
6801 N.W. 15th Way 

Ft. Lauderdale 33309 

Tel: (305) 971-2900 

FAX: 305-971-5420 


†Hamilton/Avnet Electronics
3197 Tech Drive North 

St. Petersburg 33702 

Tel: (813) 573-3930 

FAX: 813-572-4329 


†Hamilton/Avnet Electronics
6947 University Boulevard 
Winter Park 32792 

Tel: (407) 628-3888 

FAX: 407-678-1878 


†Pioneer/Technologies Group, Inc.
337 Northlake Blvd., Suite 1000
Altamonte Springs 32701

Tel: (407) 834-9090 

FAX: 407-834-0865 


Pioneer/Technologies Group, Inc.
674 S. Military Trail 

Deerfield Beach 33442 

Tel: (305) 428-8877 

FAX: 305-481-2950 


GEORGIA 


Arrow Commercial System Group 
3400 C. Corporate Way 

Duluth 30139

Tel: (404) 623-8825 


FAX: (404) 623-8802


†Arrow Electronics, Inc.
4250 E. Rivergreen Parkway 
Duluth 30136

Tel: (404) 497-1300 

TWX: 810-766-0439 


Hamilton/Avnet Computer 
5825 D. Peachtree Corners E. 


Norcross 30092 


†Hamilton/Avnet Electronics
5825 D Peachtree Corners 
Norcross 30092

Tel: (404) 447-7500 

TWX: 810-766-0432


Pioneer/Technologies Group, Inc. 
3100 F Northwoods Place 
Norcross 30071 

Tel: (404) 448-1711 

FAX: 404-446-8270 


ILLINOIS 


†Arrow Electronics, Inc.
1140 W. Thorndale 
Itasca 60143 

Tel: (708) 250-0500 
TWX: 708-250-0916 


Hamilton/Avnet Computer 
1130 Thorndale Avenue 
Bensenville 60106 


†Hamilton/Avnet Electronics
1130 Thorndale Avenue 
Bensenville 60106 

Tel: (708) 860-7780 

TWX: 708-860-8530 


MTI Systems Sales 
1100 W. Thorndale 
Itasca 60143 

Tel: (708) 773-2300 


†Pioneer/Standard Electronics
2171 Executive Dr., Suite 200 
Addison 60101 

Tel: (708) 495-9680 

FAX: 708-495-9831 


INDIANA 


†Arrow Electronics, Inc.

7108 Lakeview Parkway West Drive 
Indianapolis 46268

Tel: (317) 299-2071 

FAX: 317-299-0255 


Hamilton/Avnet Computer
485 Gradle Drive
Carmel 46032 


Hamilton/Avnet Electronics 
485 Gradle Drive 

Carmel 46032 

Tel: (317) 844-9333

FAX: 317-844-5921 


†Pioneer/Standard Electronics
9350 Priority Way 

West Drive 

Indianapolis 46250 

Tel: (317) 573-0880 

FAX: 317-573-0979 


IOWA


Hamilton/Avnet Computer 
915 33rd Avenue SW 
Cedar Rapids 52404 


Hamilton/Avnet Electronics 
915 33rd Avenue, S.W. 
Cedar Rapids 52404 

Tel: (319) 362-4757 


KANSAS 


Arrow Electronics, Inc. 
8208 Melrose Dr., Suite 210 
Lenexa 66214 

Tel: (913) 541-9542 

FAX: 913-541-0328 


Hamilton/Avnet Computer 
15313 W. 95th Street 
Lenexa 66219


†Hamilton/Avnet Electronics
15313 W. 95th 

Overland Park 66215 

Tel: (913) 888-8900 

FAX: 913-541-7951 


KENTUCKY 


Hamilton/Avnet Electronics 
805 A. Newtown Circle 
Lexington 40511 

Tel: (606) 259-1475 


MARYLAND 


†Arrow Electronics, Inc.
8300 Guilford Drive 
Suite H, River Center 
Columbia 21046 

Tel: (301) 995-6002 
FAX: 301-381-3854 


Hamilton/Avnet Computer 
6822 Oak Hall Lane 
Columbia 21045 


†Hamilton/Avnet Electronics
6822 Oak Hall Lane 
Columbia 21045 

Tel: (301) 995-3500 

FAX: 301-995-3593 


†Mesa Technology Corp.
9720 Patuxent Woods Dr. 
Columbia 21046 

Tel: (301) 290-8150 

FAX: 301-290-6474 


†Pioneer/Technologies Group, Inc.


9100 Gaither Road 
Gaithersburg 20877 
Tel: (301) 921-0660 
FAX: 301-921-4255 


MASSACHUSETTS 


Arrow Electronics, Inc. 
25 Upton Dr. 
Wilmington 01887 
Tel: (508) 658-0900
TWX: 710-393-6770 


Hamilton/Avnet Computer 
10 D Centennial Drive 
Peabody 01960 


†Hamilton/Avnet Electronics
10D Centennial Drive 
Peabody 01960 

Tel: (508) 532-9838 

FAX: 508-596-7802 


†Pioneer/Standard Electronics
44 Hartwell Avenue 

Lexington 02173 

Tel: (617) 861-9200 

FAX: 617-863-1547 


Wyle Distribution Group 
15 Third Avenue 
Burlington 01803 

Tel: (617) 272-7300 
FAX: 617-272-6809 


MICHIGAN 


†Arrow Electronics, Inc.
19880 Haggerty Road 
Livonia 48152 

Tel: (313) 665-4100
TWX: 810-223-6020 


Hamilton/Avnet Computer 
2215 S.E. A-5 
Grand Rapids 49508 


Hamilton/Avnet Computer 
41650 Garden Rd., Ste. 100 
Novi 48050 


Hamilton/Avnet Electronics 
2215 29th Street S.E. 
Space A5 

Grand Rapids 49508 

Tel: (616) 243-8805 

FAX: 616-698-1831 


Hamilton/Avnet Electronics 
41650 Garden Brook 

Novi 48050 

Tel: (313) 347-4271 

FAX: 313-347-4021 


†Pioneer/Standard Electronics
4505 Broadmoor S.E. 

Grand Rapids 49508 

Tel: (616) 698-1800 

FAX: 616-698-1831 


†Pioneer/Standard Electronics
13485 Stamford 

Livonia 48150 

Tel: (313) 525-1800 

FAX: 313-427-3720 


MINNESOTA 


†Arrow Electronics, Inc.
5230 W. 73rd Street 
Edina 55435 

Tel: (612) 830-1800 
TWX: 910-576-3125 


Hamilton/Avnet Computer 
12400 Whitewater Drive 
Minnetonka 55343 


†Hamilton/Avnet Electronics
12400 Whitewater Drive 
Minnetonka 55343

Tel: (612) 932-0600 

TWX: 910-576-2720 


†Pioneer/Standard Electronics
7625 Golden Triangle Dr.
Suite G 

Eden Prairie 55343 

Tel: (612) 944-3355 

FAX: 612-944-3794 


MISSOURI 


†Arrow Electronics, Inc.
2380 Schuetz 

St. Louis 63141 

Tel: (314) 567-6888 
FAX: 314-567-1164 


Hamilton/Avnet Computer 
739 Goddard Avenue 
Chesterfield 63005 


†Hamilton/Avnet Electronics
741 Goddard 

Chesterfield 63005 

Tel: (314) 537-1600 

FAX: 314-537-4248 


NEW HAMPSHIRE 


Hamilton/Avnet Computer 
2 Executive Park Drive 
Bedford 03102


Hamilton/Avnet Computer 
444 East Industrial Park Dr. 
Manchester 03103 


NEW JERSEY 


†Arrow Electronics, Inc.
4 East Stow Road 

Unit 11 

Marlton 08053 

Tel: (609) 596-8000 
FAX: 609-596-9632 


†Arrow Electronics
6 Century Drive
Parsippany 07054
Tel: (201) 538-0900 
FAX: 201-538-0900 


Hamilton/Avnet Computer 
1 Keystone Ave., Bldg. 36
Cherry Hill 08003 


Hamilton/Avnet Computer 
10 Industrial Road 
Fairfield 07006 


†Hamilton/Avnet Electronics
1 Keystone Ave., Bldg. 36
Cherry Hill 08003 

Tel: (609) 424-0110 

FAX: 609-751-2552 


†Hamilton/Avnet Electronics
10 Industrial 

Fairfield 07006 

Tel: (201) 575-3390 

FAX: 201-575-5839 


†MTI Systems Sales
9 Law Drive 
Fairfield 07006 

Tel: (201) 227-5552 
FAX: 201-575-6336 


†Pioneer/Standard Electronics
14-A Madison Rd. 

Fairfield 07006 

Tel: (201) 575-3510 

FAX: 201-575-3454 


NEW MEXICO 


Alliance Electronics Inc.
10510 Research Avenue 
Albuquerque 87123 
Tel: (505) 292-3360 
FAX: 505-292-6537 


Hamilton/Avnet Computer 
5659 Jefferson, N.E. Suites A & B 
Albuquerque 87109 


†Hamilton/Avnet Electronics
5659A Jefferson N.E. 
Albuquerque 87109 

Tel: (505) 765-1500 

FAX: 505-243-1395 


NEW YORK 
†Arrow Electronics, Inc.


3375 Brighton Henrietta Townline Rd. 


Rochester 14623 
Tel: (716) 427-0300 
TWX: 510-253-4766 


Arrow Electronics, Inc. 
20 Oser Avenue 
Hauppauge 11788 
Tel: (516) 231-1000
TWX: 510-227-6623 


Hamilton/Avnet Computer 
933 Motor Parkway 
Hauppauge 11788


Hamilton/Avnet Computer 
2060 Townline 
Rochester 14623 


†Hamilton/Avnet Electronics
933 Motor Parkway 
Hauppauge 11788 

Tel: (516) 231-9800 

TWX: 510-224-6166 


†Hamilton/Avnet Electronics
2060 Townline Rd. 
Rochester 14623 

Tel: (716) 272-2744 

TWX: 510-253-5470 


Hamilton/Avnet Electronics 
103 Twin Oaks Drive 
Syracuse 13206 

Tel: (315) 437-0288 

TWX: 710-541-1560 


†MTI Systems Sales
38 Harbor Park Drive 
Port Washington 11050 
Tel: (516) 621-6200 
FAX: 510-223-0846 


Pioneer/Standard Electronics 
68 Corporate Drive 
Binghamton 13904 
Tel: (607) 722-9300 
FAX: 607-722-9562 


Pioneer/Standard Electronics 
40 Oser Avenue 

Hauppauge 11787 

Tel: (516) 231-9200 

FAX: 510-227-9869 


†Pioneer/Standard Electronics
60 Crossway Park West 
Woodbury, Long Island 11797 
Tel: (516) 921-8700 

FAX: 516-921-2143 


†Pioneer/Standard Electronics
840 Fairport Park 

Fairport 14450 

Tel: (716) 381-7070 

FAX: 716-381-5955 


NORTH CAROLINA 


†Arrow Electronics, Inc.
5240 Greensdairy Road 
Raleigh 27604 

Tel: (919) 876-3132 
TWX: 510-928-1856 


Hamilton/Avnet Computer 
3510 Spring Forest Road 
Raleigh 27604 


†Hamilton/Avnet Electronics
3510 Spring Forest Drive 
Raleigh 27604 

Tel: (919) 878-0819 

TWX: 510-928-1836 


Pioneer/Technologies Group, Inc. 


9401 L-Southern Pine Blvd.
Charlotte 28210 

Tel: (704) 527-8188

FAX: 704-522-8564 


Pioneer/Technologies Group, Inc.


2810 Meridian Parkway 
Suite 148 

Durham 27713 

Tel: (919) 544-5400 
FAX: 919-544-5885 


OHIO 


Arrow Commercial System Group 


284 Cramer Creek Court 
Dublin 43017 

Tel: (614) 889-9347 
FAX: (614) 889-9680 


†Arrow Electronics, Inc.


6238 Cochran Road 
Solon 44139 

Tel: (216) 248-3990 
TWX: 810-427-9409 


Hamilton/Avnet Computer 
7764 Washington Village Dr. 
Dayton 45459 


Hamilton/Avnet Computer 
30325 Bainbridge Rd., Bldg. A
Solon 44139 


†Hamilton/Avnet Electronics
7760 Washington Village Dr. 
Dayton 45459 

Tel: (513) 439-6733 

FAX: 513-439-6711 


†Hamilton/Avnet Electronics
30325 Bainbridge 

Solon 44139 

Tel: (216) 349-5100 

TWX: 810-427-9452


Hamilton/Avnet Computer 
777 Brooksedge Blvd.
Westerville 43081 

Tel: (614) 882-7004 

FAX: 614-882-8650 


Hamilton/Avnet Electronics 
777 Brooksedge Blvd. 
Westerville 43081 

Tel: (614) 882-7004 


MTI Systems Sales 
23400 Commerce Park Road 
Beachwood 44122 
Tel: (216) 464-6688 


†Pioneer/Standard Electronics
4433 Interpoint Boulevard 
Dayton 45424 

Tel: (513) 236-9900 

FAX: 513-236-8133 


†Pioneer/Standard Electronics
4800 E. 131st Street 
Cleveland 44105 

Tel: (216) 587-3600 

FAX: 216-663-1004 


OKLAHOMA 


Arrow Electronics, Inc. 
4719 South Memorial Dr. 
Tulsa 74145 


†Hamilton/Avnet Electronics
12121 E. 51st St., Suite 102A 
Tulsa 74146 

Tel: (918) 252-7297 


OREGON 


†Almac Electronics Corp.
1885 N.W. 169th Place 
Beaverton 97005 

Tel: (503) 629-8090 
FAX: 503-645-0611


Hamilton/Avnet Computer 
9409 Southwest Nimbus Ave. 
Beaverton 97005 


†Hamilton/Avnet Electronics
9409 S.W. Nimbus Ave. 
Beaverton 97005 

Tel: (503) 627-0201 

FAX: 503-641-4012 


Wyle 

9640 Sunshine Court 
Bldg. G, Suite 200
Beaverton 97005 

Tel: (503) 643-7900 
FAX: 503-646-5466 


PENNSYLVANIA 


Arrow Electronics, Inc. 
650 Seco Road 
Monroeville 15146 
Tel: (412) 856-7000 


Hamilton/Avnet Computer 
2800 Liberty Ave., Bldg.
Pittsburgh 15222


Hamilton/Avnet Electronics 
2800 Liberty Ave. 
Pittsburgh 15238 

Tel: (412) 281-4150 


Pioneer/Standard Electronics 
259 Kappa Drive 

Pittsburgh 15238

Tel: (412) 782-2300 

FAX: 412-963-8255 


†Pioneer/Technologies Group, Inc.
Delaware Valley
261 Gibraltar Road
Horsham 19044 
Tel: (215) 674-4000 
FAX: 215-674-3107 


TENNESSEE 


Arrow Commercial System Group 
3635 Knight Road 
Suite 7 
Memphis 38118 
Tel: (901) 367-0540 
FAX: (901) 367-2081 


TEXAS 


Arrow Electronics, Inc. 
3220 Commander Drive 
Carrollton 75006 

Tel: (214) 380-6464 
FAX: (214) 248-7208 


Hamilton/Avnet Computer 
1807A West Braker Lane 
Austin 78758 


Hamilton/Avnet Computer 
Forum 2 

4004 Beltline, Suite 200 
Dallas 75244 


Hamilton/Avnet Computer 
4850 Wright Rd., Suite 190 
Stafford 77477 


†Hamilton/Avnet Electronics
1807 W. Braker Lane 
Austin 78758 

Tel: (512) 837-8911 

TWX: 910-874-1319 


†Hamilton/Avnet Electronics
4004 Beltline, Suite 200 
Dallas 75234 

Tel: (214) 308-8111 

TWX: 910-860-5929 


†Hamilton/Avnet Electronics
4850 Wright Rd., Suite 190 
Stafford 77477 

Tel: (713) 240-7733 

TWX: 910-881-5523 


†Pioneer/Standard Electronics
1826-D Kramer 

Austin 78758 

Tel: (512) 835-4000 

FAX: 512-835-9829 


†Pioneer/Standard Electronics
13710 Omega Road 

Dallas 75244 

Tel: (214) 386-7300 

FAX: 214-490-6419 


†Pioneer/Standard Electronics
10530 Rockley Road 

Houston 77099 

Tel: (713) 495-4700 

FAX: 713-495-5642 


†Wyle Distribution Group
1810 Greenville Avenue 
Richardson 75081 

Tel: (214) 235-9953 
FAX: 214-644-5064 


UTAH 


Hamilton/Avnet Computer 
1585 West 2100 South 
Salt Lake City 84119 


†Hamilton/Avnet Electronics
1585 West 2100 South 

Salt Lake City 84119 

Tel: (801) 972-2800 

TWX: 910-925-4018 


†Wyle Distribution Group
1325 West 2200 South 
Suite E 

West Valley 84119 

Tel: (801) 974-9953 


WASHINGTON 


†Almac Electronics Corp.
14360 S.E. Eastgate Way 
Bellevue 98007 

Tel: (206) 643-9992 

FAX: 206-643-9709 


Hamilton/Avnet Computer 


17761 Northeast 78th Place


Redmond 98052 


†Hamilton/Avnet Electronics
17761 N.E. 78th Place 
Redmond 98052 

Tel: (206) 881-6697 

FAX: 206-867-0159 


Wyle Distribution Group 
15385 N.E. 90th Street 
Redmond 98052 

Tel: (206) 881-1150 
FAX: 206-881-1567 


WISCONSIN 


Arrow Electronics, Inc. 

200 N. Patrick Blvd., Ste. 100
Brookfield 53005 

Tel: (414) 792-0150 

FAX: 414-792-0156 


Hamilton/Avnet Computer 
20875 Crossroads Circle 
Suite 400 

Waukesha 53186 


†Hamilton/Avnet Electronics
28875 Crossroads Circle 
Suite 400 

Waukesha 53186 

Tel: (414) 784-4510 

FAX: 414-784-9509 


CANADA 


ALBERTA 


Hamilton/Avnet Computer 
2816 21st Street Northeast 
Calgary T2E 6Z2


Hamilton/Avnet Electronics
2816 21st Street N.E. #3 
Calgary T2E 6Z3

Tel: (403) 230-3586 

FAX: 403-250-1591 


Zentronics 

6815 #8 Street N.E. 
Suite 100 

Calgary T2E 7H 

Tel: (403) 295-8818 
FAX: 403-295-8714 


BRITISH COLUMBIA 


†Hamilton/Avnet Electronics
8610 Commerce Ct. 
Burnaby V5A 4N6 

Tel: (604) 420-4101 

FAX: 604-437-4712 


Zentronics 

108-11400 Bridgeport Road 
Richmond V6X 1T2 

Tel: (604) 273-5575 

FAX: 604-273-2413 


ONTARIO 


Arrow Electronics, Inc.
36 Antares Dr., Unit 100 
Nepean K2E 7W5 

Tel: (613) 226-6903 
FAX: 613-723-2018 


†Arrow Electronics, Inc.
1093 Meyerside, Unit 2 
Mississauga L5T 1M4 
Tel: (416) 673-7769 
FAX: 416-672-0849 


Hamilton/Avnet Computer 
Canada System Engineering 
Group 

3688 Nashua Drive 

Units 7 & 8 

Mississauga L4V 1M5


Hamilton/Avnet Computer 
3688 Nashua Drive 

Units 9 & 10 

Mississauga L4V 1M5


Hamilton/Avnet Computer 
6845 Rexwood Road 
Units 7, 8, & 9
Mississauga L4V 1R2


Hamilton/Avnet Computer 
190 Colonnade Road
Nepean K2E 7J5


†Hamilton/Avnet Electronics
6845 Rexwood Road 

Units 3-4-5 

Mississauga L4T 1R2 

Tel: (416) 677-7432 

FAX: 416-677-0940 


†Hamilton/Avnet Electronics
190 Colonnade Road South 
Nepean K2E 7L5 

Tel: (613) 226-1700 

FAX: 613-226-1184 


†Zentronics

1355 Meyerside Drive 
Mississauga L5T 1C9 
Tel: (416) 564-9600 
FAX: 416-564-8320 


†Zentronics
155 Colonnade Road 
Unit 17

Nepean K2E 7K1 

Tel: (613) 226-8840
FAX: 613-226-6352 


QUEBEC 


Arrow Electronics Inc. 
1100 St. Regis 
Dorval H9P 2T5 

Tel: (514) 421-7411 
FAX: 514-421-7430 


Arrow Electronics, Inc. 
500 Boul. St-Jean-Baptiste 
Suite 280 

Quebec G2E 5R9 

Tel: (418) 871-7500 

FAX: 418-871-6816 


Hamilton/Avnet Computer 
2795 Rue Halpern 
St. Laurent H4S 1P8 


†Hamilton/Avnet Electronics
2795 Halpern 

St. Laurent H2E 7K1 

Tel: (514) 335-1000 

FAX: 514-335-2481 


†Zentronics

520 McCaffrey 

St. Laurent H4T 1N3
Tel: (514) 737-9700
FAX: 514-737-5212


FINLAND 


Intel Finland OY 
Ruosilantie 2 

00390 Helsinki 

Tel: (358) 0 544 644 
TLX: 123332 


FRANCE 


Intel Corporation S.A.R.L.

1, Rue Edison-BP 303 

78054 St. Quentin-en-Yvelines 
Cedex 

Tel: (33) (1) 30 57 70 00
TLX: 6 


ISRAEL 
Intel Semiconductor Ltd. 


Atidim Industrial Park-Neve Sharet 


P.O. Box 43202 
Tel-Aviv 61430 

Tel: (972) 03-498080 
TLX: 371215 


EUROPEAN SALES OFFICES 


ITALY 


Intel Corporation Italia S.p.A. 
Milanofiori Palazzo E 

20094 Assago 

Milano 

Tel: (39) (02) 89200950 
TLX: 341286 


NETHERLANDS 


Intel Semiconductor B.V. 
Postbus 84130 

3099 CC Rotterdam 

Tel: (31) 10.407.11.11 
TLX: 22283 


SPAIN 


Intel Iberia S.A.


Zurbaran, 28 

28010 Madrid 

Tel: (34) (1) 308.25.52 
TLX: 46880 


SWEDEN 


Intel Sweden A.B. 
Dalvagen 24 

171 36 Solna 

Tel: (46) 8 734 01 00 
TLX: 12261 


SWITZERLAND 


Intel Semiconductor A.G. 
Zuerichstrasse 

8185 Winkel-Rueti bei Zuerich 
Tel: (41) 01/860 62 62 

TLX: 825977 


UNITED KINGDOM 


Intel Corporation (U.K.) Ltd. 
Pipers Way 

Swindon, Wiltshire SN3 1RJ 
Tel: (44) (0793) 696000 
TLX: 444447/8 


WEST GERMANY 


Intel GmbH

Dornacher Strasse 1 

8016 Feldkirchen bei Muenchen 
Tel: (49) 089/90992-0 

FAX: (49) 089/904/3948 


Intel GmbH

Abraham Lincoln Strasse 16-18
6200 Wiesbaden 

Tel: (49) 06121/7605-0 

TLX: 4-186183 


Intel GmbH 

Zettachring 10A 

7000 Stuttgart 80 

Tel: (49) 0711/7287-280
TLX: 7-254826 


EUROPEAN DISTRIBUTORS/REPRESENTATIVES 


AUSTRIA 


Bacher Electronics G.m.b.H. 
Rotenmuehlgasse 26

1120 Wien 

Tel: (43) (0222) 83 56 46 
TLX: 31532 


BELGIUM 


Inelco Belgium S.A.

Av. des Croix de Guerre 94 
1120 Bruxelles 
Oorlogskruisenlaan, 94 
1120 Brussel 

Tel: (32) (02) 216 01 60 
TLX: 64475 or 22090 


DENMARK 


ITT-Multikomponent 
Naverland 29 

2600 Glostrup 

Tel: (45) (0) 2 45 66 45 
TLX: 33 355 


FINLAND 


OY Fintronic AB 
Melkonkatu 24A 
00210 Helsinki 

Tel: (358) (0) 6926022
TLX: 1 


FRANCE 


Almex 

Zone industrielle d’Antony 
48, rue de l’Aubepine 

BP 102 


92164 Antony cedex 
Tel: (33) (1) 46 66 21 12 
TLX: 250067 


Jermyn 

60, rue des Gemeaux 
Silic 580 

94653 Rungis Cedex 
Tel: (33) (1) 49 78 49 78 
TLX: 261585 


Metrologie 

Tour d’Asnieres 

4, av. Laurent-Cely 
92606 Asnieres Cedex 
Tel: (33) (1) 47 90 62 40 
TLX: 611448 


Tekelec-Airtronic 

Cite des Bruyeres 

Rue Carle Vernet - BP 2 
92310 Sevres 

Tel: (33) (1) 45 34 75 35 
TLX: 204552 


IRELAND 


Micro Marketing Ltd. 
Glenageary Office Park 
Glenageary 

Co. Dublin 

Tel: (21) (353) (01) 856288 
FAX: (21) (353) (01) 857364 
TLX: 31584 


ISRAEL 

Eastronics Ltd. 

11 Rozanis Street 
P.O.B. 39300 
Tel-Aviv 61392 

Tel: (972) 03-475151 
TLX: 33638 


ITALY


Intesi 

Divisione ITT Industries GmbH 
Viale Milanofiori 

Palazzo E/5 

20090 Assago (MI) 

Tel: (39) 02/824701 

TLX: 311351 


Lasi Elettronica S.p.A. 

V. le Fulvio Testi, 126 

20092 Cinisello Balsamo (MI)
Tel: (39) 02/2440012 

TLX: 352040 


Telcom S.r.l.
Via M. Civitali 75 
20148 Milano 


Tel: (39) 02/4049046


TLX: 335654 


ITT Multicomponents 
Viale Milanofiori E/5 
20090 Assago (MI)
Tel: (39) 02/824701 
TLX: 311351 


Silverstar 

Via Dei Gracchi 20 
20146 Milano 

Tel: (39) 02/49961 
TLX: 332189 


NETHERLANDS 


Koning en Hartman 
Elektrotechniek B.V. 
Energieweg 1 

2627 AP Delft 

Tel: (31) (1) 15/609906 
TLX: 38250 


NORWAY 


Nordisk Elektronikk (Norge) A/S 
Postboks 123 

Smedsvingen 4 

1364 Hvalstad 

Tel: (47) (02) 84 62 10 

TLX: 77546 


PORTUGAL 


ATD Portugal LDA 

Rua Dos Lusiados, 5 Sala B 
1300 Lisboa 

Tel: (35) (1) 64 80 91 

TLX: 61562 


Ditram


Avenida Miguel Bombarda, 133 
1000 Lisboa 

Tel: (35) (1) 54 53 13 

TLX: 14182 


SPAIN 


ATD Electronica, S.A. 
Plaza Ciudad de Viena, 6
28040 Madrid 

Tel: (34) (1) 234 40 00 
TLX: 42477 


Metrologia Iberica, S.A. 
Ctra. de Fuencarral, n.80 
28100 Alcobendas (Madrid) 
Tel: (34) (1) 653 86 11 


SWEDEN 


Nordisk Elektronik AB 
Torshamnsgatan 39 
Box 36 

164 93 Kista 

Tel: (46) 08-03 46 30 
TLX: 105 47 


SWITZERLAND 


Industrade A.G. 
Hertistrasse 31 

8304 Wallisellen 

Tel: (41) (01) 8328111 
TLX: 56788 


TURKEY 


EMPA Electronic 
Lindwurmstrasse 95A 
8000 Muenchen 2 

Tel: (49) 089/53 80 570 
TLX: 528573 


UNITED KINGDOM 


Accent Electronic Components Ltd. 


Jubilee House, Jubilee Road 
Letchworth, Herts SG6 1QH 


Tel: (44) (0462) 670011


FAX: (44) (0462) 682467 
TWX: 826505 


Bytech Components Ltd. 
12A Cedarwood 
Chineham Business Park 
Crockford Lane 
Basingstoke 

Hants RG24 0WD

Tel: (0256) 707107 

FAX: 0256-707162 


Conformix 

Unit 5

A1M Business Centre 
Dixons Hill Road 
Welham Green 

South Hatfield 

Herts AL9 7JE

Tel: (07072) 73282 
FAX: (07072) 61678 


Bytech Systems


3 The Western Centre 
Western Road 

Bracknell RG12 1RW 
Tel: (44) (0344) 55333 
FAX: (44) (0344) 867270 
TWX: 849624 


Jermyn 

Vestry Estate 

Otford Road 

Sevenoaks 

Kent TN14 5EU 

Tel: (44) (0732) 450144 
FAX: (44) (0732) 451251 
TWX: 95142 


MMD Ltd. 

3 Bennet Court 

Bennet Road 

Reading 

Berkshire RG2 0QX 

Tel: (44) (0734) 313232 
FAX: (44) (0734) 313255 
TWX: 846669 


Rapid Recall, Ltd. 

Rapid House 

Oxford Road 

High Wycombe 
Buckinghamshire HP11 2EE 
Tel: (44) (0494) 26271 

FAX: (44) (0494) 21860 
TWX: 837931 


Rapid Recall, Ltd. 
28 High Street 
Nantwich 

Cheshire CW5 5AS 
Tel: (0270) 627505 
FAX: (0270) 629883 
TWX: 36329 


WEST GERMANY


Electronic 2000 AG 
Stahlgruberring 12
8000 Muenchen 82 
Tel: (49) 089/42001-0 
TLX: 522561 


ITT Multikomponent GmbH 
Postfach 1265 
Bahnhofstrasse 44 

7141 Moeglingen 

Tel: (49) 07141/4879 

TLX: 7264472 


Jermyn GmbH 

Im Dachsstueck 9 
6250 Limburg 

Tel: (49) 06431/508-0 
TLX: 415257-0 


Metrologie GmbH 
Meglingerstrasse 49 
8000 Muenchen 71 
Tel: (49) 089/78042-0 
TLX: 5213189 


Proelectron Vertriebs GmbH 
Max Planck Strasse 1-3 
6072 Dreieich 

Tel: (49) 06103/30434-3 
TLX: 417903 


YUGOSLAVIA


H.R. Microelectronics Corp. 
2005 de la Cruz Blvd., Ste. 223
Santa Clara, CA 95050

U.S.A. 


Tel: (1) (408) 988-0286 
TLX: 387 


Rapido Electronic Components 
S.p.a. 

Via C. Beccaria, 8 

Trieste

Italy

Tel: (39) 040/360555 

TLX: 460461 


AUSTRALIA 


Intel Australia Pty. Ltd. 

Unit 13
Allambie Grove Business Park 
25 Frenchs Forest Road East 
Frenchs Forest, NSW, 2086 
Tel: 61-2975-3300 

FAX: 61-2975-3375 


BRAZIL 


Intel Semicondutores do Brazil LTDA 
Av. Paulista, 1159-CJS 404/405 
01311 - Sao Paulo - S.P. 

Tel: 55-11-287-5899 

TLX: 3911153146 ISDB 

FAX: 55-11-287-5119 


CHINA/HONG KONG 


Intel-PRC Corporation 
15/F, Office 1, Citic Bldg. 
Jian Guo Men Wai Street 
Beijing, PRC 

Tel: (1) 500-4850 

TLX: 22947 INTEL CN 
FAX: (1) 500-2953 


Intel Semiconductor Ltd.* 
10/F East Tower 

Bond Center 
Queensway, Central 
Hong Kong

Tel: (852) 844-4555

FAX: (852) 868-1989 


INTERNATIONAL 


INDIA 


Intel Asia Electronics, Inc. 
4/2, Samrah Plaza 

St. Mark’s Road 

Bangalore 560001 

Tel: 011-91-812-215065 
TLX: 953-845-2646 INTL IN 
FAX: 091-812-215067 


JAPAN 


Intel Japan K.K.

5-6 Tokodai, Tsukuba-shi 
Ibaraki, 300-26 

Tel: 0298-47-8511 

TLX: 3656-160

FAX: 0298-47-8450 


Intel Japan K.K.* 
Daiichi Mitsugi Bldg.
1-8889 Fuchu-cho 
Fuchu-shi, Tokyo 183 
Tel: 0423-60-7871 
FAX: 0423-60-0315 


Intel Japan K.K.* 

Bldg. Kumagaya

2-69 Hon-cho 
Kumagaya-shi, Saitama 360 
Tel: 0485-24-6871 

FAX: 0485-24-7518 


SALES OFFICES 


Intel Japan K.K.* 


Kawa-asa Bldg.

2-11-5 Shin-Yokohama 
Kohoku-ku, Yokohama-shi 
Kanagawa, 222 

Tel: 045-474-7661 

FAX: 045-471-4394 


Intel Japan K.K.* 
Ryokuchi-Eki Bldg. 

2-4-1 Terauchi 
Toyonaka-shi, Osaka 560 
Tel: 06-863-1091 

FAX: 06-863-1084 


Intel Japan K.K. 
Shinmaru Bldg. 

1-5-1 Marunouchi 
Chiyoda-ku, Tokyo 100 
Tel: 03-201-3621 

FAX: 03-201-6850 


Intel Japan K.K. 
Green Bldg. 

1-16-20 Nishiki 
Naka-ku, Nagoya-shi 
Aichi 450 

Tel: 052-204-1261 
FAX: 052-204-1285 


KOREA 


Intel Korea, Ltd. 


16th Floor, Life Bldg. 
61 Yoido-dong, Youngdeungpo-Ku 


Seoul 150-010


Tel: (2) 784-8186, 8286, 8386 
TLX: K29312 INTELKO 
FAX: (2) 784-8096


SINGAPORE 


Intel Singapore Technology, Ltd.


101 Thomson Road #21-05/06
United Square 

Singapore 1130 

Tel: 250-7811 

TLX: 39921 INTEL 

FAX: 250-9256 


TAIWAN 


Intel Technology Far East Ltd. 
8th Floor, No. 205 

Bank Tower Bldg. 

Tung Hua N. Road 

Taipei 

Tel: 886-2-716-9660 


FAX: 886-2-717-2455


INTERNATIONAL DISTRIBUTORS/REPRESENTATIVES 


ARGENTINA 


Dafsys S.R.L. 
Chacabuco, 90-6 Piso 
1069 - Buenos Aires
Tel: 54-1-334-7726 
FAX: 54-1-334-1871 


AUSTRALIA 


Email Electronics 

15-17 Hume Street 
Huntingdale, 3166 

Tel: 011-61-3-544-8244 
TLX: AA 30895 

FAX: 011-61-3-543-8179 


NSD-Australia 

205 Middleborough Rd.
Box Hill, Victoria 3128 
Tel: 03 8900970 

FAX: 03 8990819 


BRAZIL 


Elebra Componentes 

Rua Geraldo Flausina Gomes, 78 
7 Andar 

04575 - Sao Paulo - S.P. 

Tel: 55-11-534-9641 

TLX: 55-11-54593/54591 

FAX: 55-11-534-9424 


CHINA/HONG KONG 


Novel Precision Machinery Co., Ltd. 
Room 728 Trade Square

681 Cheung Sha Wan Road
Kowloon, Hong Kong 

Tel: (852) 360-8999

TWX: 32032 NVTNL HX 

FAX: (852) 725-3695


INDIA 


Micronic Devices 

Arun Complex 

No. 65 D.V.G. Road 

Basavanagudi 

Bangalore 560 004 

Tel: 011-91-812-600-631 
011-91-812-611-365 

TLX: 9538458332 MDBG 


*Field Application Location 


Micronic Devices 

No. 516 5th Floor 

Swastik Chambers 

Sion, Trombay Road 

Chembur 

Bombay 400 071 

TLX: 9531 171447 MDEV


Micronic Devices 
25/8, 1st Floor 
Bada Bazaar Marg 
Old Rajinder Nagar 


New Delhi 110 060


Tel: 011-91-11-5723508
011-91-11-589771 
TLX: 031-63253 MDND IN 


Micronic Devices

6-3-348/12A Dwarakapuri Colony 
Hyderabad 500 482 

Tel: 011-91-842-226748 


S&S Corporation 
1587 Kooser Road 
San Jose, CA 95118 
Tel: (408) 978-6216 


TLX: 820281 


FAX: (408) 978-8635 


JAPAN 
Asahi Electronics Co. Ltd. 


KMM Bldg. 2-14-1 Asano


Kokurakita-ku 
Kitakyushu-shi 802 
Tel: 093-511-6471 
FAX: 093-551-7861 


CTC Components Systems Co., Ltd. 


4-8-1 Dobashi, Miyamae-ku 
Kawasaki-shi, Kanagawa 213 
Tel: 044-852-5121 

FAX: 044-877-4268 


Dia Semicon Systems, Inc. 

Flower Hill Shinmachi Higashi-kan 
1-23-9 Shinmachi, Setagaya-ku 
Tokyo 154 

Tel: 03-439-1600 

FAX: 03-439-1601 


Okaya Koki 

2-4-18 Sakae 

Naka-ku, Nagoya-shi 460 
Tel: 052-204-2916 

FAX: 052-204-2901 


Ryoyo Electro Corp. 
Konwa Bldg.

1-12-22 Tsukiji 

Chuo-ku, Tokyo 104 

Tel: 03-546-5011 

FAX: 03-546-5044 


KOREA 


J-Tek Corporation 
Dong Sung Bldg. 9/F


158-24, Samsung-Dong, Kangnam-Ku


Seoul 135-090


Tel: (822) 557-8039
FAX: (822) 557-8304


Samsung Electronics 

Samsung Main Bldg. 

150 Taepyung-Ro-2KA, Chung-Ku 
Seoul 100-102 

C.P.O. Box 8780


Tel: (822) 751-3680


TWX: KORSST K 27970 
FAX: (822) 753-9065 


MEXICO 


SSB Electronics, Inc. 

675 Palomar Street, Bldg. 4, Suite A
Chula Vista, CA 92011 

Tel: (619) 585-3253 


TLX: 287751 CBALL UR


FAX: (619) 585-8322 


Dicopel S.A.
Tochtli 368 Fracc. Ind. San Antonio
Azcapotzalco


C.P. 02760-Mexico, D.F.


Tel: 52-5-561-3211 
TLX: 177 3790 Dicome 
FAX: 52-5-561-1279 


PHI S.A. de C.V. 

Fco. Villa esq. Ajusco s/n 
Cuernavaca — Morelos 
Tel: 52-73-13-9412 

FAX: 52-73-17-5333 


NEW ZEALAND 


Email Electronics 

36 Olive Road 
Penrose, Auckland 
Tel: 011-64-9-591-155 
FAX: 011-64-9-592-681 


SINGAPORE 


Electronic Resources Pte, Ltd. 
17 Harvey Road 

#03-01 Singapore 1336 

Tel: (65) 283-0888

TWX: RS 56541 ERS

FAX: (65) 289-5327 


SOUTH AFRICA 


Electronic Building Elements 

178 Erasmus St. (off Watermeyer St.)
Meyerspark, Pretoria, 0184 

Tel: 011-2712-803-7680 

FAX: 011-2712-803-8294 


TAIWAN 


Micro Electronics Corporation
12th Floor, Section 3 

285 Nanking East Road 
Taipei, R.O.C. 

Tel: (886) 2-7198419 

FAX: (886) 2-7197916 


Acer Sertek Inc. 

15th Floor, Section 2 
Chien Kuo North Rd.


Microprocessors


This year marks the 20th anniversary
of the invention of the microprocessor.
This invention resulted in the mass
proliferation of computing technology,
creating the microcomputer revolution of
the 1980s. Intel has continued its
technological lead with faster and more
capable products for microcomputing.

Intel offers an architecture that provides 
both the performance and the 
compatibility needed to take progress from 
one generation of products to the next. 

This handbook contains extensive 
information on Intel’s microprocessor 
families, numeric coprocessors, cache 
and memory controllers, and floppy and 
hard disk controllers. A development tools 
section is also included for the 8051, 
8096, 8086/186/188, 286, 386™ and
486™ processors.

The data sheets and application notes 
contained in this handbook offer 
comprehensive charts, diagrams, 
instructions and hardware information for 
leading-edge 32-bit system development.